

polars_ts.adapters.rl_env

Gymnasium-compatible RL environment for forecast-based decision making.

ForecastEnv

A gymnasium-like environment wrapping a polars-ts forecast pipeline.

At each step, the agent observes recent time series values and a forecast, then takes an action (e.g. inventory order, trading signal). The reward is computed from a configurable reward function.

Parameters

data
    NumPy array of shape (n_steps,) with the actual time series values.
forecasts
    NumPy array of shape (n_steps,) with forecast values.
window_size
    Number of recent observations provided as the observation.
reward_fn
    Callable (action, actual, forecast) -> float. Defaults to negative absolute error: -|actual - action|.
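The documented default reward can be sketched as a plain function; `neg_abs_error` is a hypothetical name for illustration, not an identifier from polars_ts:

```python
def neg_abs_error(action: float, actual: float, forecast: float) -> float:
    """Default reward sketch: negative absolute error between action and actual.

    `forecast` is accepted to match the (action, actual, forecast) signature
    but is unused by this default.
    """
    return -abs(actual - action)


# An action of 9.0 against an actual of 10.0 yields a reward of -1.0.
print(neg_abs_error(action=9.0, actual=10.0, forecast=9.5))  # -1.0
```

Any callable with the same signature can be passed as `reward_fn` to encode a domain-specific objective (e.g. asymmetric inventory costs).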

reset()

Reset the environment and return the initial observation.

step(action)

Take one step.

Parameters

action The agent's decision for this timestep.

Returns

tuple (observation, reward, done, info)
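The reset/step contract above can be exercised with a minimal self-contained sketch; `SketchForecastEnv` below is a hypothetical stand-in mirroring the documented behavior, not the actual polars_ts implementation:

```python
import numpy as np


class SketchForecastEnv:
    """Minimal sketch of a ForecastEnv-style environment (assumed behavior)."""

    def __init__(self, data, forecasts, window_size=3, reward_fn=None):
        self.data = np.asarray(data, dtype=float)
        self.forecasts = np.asarray(forecasts, dtype=float)
        self.window_size = window_size
        # Default reward: negative absolute error between action and actual.
        self.reward_fn = reward_fn or (lambda a, actual, fc: -abs(actual - a))
        self.t = window_size

    def _get_obs(self):
        # Recent actual values plus the forecast for the current step.
        recent = self.data[self.t - self.window_size : self.t]
        return np.concatenate([recent, [self.forecasts[self.t]]])

    def reset(self):
        self.t = self.window_size
        return self._get_obs()

    def step(self, action):
        actual = self.data[self.t]
        reward = self.reward_fn(action, actual, self.forecasts[self.t])
        self.t += 1
        done = self.t >= len(self.data)
        obs = None if done else self._get_obs()
        return obs, reward, done, {"actual": actual}


# Usage: a naive policy that acts on the forecast (last observation element).
data = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 14.0])
forecasts = np.array([10.0, 11.5, 11.0, 12.5, 12.0, 13.5])
env = SketchForecastEnv(data, forecasts, window_size=3)

obs = env.reset()
total = 0.0
done = False
while not done:
    action = obs[-1]
    obs, reward, done, info = env.step(action)
    total += reward
print(round(total, 2))  # -1.0
```

Note that the 4-tuple `(observation, reward, done, info)` follows the classic gym convention; newer Gymnasium `Env.step` implementations return a 5-tuple with separate `terminated` and `truncated` flags.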

_get_obs()

Build observation: recent values + current forecast.
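The observation layout described here, the last `window_size` actual values followed by the current forecast, can be sketched with a single concatenation (illustrative values only):

```python
import numpy as np

# Assumed state at some timestep t: actuals so far and the forecast series.
data = np.array([10.0, 12.0, 11.0, 13.0])
forecasts = np.array([10.5, 11.5, 11.0, 12.5])
window_size, t = 3, 3

# Recent window of actuals, then the forecast for step t.
obs = np.concatenate([data[t - window_size : t], [forecasts[t]]])
print(obs)  # [10.  12.  11.  12.5]
```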