Env

polars_ts.marl.env

Portfolio environment for multi-agent RL decision making.

PortfolioEnv

Gymnasium-like environment for portfolio allocation.

At each step, the agent observes recent returns for all assets and produces portfolio weights. The reward is the portfolio return minus transaction costs.

Parameters

returns Array of shape (n_steps, n_assets) with per-period returns.

window_size Number of recent return periods provided as the observation.

transaction_cost Proportional cost applied to weight changes between steps.
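The reward described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library's code: the helper name step_reward is hypothetical, weights are normalized by dividing by their sum, and the cost is charged on turnover measured as the sum of absolute weight changes. The actual implementation may normalize or measure weight changes differently.

```python
import numpy as np

def step_reward(prev_weights, raw_action, asset_returns, transaction_cost=0.0):
    """Hypothetical helper: one-step reward = portfolio return - proportional cost."""
    raw_action = np.asarray(raw_action, dtype=float)
    # Assumption: normalize by the sum (requires a positive sum of raw weights).
    weights = raw_action / raw_action.sum()
    # Portfolio return is the weight-averaged asset return for this period.
    portfolio_return = float(weights @ np.asarray(asset_returns, dtype=float))
    # Assumption: turnover is the L1 distance between new and previous weights.
    turnover = float(np.abs(weights - np.asarray(prev_weights, dtype=float)).sum())
    return weights, portfolio_return - transaction_cost * turnover

new_weights, reward = step_reward(
    prev_weights=[0.5, 0.5],
    raw_action=[3.0, 1.0],
    asset_returns=[0.02, -0.01],
    transaction_cost=0.001,
)
```

With transaction_cost=0.0 the reward reduces to the plain portfolio return, so the cost term only matters when the weights move between steps.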

from_dataframe(df, window_size=10, transaction_cost=0.0, id_col='unique_id', time_col='ds', target_col='y') classmethod

Create an environment from a Polars panel DataFrame of prices in long format (one row per asset and timestamp).

Converts prices to log returns and pivots to wide format.

reset()

Reset the environment and return the initial observation of shape (window_size, n_assets).

step(action)

Take one step with portfolio weights.

Parameters

action Raw portfolio weights (will be normalized to sum to 1).

Returns

tuple (observation, reward, done, info)

_get_obs()

Build observation: recent returns window of shape (window_size, n_assets).
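One way to picture the observation is as a trailing slice of the returns array. The sketch below assumes the window covers the window_size periods strictly before the current step index t; whether the library includes the current period is not stated here, and window_observation is a hypothetical name.

```python
import numpy as np

def window_observation(returns, t, window_size):
    """Hypothetical helper: the window_size rows of returns that precede step t."""
    if t < window_size:
        raise ValueError("need at least window_size past periods before step t")
    # Rows t - window_size .. t - 1, all asset columns.
    return returns[t - window_size : t]

demo_returns = np.arange(12, dtype=float).reshape(6, 2)  # 6 steps, 2 assets
obs = window_observation(demo_returns, t=4, window_size=3)  # rows 1, 2, 3
```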