
Splits

polars_ts.validation.splits

Time series cross-validation splitters (group-aware, temporal).

expanding_window_cv(df, n_splits=5, horizon=1, step=1, gap=0, id_col='unique_id', time_col='ds')

Expanding-window time series cross-validation.

The training window starts at the beginning and grows by step time steps each fold. The test window is always horizon steps.

Parameters

df: Input DataFrame with time series data.
n_splits: Number of (train, test) folds to generate.
horizon: Number of time steps in each test fold.
step: Number of time steps the split point advances between folds.
gap: Number of time steps between training end and test start.
id_col: Column identifying each time series.
time_col: Column with timestamps for ordering.

Yields

tuple[pl.DataFrame, pl.DataFrame]: (train_df, test_df) for each fold.
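To make the interaction of n_splits, horizon, step, and gap concrete, here is a minimal pure-Python sketch of the index arithmetic for an expanding window. It is not the library's implementation: the helper name expanding_window_bounds and the n_steps argument are hypothetical, and it assumes that the last fold's test window ends at the final time step, which the documentation above does not state.

```python
def expanding_window_bounds(n_steps, n_splits=5, horizon=1, step=1, gap=0):
    """Return (train_range, test_range) index pairs for an expanding window.

    Assumption (not confirmed by the docs): the last fold's test window ends
    at the final time step, and earlier folds sit `step` time steps before it.
    """
    # Exclusive end of the training window in the first fold.
    first_split = n_steps - horizon - gap - (n_splits - 1) * step
    if first_split < 1:
        raise ValueError("not enough time steps for the requested folds")

    folds = []
    for i in range(n_splits):
        split = first_split + i * step
        train = range(0, split)  # always starts at the first time step
        test = range(split + gap, split + gap + horizon)
        folds.append((train, test))
    return folds


folds = expanding_window_bounds(n_steps=10, n_splits=3, horizon=2, step=2, gap=1)
for train, test in folds:
    print(list(train), list(test))
```

With these inputs the training window grows by two time steps per fold, and one time step is skipped between the end of training and the start of each test window.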

sliding_window_cv(df, n_splits=5, train_size=10, horizon=1, step=1, gap=0, id_col='unique_id', time_col='ds')

Sliding-window time series cross-validation with a fixed-size training window.

Parameters

df: Input DataFrame.
n_splits: Number of folds.
train_size: Fixed number of time steps in each training window.
horizon: Number of time steps in each test fold.
step: Number of time steps to advance between folds.
gap: Number of time steps between training end and test start.
id_col: Column identifying each time series.
time_col: Column with timestamps for ordering.

Yields

tuple[pl.DataFrame, pl.DataFrame]: (train_df, test_df) for each fold.
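The sliding variant differs only in where each training window starts. The sketch below illustrates that in plain Python; sliding_window_bounds and n_steps are hypothetical names, and the placement of folds (last test window ending at the final time step) is an assumption rather than documented behaviour.

```python
def sliding_window_bounds(n_steps, n_splits=5, train_size=10, horizon=1, step=1, gap=0):
    """Return (train_range, test_range) index pairs for a sliding window.

    Assumption (not confirmed by the docs): the last fold's test window ends
    at the final time step.
    """
    # Exclusive end of the training window in the first fold.
    first_split = n_steps - horizon - gap - (n_splits - 1) * step
    if first_split < train_size:
        raise ValueError("not enough time steps for a full training window")

    folds = []
    for i in range(n_splits):
        split = first_split + i * step
        train = range(split - train_size, split)  # fixed length, moves forward
        test = range(split + gap, split + gap + horizon)
        folds.append((train, test))
    return folds


sliding = sliding_window_bounds(n_steps=8, n_splits=3, train_size=4, horizon=2, step=1)
for train, test in sliding:
    print(list(train), list(test))
```

Every training window here has exactly four time steps; both its start and its end advance by step between folds.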

rolling_origin_cv(df, n_splits=5, initial_train_size=None, horizon=1, step=1, gap=0, fixed_train_size=None, id_col='unique_id', time_col='ds')

Perform general rolling-origin cross-validation.

Configurable as expanding (fixed_train_size=None) or sliding (fixed_train_size=k) window CV.

Parameters

df: Input DataFrame.
n_splits: Number of folds.
initial_train_size: Minimum number of time steps in the first training fold. If None, computed automatically from the other parameters.
horizon: Test window size in time steps.
step: Number of time steps to advance between successive split points.
gap: Number of time steps between training end and test start.
fixed_train_size: If set, the training window always has this size (sliding). If None, the training window expands.
id_col: Column identifying each time series.
time_col: Column with timestamps for ordering.

Yields

tuple[pl.DataFrame, pl.DataFrame]: (train_df, test_df) for each fold.
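The switch between expanding and sliding behaviour via fixed_train_size can be sketched as a single helper. This is an illustration only: rolling_origin_bounds and n_steps are hypothetical, and the rule used when initial_train_size is None (place the last test window at the final time step) is one plausible reading of "computed automatically", not a documented formula.

```python
def rolling_origin_bounds(n_steps, n_splits=5, initial_train_size=None,
                          horizon=1, step=1, gap=0, fixed_train_size=None):
    """Return (train_range, test_range) index pairs for rolling-origin CV."""
    if initial_train_size is None:
        # Assumed default: last test window ends at the final time step.
        initial_train_size = n_steps - horizon - gap - (n_splits - 1) * step
    if initial_train_size < 1:
        raise ValueError("not enough time steps for the requested folds")

    folds = []
    for i in range(n_splits):
        split = initial_train_size + i * step
        test_end = split + gap + horizon
        if test_end > n_steps:
            break  # assumption: skip folds whose test window would be truncated
        if fixed_train_size is None:
            start = 0  # expanding: keep all history
        else:
            start = max(0, split - fixed_train_size)  # sliding: keep last k steps
        folds.append((range(start, split), range(split + gap, test_end)))
    return folds


expanding = rolling_origin_bounds(12, n_splits=3, horizon=2, step=2)
fixed = rolling_origin_bounds(12, n_splits=3, horizon=2, step=2, fixed_train_size=4)
print([len(train) for train, _ in expanding])
print([len(train) for train, _ in fixed])
```

Both calls produce the same test windows; only the start of each training window changes.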

_rolling_origin_splits(df, n_splits, horizon, step, gap, fixed_train_size=None, initial_train_size=None, id_col='unique_id', time_col='ds')

Produce n_splits (train, test) folds from a time series DataFrame.
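As a rough picture of what a group-aware split point means, the sketch below turns a single split position into (train, test) row subsets for several series at once. It uses plain dictionaries rather than a DataFrame, the helper split_rows is hypothetical, and it assumes positions are counted per series after sorting by time_col; the documentation does not say whether the library counts positions per series or uses global timestamps.

```python
from collections import defaultdict


def split_rows(rows, split, horizon=1, gap=0, id_col="unique_id", time_col="ds"):
    """Split rows into (train, test) at a per-series positional split point."""
    # Group rows by series identifier so each series is split independently.
    by_series = defaultdict(list)
    for row in rows:
        by_series[row[id_col]].append(row)

    train, test = [], []
    for series_rows in by_series.values():
        ordered = sorted(series_rows, key=lambda r: r[time_col])
        train.extend(ordered[:split])
        test.extend(ordered[split + gap : split + gap + horizon])
    return train, test


# Two toy series, "a" and "b", each with six time steps.
rows = [{"unique_id": sid, "ds": t, "y": 10 * t} for sid in ("a", "b") for t in range(6)]
train_rows, test_rows = split_rows(rows, split=3, horizon=2, gap=1)
print(len(train_rows), len(test_rows))
```

Each series contributes its first three time steps to training and, after a one-step gap, its next two to testing, so no series leaks future rows into its own training set.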