Kmedoids

polars_ts.clustering.kmedoids

K-Medoids (PAM) clustering for time series using precomputed distances.

Delegates the PAM swap loop to the Rust extension when it is available (10-30x faster); otherwise falls back to a pure-Python implementation.
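The optional-backend behaviour described above can be sketched with a guarded import. This is only an illustration of the general pattern, not the library's actual dispatch code: the module name `hypothetical_rust_pam` and the helper `load_backend` are made up for the example.

```python
import importlib


def load_backend(module_name: str = "hypothetical_rust_pam"):
    """Return (module, "rust") if the compiled extension imports, else (None, "python").

    `hypothetical_rust_pam` is a placeholder name, not a real module.
    """
    try:
        return importlib.import_module(module_name), "rust"
    except ImportError:
        # Extension not installed or not built for this platform:
        # the caller should use its pure-Python code path instead.
        return None, "python"


backend, backend_name = load_backend()
print(f"PAM swap loop would run on the {backend_name} backend")
```

Resolving the backend once at import time keeps the per-call cost of the check at zero, and callers only need to branch on whether `backend` is `None`.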

TimeSeriesKMedoids

K-Medoids (PAM) time series clustering.

Parameters

n_clusters Number of clusters. Default 2.
metric Distance metric name (e.g. "dtw", "erp", "lcss"). Default "dtw".
max_iter Maximum swap iterations. Default 100.
seed Random seed for initial medoid selection. Default 42.
**distance_kwargs Extra keyword arguments forwarded to the distance function.

fit(df)

Fit k-medoids clustering.

Parameters

df DataFrame with columns unique_id and y.

Returns

self

_build_dist_matrix(dist_dict, str_ids)

Convert distance dict to flat n×n row-major matrix.
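A minimal sketch of what a flat row-major layout means, under stated assumptions: the distance dict is assumed here to be keyed by `(id_a, id_b)` tuples with one entry per unordered pair, and the distance is assumed symmetric. The real structure of `dist_dict` is not documented on this page, and `build_flat_dist_matrix` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def build_flat_dist_matrix(
    dist_dict: Dict[Tuple[str, str], float], str_ids: List[str]
) -> List[float]:
    """Flatten pairwise distances into a list of length n*n (row-major).

    Assumes keys are (id_a, id_b) tuples and distances are symmetric.
    """
    n = len(str_ids)
    index = {sid: i for i, sid in enumerate(str_ids)}
    flat = [0.0] * (n * n)  # diagonal stays 0.0
    for (a, b), d in dist_dict.items():
        i, j = index[a], index[b]
        flat[i * n + j] = d  # element (i, j) lives at offset i*n + j
        flat[j * n + i] = d  # mirror entry for the symmetric pair
    return flat


ids = ["a", "b", "c"]
pairs = {("a", "b"): 1.0, ("a", "c"): 4.0, ("b", "c"): 2.0}
flat = build_flat_dist_matrix(pairs, ids)
```

A single contiguous buffer like this is cheap to hand to a compiled extension, since element `(i, j)` is found at offset `i * n + j` without any nested containers.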

_kmedoids_rust(dist_dict, str_ids, k, max_iter, seed)

Run PAM via Rust extension.

_kmedoids_python(dist_dict, str_ids, k, max_iter, seed)

Pure-Python PAM swap fallback.
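For orientation, a PAM-style swap loop over a flat row-major distance matrix might look like the sketch below. It is a simplified greedy variant that accepts the first improving swap, and is not the library's implementation; the function name `pam_swap` and its return shape are assumptions made for the example.

```python
import random
from typing import List, Tuple


def pam_swap(
    flat: List[float], n: int, k: int, max_iter: int = 100, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Greedy swap search; returns (medoid indices, cluster label per point)."""

    def total_cost(medoids: List[int]) -> float:
        # Sum of each point's distance to its nearest medoid.
        return sum(min(flat[i * n + m] for m in medoids) for i in range(n))

    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)  # random initial medoids
    best = total_cost(medoids)
    for _ in range(max_iter):
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing the medoid at `pos` with a non-medoid point.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
        if not improved:
            break  # no swap lowers the cost: local optimum reached
    labels = [min(range(k), key=lambda c: flat[i * n + medoids[c]]) for i in range(n)]
    return medoids, labels


# Four points on a line forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
n = len(points)
flat = [abs(a - b) for a in points for b in points]
medoids, labels = pam_swap(flat, n, k=2)
```

Each candidate swap recomputes the full cost, so one pass costs O(k · n² · k) here; that repeated inner work is the kind of loop that benefits most from a compiled backend.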

kmedoids(df, k, method='dtw', max_iter=100, seed=42, id_col='unique_id', target_col='y', **distance_kwargs)

K-Medoids (PAM) clustering over time series.

Parameters

df DataFrame with columns id_col and target_col.
k Number of clusters.
method Distance metric name (e.g. "dtw", "erp", "lcss").
max_iter Maximum swap iterations.
seed Random seed for initial medoid selection.
id_col Column identifying each time series.
target_col Column with the time series values.
**distance_kwargs Extra keyword arguments forwarded to the distance function.

Returns

pl.DataFrame DataFrame with columns [id_col, "cluster"].