Kshape

polars_ts.clustering.kshape

k-Shape clustering for time series using Shape-Based Distance (SBD).

KShape

k-Shape clustering for time series.

Uses Shape-Based Distance (SBD) and shape extraction to iteratively refine cluster centroids and assignments.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n_clusters | int | Number of clusters. | 2 |
| max_iter | int | Maximum number of iterations. | 100 |

Examples:

>>> ks = KShape(n_clusters=3)
>>> ks.fit(df)
>>> ks.labels_

fit(df)

Fit k-Shape clustering.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | DataFrame | DataFrame with columns unique_id and y. | required |

Returns:

| Type | Description |
| --- | --- |
| KShape | self, with labels_ and centroids_ populated. |

_zscore(x)

Z-normalize a 1-D array. Returns zeros if std is zero.
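
The behaviour described above can be illustrated with a minimal, dependency-free sketch. The function name `zscore` and the use of the population standard deviation are assumptions made for illustration; the library's private `_zscore` may differ in both respects.

```python
import math


def zscore(x):
    """Z-normalize a sequence of floats (illustrative sketch, not the library code)."""
    n = len(x)
    if n == 0:
        return []
    mean = sum(x) / n
    # Population variance (divide by n); this choice is an assumption.
    var = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(var)
    if std == 0.0:
        # A constant series has no shape information, so return zeros.
        return [0.0] * n
    return [(v - mean) / std for v in x]


z = zscore([1.0, 2.0, 3.0, 4.0])
flat = zscore([5.0, 5.0, 5.0])
```

After normalization `z` has zero mean and unit variance, while the constant input yields a vector of zeros instead of a division-by-zero error.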

_sbd(x, y)

Shape-Based Distance between two z-normalized series.

Returns (distance, y_shifted) where y_shifted is y aligned to x.
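
As a rough illustration, SBD is commonly defined as one minus the maximum normalized cross-correlation over all shifts. The sketch below assumes equal-length inputs, uses a direct O(n^2) loop rather than an FFT, and pads the shifted series with zeros. The name `sbd` and the handling of all-zero inputs are assumptions, not the library's confirmed behaviour.

```python
import math


def sbd(x, y):
    """Return (distance, y_shifted) for two equal-length series (illustrative sketch)."""
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x)) * math.sqrt(sum(v * v for v in y))
    if norm == 0.0:
        # Assumption: treat an all-zero series as maximally uninformative.
        return 1.0, list(y)

    best_cc, best_shift = float("-inf"), 0
    for s in range(-(n - 1), n):
        # Cross-correlation of x with y shifted right by s positions.
        cc = sum(x[i] * y[i - s] for i in range(max(0, s), min(n, n + s)))
        if cc > best_cc:
            best_cc, best_shift = cc, s

    # Shift y by the best lag, filling vacated positions with zeros.
    y_shifted = [0.0] * n
    for i in range(n):
        j = i - best_shift
        if 0 <= j < n:
            y_shifted[i] = y[j]

    return 1.0 - best_cc / norm, y_shifted


x = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
y = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same bump, two steps later
dist, aligned = sbd(x, y)
```

Here `y` is the same shape as `x` delayed by two steps, so the distance is (numerically) zero and `aligned` coincides with `x`. The distance lies in the range 0 to 2.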

_shape_extraction(cluster_series, length)

Extract centroid shape via eigenvalue decomposition (Paparrizos & Gravano).

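Computes the leading eigenvector (the one associated with the largest eigenvalue) of the cross-correlation matrix S^T * S. This vector maximizes the sum of squared normalized cross-correlations between the centroid and the series in the cluster.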
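
The eigenvector step can be sketched without a linear-algebra library by using power iteration. The sketch assumes the rows of S are already aligned, equal-length series, skips the centering step used in the original paper, and returns a unit-norm vector. The name `shape_extraction`, the `n_iter` parameter, and the sign convention are all hypothetical; a real implementation would more likely use a dense eigensolver.

```python
import math


def shape_extraction(rows, n_iter=200):
    """Approximate the leading eigenvector of S^T S by power iteration (sketch)."""
    m = len(rows[0])
    # Start from the column sums; fall back to ones if they cancel out.
    v = [sum(r[j] for r in rows) for j in range(m)]
    if all(c == 0.0 for c in v):
        v = [1.0] * m

    for _ in range(n_iter):
        # Multiply by S^T S without forming the m x m matrix: w = S^T (S v).
        proj = [sum(a * b for a, b in zip(r, v)) for r in rows]
        w = [sum(p * r[j] for p, r in zip(proj, rows)) for j in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]

    # An eigenvector's sign is arbitrary; orient it toward the cluster members.
    if sum(sum(a * b for a, b in zip(r, v)) for r in rows) < 0:
        v = [-c for c in v]
    return v


base = [0.0, 1.0, 2.0, 1.0, 0.0]
cluster = [base, [2.0 * c for c in base], base]
centroid = shape_extraction(cluster)
```

Because every row in `cluster` is a multiple of `base`, the extracted centroid is `base` scaled to unit length. Power iteration can converge slowly when the two largest eigenvalues are close, which is one reason a dense eigensolver is usually preferred in practice.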