Deep cluster

polars_ts.clustering.deep_cluster

Deep Embedded Clustering (DEC) and Improved DEC (IDEC).

Pretrains a convolutional autoencoder on reconstruction, then fine-tunes with a KL-divergence clustering loss. IDEC additionally keeps the reconstruction loss during fine-tuning.

References

  • Xie et al. (2016). Unsupervised Deep Embedding for Clustering Analysis. ICML.
  • Guo et al. (2017). Improved Deep Embedded Clustering with Local Structure Preservation. IJCAI.
  • Autoencoder-based Deep Clustering Survey (2025). arXiv:2504.02087.

DECClusterer

Deep Embedded Clustering for time series.

Two-phase training:

  1. Pretrain autoencoder with MSE reconstruction loss.
  2. Fine-tune encoder + clustering layer with KL divergence loss.
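The clustering loss in phase 2 can be sketched as follows, assuming the formulation from Xie et al. (2016): a Student's t soft assignment `q` between embeddings and centroids, a sharpened target distribution `p`, and the loss KL(P || Q). This is a pure-Python illustration rather than the library's code; the function names and the `alpha` parameter are hypothetical.

```python
import math


def soft_assign(z, centroids, alpha=1.0):
    """Student's t similarity between each embedding and each centroid.

    Returns one row per embedding; each row sums to 1.
    """
    q = []
    for zi in z:
        row = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mu))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Sharpen q: square entries, divide by soft cluster frequency, renormalize."""
    k = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        w = [row[j] ** 2 / freq[j] for j in range(k)]
        total = sum(w)
        p.append([v / total for v in w])
    return p


def kl_divergence(p, q):
    """KL(P || Q), summed over all samples and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0
    )


# Toy 2-D embeddings: two points near each of two centroids.
z = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]
q = soft_assign(z, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

Because `p` is a sharpened version of `q`, minimizing the loss pushes each embedding toward the centroid it already favors.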

Parameters

  • n_clusters: Number of clusters.
  • embedding_dim: Bottleneck embedding dimension.
  • n_filters: Base CNN filter count.
  • pretrain_epochs: Epochs for autoencoder pretraining.
  • finetune_epochs: Epochs for clustering fine-tuning.
  • lr: Learning rate.
  • batch_size: Training batch size.
  • seed: Random seed.
  • id_col, target_col: Column names.

fit(df)

Pretrain autoencoder, then fine-tune with clustering loss.

_prepare_data(df)

Extract, pad, and normalize series.
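A minimal sketch of what padding and normalizing could look like, in pure Python. The page does not state the padding value, the normalization method, or the order of the two steps, so the choices here (per-series z-normalization, then zero padding at the end) are assumptions, and `pad_and_normalize` is a hypothetical helper, not part of the library.

```python
import math


def pad_and_normalize(series_by_id, pad_value=0.0):
    """Z-normalize each series, then right-pad all series to the longest length.

    Assumption: normalizing before padding keeps the pad values from
    distorting each series' mean and standard deviation.
    """
    ids = sorted(series_by_id)
    max_len = max(len(series_by_id[i]) for i in ids)
    rows = []
    for i in ids:
        values = series_by_id[i]
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        scale = std if std > 0 else 1.0  # constant series: avoid dividing by zero
        normalized = [(v - mean) / scale for v in values]
        rows.append(normalized + [pad_value] * (max_len - len(values)))
    return ids, rows


ids, X = pad_and_normalize({"a": [1.0, 2.0, 3.0], "b": [5.0, 5.0]})
```

The result is a rectangular list of rows, one per series, which is the shape a convolutional autoencoder needs as input.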

_pretrain(ae, X_t)

Pretrain autoencoder with reconstruction loss.

_finetune(ae, cl, X_t)

Fine-tune encoder + clustering layer with KL divergence.

_kmeans_centroids(X, k, seed, max_iter=100) staticmethod

Run k-means and return centroids.
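For illustration, a plain Lloyd's-algorithm k-means with the same argument names might look like the sketch below. How the library initializes centroids or handles empty clusters is not documented here, so random sampling of `k` points and keeping the previous centroid for an empty cluster are assumptions.

```python
import random


def kmeans_centroids(X, k, seed, max_iter=100):
    """Run Lloyd's algorithm on a list of points and return k centroids."""
    rng = random.Random(seed)
    # Assumption: initialize with k distinct points sampled from X.
    centroids = [list(x) for x in rng.sample(X, k)]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for x in X:
            dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
            groups[dists.index(min(dists))].append(x)
        # Update step: move each centroid to the mean of its points.
        new_centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids


X = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids = sorted(kmeans_centroids(X, k=2, seed=42))
```

In DEC, centroids obtained this way from the pretrained embeddings typically serve as the starting values of the clustering layer.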

IDECClusterer

Bases: DECClusterer

Improved Deep Embedded Clustering for time series.

Like DEC but keeps the reconstruction loss during fine-tuning, weighted by gamma.

Parameters

  • gamma: Weight of the reconstruction loss during fine-tuning. Default 0.1.

All other parameters are inherited from DECClusterer.

_finetune(ae, cl, X_t)

Fine-tune with KL divergence + reconstruction loss.
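Following the description on this page, the combined objective can be sketched as KL divergence plus `gamma` times the MSE reconstruction error. The helper names below are hypothetical, and the toy inputs stand in for the soft assignments and autoencoder outputs that the real training loop would produce.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q), summed over all samples and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0
    )


def mse(x, x_hat):
    """Mean squared error over every element of a batch of series."""
    n = sum(len(row) for row in x)
    return sum(
        (a - b) ** 2 for row, row_hat in zip(x, x_hat) for a, b in zip(row, row_hat)
    ) / n


def idec_loss(p, q, x, x_hat, gamma=0.1):
    """Clustering loss plus gamma-weighted reconstruction loss."""
    return kl_divergence(p, q) + gamma * mse(x, x_hat)


# With p == q the clustering term vanishes, leaving gamma * MSE.
q = [[0.9, 0.1], [0.2, 0.8]]
x = [[1.0, 2.0]]
x_hat = [[1.0, 4.0]]
loss = idec_loss(q, q, x, x_hat, gamma=0.1)
```

Keeping the reconstruction term means the decoder stays in use during fine-tuning, so the embedding is not shaped by the clustering loss alone.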

dec_cluster(df, k, pretrain_epochs=50, finetune_epochs=50, embedding_dim=64, n_filters=32, seed=42, id_col='unique_id', target_col='y', **kwargs)

DEC clustering convenience function.

Returns

pl.DataFrame: DataFrame with columns [id_col, "cluster"].

idec_cluster(df, k, pretrain_epochs=50, finetune_epochs=50, embedding_dim=64, n_filters=32, gamma=0.1, seed=42, id_col='unique_id', target_col='y', **kwargs)

IDEC clustering convenience function.

Returns

pl.DataFrame: DataFrame with columns [id_col, "cluster"].
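The two convenience functions share a call shape, which the sketch below illustrates on toy data. It assumes that polars and polars_ts (with its deep-learning dependencies) are installed, that the functions are importable from the module path shown at the top of this page, and that the input is a long-format frame in which row order within each id gives the time order. `make_long_records` and `run_example` are hypothetical helpers, and the short epoch counts are chosen only to keep the demo fast.

```python
import math


def make_long_records(n_per_group=3, length=24):
    """Build toy long-format columns: sine-shaped and ramp-shaped series."""
    ids, ys = [], []
    for g in range(n_per_group):
        for t in range(length):
            ids.append(f"sine_{g}")
            ys.append(math.sin(2 * math.pi * t / 12) + 0.01 * g)
        for t in range(length):
            ids.append(f"ramp_{g}")
            ys.append(t / length + 0.01 * g)
    return {"unique_id": ids, "y": ys}


def run_example():
    """Cluster the toy data with both functions (requires polars_ts)."""
    # Imports are local so the data helper above works without the library.
    import polars as pl
    from polars_ts.clustering.deep_cluster import dec_cluster, idec_cluster

    df = pl.DataFrame(make_long_records())
    dec_labels = dec_cluster(df, k=2, pretrain_epochs=10, finetune_epochs=10)
    idec_labels = idec_cluster(
        df, k=2, pretrain_epochs=10, finetune_epochs=10, gamma=0.1
    )
    # Each result is documented to have columns [id_col, "cluster"].
    return dec_labels, idec_labels


records = make_long_records()
```

Calling `run_example()` should return one label frame per function; the column names `unique_id` and `y` match the documented defaults for `id_col` and `target_col`.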