Deep cluster
polars_ts.clustering.deep_cluster
Deep Embedded Clustering (DEC) and Improved DEC (IDEC).
Pretrains a convolutional autoencoder on reconstruction, then fine-tunes with a KL-divergence clustering loss. IDEC additionally keeps the reconstruction loss during fine-tuning.
References
- Xie et al. (2016). Unsupervised Deep Embedding for Clustering Analysis. ICML.
- Guo et al. (2017). Improved Deep Embedded Clustering with Local Structure Preservation. IJCAI.
- Autoencoder-based Deep Clustering Survey (2025). arXiv:2504.02087.
DECClusterer
Deep Embedded Clustering for time series.
Two-phase training:
1. Pretrain the autoencoder with an MSE reconstruction loss.
2. Fine-tune the encoder and clustering layer with a KL-divergence loss.
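The clustering loss in phase 2 can be sketched as below, following the formulation in Xie et al. (2016): Student's t soft assignments, a sharpened target distribution, and KL(P || Q). This is a plain-Python illustration, not the library's implementation; the function names, the `alpha=1.0` default, and the toy embeddings are assumptions made for the example.

```python
import math

def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignments q[i][j] of embeddings to centroids."""
    q = []
    for zi in z:
        row = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mu))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q

def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, normalized per row."""
    k = len(q[0])
    f = [sum(row[j] for row in q) for j in range(k)]  # soft cluster sizes
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(k)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over samples and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0
    )

# Toy 2-D embeddings forming two obvious groups (illustrative data only).
z = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]
q = soft_assign(z, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

Minimizing this loss pulls each soft assignment toward its sharpened target, so confident assignments become more confident as fine-tuning proceeds.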
Parameters
- n_clusters: Number of clusters.
- embedding_dim: Bottleneck embedding dimension.
- n_filters: Base CNN filter count.
- pretrain_epochs: Epochs for autoencoder pretraining.
- finetune_epochs: Epochs for clustering fine-tuning.
- lr: Learning rate.
- batch_size: Training batch size.
- seed: Random seed.
- id_col, target_col: Column names.
fit(df)
Pretrain autoencoder, then fine-tune with clustering loss.
_prepare_data(df)
Extract, pad, and normalize series.
_pretrain(ae, X_t)
Pretrain autoencoder with reconstruction loss.
_finetune(ae, cl, X_t)
Fine-tune encoder + clustering layer with KL divergence.
_kmeans_centroids(X, k, seed, max_iter=100)
staticmethod
Run k-means and return centroids.
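A minimal sketch of k-means (Lloyd's algorithm) with the same argument list is shown below. Initializing from randomly sampled data points and leaving empty clusters unchanged are assumptions for the example; the library's actual choices are not documented here.

```python
import random

def kmeans_centroids(X, k, seed, max_iter=100):
    """Lloyd's algorithm; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    centroids = [list(x) for x in rng.sample(X, k)]  # init from data points
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centroids[j])),
            )
            for x in X
        ]
        # Update step: mean of assigned points; empty clusters keep their centroid.
        new_centroids = []
        for j in range(k):
            members = [x for x, lab in zip(X, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids

# Two well-separated groups (illustrative data only).
X = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids = kmeans_centroids(X, k=2, seed=0)
```

In DEC, centroids obtained this way from the pretrained embeddings typically initialize the clustering layer before fine-tuning starts.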
IDECClusterer
Bases: DECClusterer
Improved Deep Embedded Clustering for time series.
Like DEC, but keeps the reconstruction loss during fine-tuning, weighted by gamma.
Parameters
- gamma: Weight of the reconstruction loss during fine-tuning. Default 0.1.
All other parameters are inherited from DECClusterer.
_finetune(ae, cl, X_t)
Fine-tune with KL divergence + reconstruction loss.
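Following the description above, the combined objective can be sketched as the KL clustering loss plus gamma times the reconstruction error. The helper names `mse` and `idec_loss` and the toy inputs are hypothetical; the library computes this loss on tensors inside its training loop.

```python
def mse(x, x_hat):
    """Mean squared reconstruction error over all elements."""
    n = sum(len(row) for row in x)
    return sum(
        (a - b) ** 2 for xr, hr in zip(x, x_hat) for a, b in zip(xr, hr)
    ) / n

def idec_loss(kl_loss, x, x_hat, gamma=0.1):
    """Clustering loss plus gamma-weighted reconstruction loss."""
    return kl_loss + gamma * mse(x, x_hat)

# Toy inputs and reconstructions (illustrative data only).
x = [[1.0, 2.0], [3.0, 4.0]]
x_hat = [[1.0, 2.0], [3.0, 2.0]]
total = idec_loss(kl_loss=0.5, x=x, x_hat=x_hat, gamma=0.1)
```

With `gamma=0` the expression reduces to the pure KL objective used by DEC.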
dec_cluster(df, k, pretrain_epochs=50, finetune_epochs=50, embedding_dim=64, n_filters=32, seed=42, id_col='unique_id', target_col='y', **kwargs)
DEC clustering convenience function.
Returns
pl.DataFrame
DataFrame with columns [id_col, "cluster"].
idec_cluster(df, k, pretrain_epochs=50, finetune_epochs=50, embedding_dim=64, n_filters=32, gamma=0.1, seed=42, id_col='unique_id', target_col='y', **kwargs)
IDEC clustering convenience function.
Returns
pl.DataFrame
DataFrame with columns [id_col, "cluster"].