Joint optimization of an autoencoder for clustering and embedding

Research output: Journal contributions › Journal articles › Research › Peer-review

Deep embedded clustering has become a dominant approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder's embedding. This diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative in which the autoencoder and the clustering are learned simultaneously. This is achieved by providing novel theoretical insight: we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, which thus inherits the built-in clustering capabilities of the GMM. That simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture, as it outperforms related baselines on several data sets.
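The abstract describes the joint architecture only at a high level. The sketch below is a minimal, hypothetical PyTorch illustration of the general idea: a clustering module attached to the autoencoder's bottleneck is trained jointly with the reconstruction loss. The class names, layer sizes, and the specific form of the clustering loss (soft assignment to learnable centroids plus an embedding-reconstruction term) are assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch of joint clustering-and-embedding training (assumed details,
# not the authors' exact method): a deep autoencoder whose bottleneck feeds a
# one-hidden-layer "clustering module" that soft-assigns embeddings to centroids.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClusteringModule(nn.Module):
    """Soft-assigns embeddings to k centroids and reconstructs each embedding
    as a convex combination of the centroids (an autoencoder-like bottleneck)."""
    def __init__(self, embed_dim, n_clusters):
        super().__init__()
        self.centroids = nn.Parameter(torch.randn(n_clusters, embed_dim))

    def forward(self, z):
        dists = torch.cdist(z, self.centroids) ** 2      # (batch, k) squared distances
        assign = F.softmax(-dists, dim=1)                 # soft cluster responsibilities
        z_rec = assign @ self.centroids                   # (batch, embed_dim)
        return assign, z_rec

class DeepClusteringAE(nn.Module):
    """Deep autoencoder with the clustering module attached to its embedding."""
    def __init__(self, in_dim=784, embed_dim=10, n_clusters=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, embed_dim))
        self.decoder = nn.Sequential(nn.Linear(embed_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))
        self.cluster = ClusteringModule(embed_dim, n_clusters)

    def forward(self, x):
        z = self.encoder(x)
        assign, z_rec = self.cluster(z)
        x_rec = self.decoder(z)
        return x_rec, z, z_rec, assign

def joint_loss(x, x_rec, z, z_rec, alpha=1.0):
    # Input reconstruction plus reconstruction of the embedding by the clustering
    # module; the second term plays the role of the GMM/k-means objective, so both
    # the embedding and the clustering are optimized in a single backward pass.
    return F.mse_loss(x_rec, x) + alpha * F.mse_loss(z_rec, z)

# Usage sketch: one optimizer updates encoder, decoder, and centroids together.
# model = DeepClusteringAE()
# opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# x_rec, z, z_rec, assign = model(x)
# joint_loss(x, x_rec, z, z_rec).backward(); opt.step()
```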

Original language: English
Journal: Machine Learning
Volume: 110
Issue number: 7
Pages (from-to): 1901-1937
Number of pages: 37
ISSN: 0885-6125
DOIs
Publication status: Published - 01.07.2021
