Joint optimization of an autoencoder for clustering and embedding
Research output: Journal contributions › Journal articles › Research › peer-review
Standard
In: Machine Learning, Vol. 110, No. 7, 01.07.2021, pp. 1901-1937.
RIS
TY - JOUR
T1 - Joint optimization of an autoencoder for clustering and embedding
AU - Boubekki, Ahcène
AU - Kampffmeyer, Michael
AU - Brefeld, Ulf
AU - Jenssen, Robert
PY - 2021/7/1
Y1 - 2021/7/1
AB - Deep embedded clustering has become a dominant approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between training a deep autoencoder and running a k-means clustering on the autoencoder's embedding. This diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative in which the autoencoder and the clustering are learned simultaneously. This is achieved through a novel theoretical insight: we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, which thus inherits the built-in clustering capabilities of the GMM. This simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture, as it outperforms related baselines on several data sets.
KW - Clustering
KW - Deep autoencoders
KW - Embedding
KW - Gaussian mixture models
KW - k-means
KW - Informatics
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=85109174419&partnerID=8YFLogxK
DO - 10.1007/s10994-021-06015-5
M3 - Journal articles
AN - SCOPUS:85109174419
VL - 110
SP - 1901
EP - 1937
JO - Machine Learning
JF - Machine Learning
SN - 0885-6125
IS - 7
ER -
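For readers who want a concrete picture of the architecture the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes a clustering module whose softmax hidden layer yields soft assignments to learnable centroids and whose output reconstructs the embedding as a convex combination of those centroids, attached to the bottleneck of a deep autoencoder so that clustering and embedding are trained jointly. All names (ClusteringModule, DeepClusteringAE), layer sizes, and the reconstruction-only combined loss are illustrative assumptions; the paper's exact loss terms are not reproduced here.

```python
# Illustrative sketch (assumptions, not the paper's code): a one-hidden-layer
# clustering module at the bottleneck of a deep autoencoder, trained jointly
# in a single backward pass instead of alternating AE training with k-means.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClusteringModule(nn.Module):
    """One-hidden-layer autoencoder over the embedding space.

    The hidden layer produces soft assignments to k centroids; the output
    reconstructs the embedding as an assignment-weighted sum of centroids.
    """
    def __init__(self, embed_dim: int, n_clusters: int):
        super().__init__()
        # Learnable centroids in the embedding space (assumed parameterization).
        self.centroids = nn.Parameter(torch.randn(n_clusters, embed_dim))

    def forward(self, z: torch.Tensor):
        # Soft assignments from negative squared distances to the centroids.
        dists = torch.cdist(z, self.centroids) ** 2      # (batch, k)
        assignments = F.softmax(-dists, dim=1)           # soft cluster memberships
        z_hat = assignments @ self.centroids             # convex combination of centroids
        return z_hat, assignments

class DeepClusteringAE(nn.Module):
    """Deep autoencoder with the clustering module at its bottleneck."""
    def __init__(self, in_dim: int = 784, embed_dim: int = 10, n_clusters: int = 10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 500), nn.ReLU(),
            nn.Linear(500, embed_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(embed_dim, 500), nn.ReLU(),
            nn.Linear(500, in_dim),
        )
        self.cluster = ClusteringModule(embed_dim, n_clusters)

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)
        z_hat, assignments = self.cluster(z)
        x_hat = self.decoder(z)
        return x_hat, z, z_hat, assignments

# Joint training step: one combined loss, one backward pass -- no alternation
# between autoencoder training and a separate k-means step. The equal weighting
# of the two terms below is an assumption for illustration.
model = DeepClusteringAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 784)                                 # dummy input batch
x_hat, z, z_hat, _ = model(x)
loss = F.mse_loss(x_hat, x) + F.mse_loss(z_hat, z)
opt.zero_grad()
loss.backward()
opt.step()
```

The second loss term pulls embeddings toward their assignment-weighted centroids, which is how (under these simplifying assumptions) the gradient from the clustering objective shapes the embedding during the same update that trains the autoencoder.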