Joint optimization of an autoencoder for clustering and embedding

Publication: Contributions to journals › Journal articles › Research › peer-reviewed

Standard

Joint optimization of an autoencoder for clustering and embedding. / Boubekki, Ahcène; Kampffmeyer, Michael; Brefeld, Ulf et al.
In: Machine Learning, Vol. 110, No. 7, 01.07.2021, pp. 1901-1937.

Vancouver

Boubekki A, Kampffmeyer M, Brefeld U, Jenssen R. Joint optimization of an autoencoder for clustering and embedding. Machine Learning. 2021 Jul 1;110(7):1901-1937. doi: 10.1007/s10994-021-06015-5

BibTeX

@article{5d22629599494a47993f95b18dfd860c,
title = "Joint optimization of an autoencoder for clustering and embedding",
abstract = "Deep embedded clustering has become a dominant approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder{\textquoteright}s embedding. The diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing novel theoretical insight, where we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, which thus inherits the built-in clustering capabilities of the GMM. This simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture as it outperforms related baselines on several data sets.",
keywords = "Clustering, Deep autoencoders, Embedding, Gaussian mixture models, k-means, Informatics, Business informatics",
author = "Ahc{\`e}ne Boubekki and Michael Kampffmeyer and Ulf Brefeld and Robert Jenssen",
year = "2021",
month = jul,
day = "1",
doi = "10.1007/s10994-021-06015-5",
language = "English",
volume = "110",
pages = "1901--1937",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer Netherlands",
number = "7",
}

RIS

TY - JOUR

T1 - Joint optimization of an autoencoder for clustering and embedding

AU - Boubekki, Ahcène

AU - Kampffmeyer, Michael

AU - Brefeld, Ulf

AU - Jenssen, Robert

PY - 2021/7/1

Y1 - 2021/7/1

N2 - Deep embedded clustering has become a dominant approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder’s embedding. The diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing novel theoretical insight, where we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, which thus inherits the built-in clustering capabilities of the GMM. This simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture as it outperforms related baselines on several data sets.

AB - Deep embedded clustering has become a dominant approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder’s embedding. The diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing novel theoretical insight, where we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, which thus inherits the built-in clustering capabilities of the GMM. This simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture as it outperforms related baselines on several data sets.

KW - Clustering

KW - Deep autoencoders

KW - Embedding

KW - Gaussian mixture models

KW - k-means

KW - Informatics

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=85109174419&partnerID=8YFLogxK

U2 - 10.1007/s10994-021-06015-5

DO - 10.1007/s10994-021-06015-5

M3 - Journal articles

AN - SCOPUS:85109174419

VL - 110

SP - 1901

EP - 1937

JO - Machine Learning

JF - Machine Learning

SN - 0885-6125

IS - 7

ER -
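
The records above all carry the same technical claim: the objective of a certain class of Gaussian mixture models can be rewritten as the reconstruction loss of a one-hidden-layer autoencoder (the "clustering module"), whose softmax hidden layer yields soft cluster assignments and whose decoder weights act as centroids, so that plugging it into a deep autoencoder lets the clustering and the embedding be learned jointly. Below is a minimal PyTorch sketch of that construction. It is not the authors' released code: the class name ClusteringModule, the distance-based softmax assignment, and the simple two-term loss are illustrative assumptions; the paper's actual objective contains further GMM-derived terms.

    import torch
    import torch.nn as nn

    class ClusteringModule(nn.Module):
        """Hypothetical one-hidden-layer autoencoder with built-in clustering.

        The hidden layer is a softmax over negative squared distances to
        learnable centroids; the decoder reconstructs a latent point as a
        convex combination of those centroids.
        """

        def __init__(self, latent_dim: int, n_clusters: int):
            super().__init__()
            # Centroids double as decoder weights.
            self.centroids = nn.Parameter(torch.randn(n_clusters, latent_dim))

        def forward(self, z):
            sq_dists = torch.cdist(z, self.centroids) ** 2   # (batch, K)
            assign = torch.softmax(-sq_dists, dim=1)         # soft assignments
            z_hat = assign @ self.centroids                  # (batch, latent_dim)
            return z_hat, assign

    # Joint training sketch: a deep autoencoder plus the clustering module
    # acting on its embedding, optimized together (loss weighting is assumed).
    encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
    decoder = nn.Sequential(nn.Linear(10, 256), nn.ReLU(), nn.Linear(256, 784))
    cm = ClusteringModule(latent_dim=10, n_clusters=10)
    params = list(encoder.parameters()) + list(decoder.parameters()) + list(cm.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)
    mse = nn.MSELoss()

    x = torch.rand(32, 784)                     # stand-in mini-batch
    z = encoder(x)
    z_hat, assign = cm(z)
    loss = mse(decoder(z), x) + mse(z_hat, z)   # reconstruction + clustering terms
    opt.zero_grad()
    loss.backward()
    opt.step()

Tying the centroids to the decoder weights is what makes the hidden layer's soft assignments meaningful as cluster memberships, and optimizing both reconstruction terms in one backward pass is what the abstract contrasts with the usual alternating (diachronic) autoencoder/k-means scheme.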
