Joint optimization of an autoencoder for clustering and embedding

Research output: Journal contributions › Journal articles › Research › peer-review

Standard

Joint optimization of an autoencoder for clustering and embedding. / Boubekki, Ahcène; Kampffmeyer, Michael; Brefeld, Ulf et al.
In: Machine Learning, Vol. 110, No. 7, 01.07.2021, p. 1901-1937.


Vancouver

Boubekki A, Kampffmeyer M, Brefeld U, Jenssen R. Joint optimization of an autoencoder for clustering and embedding. Machine Learning. 2021 Jul 1;110(7):1901-1937. doi: 10.1007/s10994-021-06015-5

Bibtex

@article{5d22629599494a47993f95b18dfd860c,
title = "Joint optimization of an autoencoder for clustering and embedding",
abstract = "Deep embedded clustering has become a dominating approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder{\textquoteright}s embedding. The diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing novel theoretical insight, where we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, thus inheriting the built-in clustering capabilities of the GMM. That simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture as it outperforms related baselines on several data sets.",
keywords = "Clustering, Deep autoencoders, Embedding, Gaussian mixture models, k-means, Informatics, Business informatics",
author = "Ahc{\`e}ne Boubekki and Michael Kampffmeyer and Ulf Brefeld and Robert Jenssen",
year = "2021",
month = jul,
day = "1",
doi = "10.1007/s10994-021-06015-5",
language = "English",
volume = "110",
pages = "1901--1937",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer Netherlands",
number = "7",

}

RIS

TY - JOUR

T1 - Joint optimization of an autoencoder for clustering and embedding

AU - Boubekki, Ahcène

AU - Kampffmeyer, Michael

AU - Brefeld, Ulf

AU - Jenssen, Robert

PY - 2021/7/1

Y1 - 2021/7/1

N2 - Deep embedded clustering has become a dominating approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder’s embedding. The diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing novel theoretical insight, where we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, thus inheriting the built-in clustering capabilities of the GMM. That simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture as it outperforms related baselines on several data sets.

AB - Deep embedded clustering has become a dominating approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder’s embedding. The diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing novel theoretical insight, where we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, thus inheriting the built-in clustering capabilities of the GMM. That simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture as it outperforms related baselines on several data sets.

KW - Clustering

KW - Deep autoencoders

KW - Embedding

KW - Gaussian mixture models

KW - k-means

KW - Informatics

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=85109174419&partnerID=8YFLogxK

U2 - 10.1007/s10994-021-06015-5

DO - 10.1007/s10994-021-06015-5

M3 - Journal articles

AN - SCOPUS:85109174419

VL - 110

SP - 1901

EP - 1937

JO - Machine Learning

JF - Machine Learning

SN - 0885-6125

IS - 7

ER -
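The abstract describes a one-hidden-layer autoencoder (the "clustering module") whose encoder produces soft cluster assignments and whose decoder reconstructs each input as a combination of cluster centroids, mirroring a Gaussian mixture model. The following is a minimal sketch of that idea on toy data, not the authors' exact model or training procedure: the softmax temperature `beta` and the EM-style centroid update are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated Gaussian blobs in 2-D.
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
               rng.normal(3.0, 0.3, (50, 2))])

K = 2                                  # number of clusters
M = rng.normal(0.0, 1.0, (K, X.shape[1]))  # centroids, shared by encoder and decoder

def forward(X, M, beta=10.0):
    # Encoder: softmax over negative squared distances -> soft assignments P.
    d2 = ((X[:, None, :] - M[None, :, :]) ** 2).sum(axis=-1)
    P = np.exp(-beta * d2)
    P /= P.sum(axis=1, keepdims=True)
    # Decoder: reconstruct each point as a convex combination of centroids.
    return P, P @ M

for _ in range(100):
    P, _ = forward(X, M)
    # EM-like update: move each centroid to the responsibility-weighted mean.
    M = (P.T @ X) / P.sum(axis=0)[:, None]

P, Xhat = forward(X, M)
print(np.mean((X - Xhat) ** 2))  # final reconstruction error
```

As the centroids settle on the two blobs, the soft assignments become near-hard and the reconstruction error approaches the within-cluster variance, which is the sense in which clustering and reconstruction are optimized jointly rather than in alternation.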
