Joint optimization of an autoencoder for clustering and embedding

Research output: Journal contributions › Journal articles › Research › peer-review

Authors

Deep embedded clustering has become a dominant approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between the training of a deep autoencoder and a k-means clustering of the autoencoder's embedding. This diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing a novel theoretical insight: we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, which thus inherits the built-in clustering capabilities of the GMM. That simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture, as it outperforms related baselines on several data sets.
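The link between a GMM and a one-hidden-layer autoencoder can be illustrated with a small sketch. It is not the paper's clustering module or its loss function. It only checks a standard identity under stated assumptions: for an isotropic GMM with equal mixture weights and a shared variance, the posterior cluster responsibilities equal a softmax over an affine function of the input, so they can be read as the output of a softmax encoder, while a linear decoder whose weights are the centroids maps the soft assignment back to input space. The function names, the variance value and the random data are all hypothetical.

```python
import numpy as np

def gmm_responsibilities(X, mu, sigma2):
    """Posterior p(k | x) for an isotropic GMM with equal mixture weights (assumption)."""
    sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    log_p = -sq_dist / (2.0 * sigma2)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def softmax_encoder(X, mu, sigma2):
    """Softmax layer whose weights and biases are derived from the centroids."""
    W = mu / sigma2                              # shape (K, d)
    b = -(mu ** 2).sum(axis=1) / (2.0 * sigma2)  # shape (K,)
    logits = X @ W.T + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def linear_decoder(gamma, mu):
    """Reconstruct each point as the responsibility-weighted mean of the centroids."""
    return gamma @ mu

# Hypothetical toy data: 6 points in 2-D, 3 centroids, shared variance 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
mu = rng.normal(size=(3, 2))
sigma2 = 0.5

gamma_gmm = gmm_responsibilities(X, mu, sigma2)
gamma_ae = softmax_encoder(X, mu, sigma2)
X_hat = linear_decoder(gamma_ae, mu)

# The two soft assignments agree up to floating-point error.
print(np.allclose(gamma_gmm, gamma_ae))
```

The identity holds because the term ||x||² in the squared distance is the same for every cluster and cancels inside the softmax, leaving logits that are affine in x.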

Original language: English
Journal: Machine Learning
Volume: 110
Issue number: 7
Pages (from-to): 1901-1937
Number of pages: 37
ISSN: 0885-6125
Publication status: Published - 01.07.2021
