Joint optimization of an autoencoder for clustering and embedding

Research output: Journal contributions › Journal articles › Research › peer-review

Authors

Deep embedded clustering has become a dominant approach to the unsupervised categorization of objects with deep neural networks. The most popular methods alternate between training a deep autoencoder and running a k-means clustering on the autoencoder’s embedding. This diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing a novel theoretical insight: we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, which thus inherits the built-in clustering capabilities of the GMM. That simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder, resulting in a deep clustering model able to jointly learn a clustering and an embedding. Experiments confirm the equivalence between the clustering module and Gaussian mixture models. Further evaluations affirm the empirical relevance of our deep architecture, as it outperforms related baselines on several data sets.
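
The link between a Gaussian mixture model and a one-hidden-layer network can be illustrated informally. The sketch below is not the paper's clustering module or its loss function. It only assumes the simplest case, an isotropic GMM with equal mixture weights and a shared variance, where the posterior responsibilities are known to equal a softmax over an affine function of the input. The function names, the `sigma2` parameter, and the toy centroids are hypothetical and chosen for illustration only.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gmm_responsibilities(x, centroids, sigma2=1.0):
    # Posterior cluster probabilities of an isotropic GMM with equal
    # weights (assumption): softmax of -||x - mu_k||^2 / (2 * sigma2).
    scores = [
        -sum((xi - mi) ** 2 for xi, mi in zip(x, mu)) / (2 * sigma2)
        for mu in centroids
    ]
    return softmax(scores)

def encoder_softmax(x, centroids, sigma2=1.0):
    # The same quantity written as an affine layer followed by softmax:
    # weight w_k = mu_k / sigma2, bias b_k = -||mu_k||^2 / (2 * sigma2).
    # The term ||x||^2 is shared by all clusters and cancels in the softmax.
    scores = [
        sum(xi * mi for xi, mi in zip(x, mu)) / sigma2
        - sum(mi * mi for mi in mu) / (2 * sigma2)
        for mu in centroids
    ]
    return softmax(scores)

def decode(gamma, centroids):
    # Linear "decoder": reconstruct x as a convex combination of centroids.
    dim = len(centroids[0])
    return [sum(g * mu[d] for g, mu in zip(gamma, centroids)) for d in range(dim)]

# Toy data (hypothetical): two centroids and one point close to the first.
centroids = [[0.0, 0.0], [4.0, 4.0]]
x = [0.5, 0.2]

gamma_gmm = gmm_responsibilities(x, centroids)
gamma_enc = encoder_softmax(x, centroids)
reconstruction = decode(gamma_enc, centroids)
```

Both functions return the same soft assignments, so in this restricted setting the hidden layer of the small autoencoder can be read as a cluster membership vector, and the decoder weights play the role of the centroids.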

Original language: English
Journal: Machine Learning
Volume: 110
Issue number: 7
Pages (from-to): 1901-1937
Number of pages: 37
ISSN: 0885-6125
DOIs
Publication status: Published - 01.07.2021
