Coresets for Archetypal Analysis

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review


Archetypal analysis represents instances as linear mixtures of prototypes (the archetypes) that lie on the boundary of the convex hull of the data. Archetypes are thus often more interpretable than factors computed by other matrix factorization techniques. However, the interpretability comes with high computational cost due to additional convexity-preserving constraints. In this paper, we propose efficient coresets for archetypal analysis. Theoretical guarantees are derived by showing that the quantization errors of k-means upper bound those of archetypal analysis; the computation of a provable absolute-coreset can be performed in only two passes over the data. Empirically, we show that the coresets lead to improved performance on several data sets.
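As an illustration of what a two-pass coreset construction can look like, the following is a minimal Python sketch. It is not taken from the abstract: the sampling distribution (proportional to squared distance from the data mean), the function name `two_pass_coreset`, and the parameters `m` and `seed` are assumptions made for this example. The first pass computes the mean; the second computes distances and draws a weighted sample.

```python
import numpy as np


def two_pass_coreset(X, m, seed=0):
    """Draw a weighted sample of m rows of X using two passes over the data.

    Assumption for this sketch: points are sampled with probability
    proportional to their squared Euclidean distance to the data mean.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Pass 1: compute the mean of the data.
    mu = X.mean(axis=0)

    # Pass 2: squared distance of every point to the mean.
    d2 = ((X - mu) ** 2).sum(axis=1)
    total = d2.sum()

    # Sampling probabilities; fall back to uniform if all points coincide.
    q = d2 / total if total > 0 else np.full(n, 1.0 / n)

    # Sample m indices with replacement according to q.
    idx = rng.choice(n, size=m, replace=True, p=q)

    # Importance weights 1 / (m * q) so that weighted sums over the sample
    # estimate the corresponding sums over the full data set.
    w = 1.0 / (m * q[idx])
    return X[idx], w, idx
```

The returned points and weights would then be handed to a weighted variant of an archetypal analysis solver in place of the full data set; that solver is not shown here.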
Original language: English
Title of host publication: 32rd Conference on Neural Information Processing Systems (NeurIPS 2019) : Vancouver, Canada, 8-14 December 2019
Editors: Hanna Wallach, Hugo Larochelle
Number of pages: 9
Place of Publication: Red Hook
Publication date: 2020
ISBN (Print): 978-1-71380-793-3
Publication status: Published - 2020
Event: 33rd Conference on Neural Information Processing Systems - 2019 - Vancouver Convention Center, Vancouver, Canada
Duration: 08.12.2019 - 14.12.2019
Conference number: 33

Bibliographical note

Correct numbering of the conference: 33rd Conference on Neural Information Processing Systems.
Copyright © (2019) by individual authors and Neural Information Processing Systems Foundation Inc. Printed with permission by Curran Associates, Inc. (2020)