Coresets for Archetypal Analysis

Sebastian Mair; Ulf Brefeld

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

Professorship for Information Systems, in particular Machine Learning

Archetypal analysis represents instances as linear mixtures of prototypes (the archetypes) that lie on the boundary of the convex hull of the data. Archetypes are thus often more interpretable than factors computed by other matrix factorization techniques. However, this interpretability comes at a high computational cost due to additional convexity-preserving constraints. In this paper, we propose efficient coresets for archetypal analysis. Theoretical guarantees are derived by showing that the quantization error of k-means upper-bounds the error of archetypal analysis; a provable absolute-coreset can be computed in only two passes over the data. Empirically, we show that the coresets lead to improved performance on several data sets.
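The abstract does not spell out the sampling scheme, so the following is only a minimal sketch of what a two-pass coreset construction could look like. It assumes that points are sampled with probability proportional to their squared distance from the data mean, as is common for k-means-style coresets, and that each sampled point is reweighted by the inverse of its sampling probability. The function name `two_pass_coreset` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def two_pass_coreset(points, m, seed=0):
    """Illustrative two-pass weighted sampling (assumed scheme, not the paper's code).

    points: list of equal-length numeric tuples
    m:      number of samples to draw (with replacement)
    Returns (sampled_points, weights).
    """
    n, d = len(points), len(points[0])

    # Pass 1: compute the mean of the data.
    mean = [sum(p[j] for p in points) / n for j in range(d)]

    # Pass 2: squared Euclidean distance of every point to the mean.
    dist2 = [sum((p[j] - mean[j]) ** 2 for j in range(d)) for p in points]
    total = sum(dist2)

    # Assumption: sampling probability proportional to squared distance.
    # If all points coincide, fall back to uniform sampling.
    q = [v / total for v in dist2] if total > 0 else [1.0 / n] * n

    rng = random.Random(seed)
    idx = rng.choices(range(n), weights=q, k=m)

    # Inverse-probability weights keep weighted sums unbiased estimates.
    weights = [1.0 / (m * q[i]) for i in idx]
    return [points[i] for i in idx], weights


if __name__ == "__main__":
    data = [(0.0, 0.0), (2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
    sample, w = two_pass_coreset(data, m=4)
    print(sample, w)
```

Under this assumed scheme, points far from the mean, which are the ones more likely to lie near the boundary of the convex hull, are sampled more often, while a point that coincides with the mean is never drawn.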

Original language	English
Title of host publication	32rd Conference on Neural Information Processing Systems (NeurIPS 2019) : Vancouver, Canada, 8-14 December 2019
Editors	Hanna Wallach, Hugo Larochelle
Number of pages	9
Volume	10
Place of Publication	Red Hook
Publisher	Curran Associates
Publication date	2020
Pages	7215-7223
ISBN (print)	978-1-71380-793-3
Publication status	Published - 2020
Event	33rd Conference on Neural Information Processing Systems - NeurIPS 2019, Vancouver Convention Center, Vancouver, Canada. Duration: 08.12.2019 → 14.12.2019. Conference number: 33. https://nips.cc/Conferences/2019

Bibliographical note

Correct numbering of the conference: 33rd Conference on Neural Information Processing Systems.
Copyright © 2019 by individual authors and Neural Information Processing Systems Foundation Inc. Printed with permission by Curran Associates, Inc. (2020)

Research areas

Business informatics

Other publications by the same author(s)

Interactive sequential generative models for team sports

Fassmeyer, D., Cordes, M. & Brefeld, U., 02.2025, In: Machine Learning. 114, 2, 15 p., 38.

Research output: Journal contributions › Journal articles › Research › peer-review

Joint Item Response Models for Manual and Automatic Scores on Open-Ended Test Items

Bengs, D., Brefeld, U., Kroehne, U. & Zehner, F., 01.09.2025, In: Psychometrika. 90, 4, p. 1346-1367 22 p.

Research output: Journal contributions › Journal articles › Research › peer-review

Machine Learning and Data Mining for Sports Analytics: 11th International Workshop, MLSA 2024, Vilnius, Lithuania, September 9, 2024, Revised Selected Papers

Brefeld, U. (Editor), Davis, J. (Editor), Van Haaren, J. (Editor) & Zimmermann, A. (Editor), 2025, Cham: Springer Verlag. 119 p. (Communications in Computer and Information Science; vol. 2460)

Research output: Books and anthologies › Conference proceedings › Research

Masked autoencoder for multiagent trajectories

Rudolph, Y. & Brefeld, U., 02.2025, In: Machine Learning. 114, 2, 18 p., 44.

Research output: Journal contributions › Journal articles › Research › peer-review

Self-improvement for Computerized Adaptive Testing

Rudolph, Y., Neubauer, K. & Brefeld, U., 2026, Machine Learning and Knowledge Discovery in Databases - Research Track: European Conference, ECML PKDD 2025, Porto, Portugal, September 15–19, 2025, Proceedings. Ribeiro, R. P., Jorge, A. M., Soares, C., Gama, J., Pfahringer, B., Japkowicz, N., Larrañaga, P. & Abreu, P. H. (eds.). Cham: Springer International Publishing, Vol. 2. p. 70-86 17 p. (Lecture Notes in Computer Science; vol. 16014 LNCS).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review