Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. / Lommel, Lasse; Riebeling, Meike; Funk, Burkhardt et al.
Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. ed. / Thomas Ludwig; Volkmar Pipek. Siegen: Universitätsverlag Siegen, 2019. p. 453-467.

Harvard

Lommel, L, Riebeling, M, Funk, B & Junginger, C 2019, Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. in T Ludwig & V Pipek (eds), Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Universitätsverlag Siegen, Siegen, pp. 453-467, 14. Internationale Tagung Wirtschaftsinformatik - WI 2019, Siegen, Germany, 24.02.19. https://doi.org/10.25819/ubsi/1016

APA

Lommel, L., Riebeling, M., Funk, B., & Junginger, C. (2019). Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. In T. Ludwig, & V. Pipek (Eds.), Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband (pp. 453-467). Universitätsverlag Siegen. https://doi.org/10.25819/ubsi/1016

Vancouver

Lommel L, Riebeling M, Funk B, Junginger C. Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. In Ludwig T, Pipek V, editors, Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Siegen: Universitätsverlag Siegen. 2019. p. 453-467. doi: 10.25819/ubsi/1016

Bibtex

@inbook{5413e328ea194031b724e97ab07c0d4d,
title = "Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics",
abstract = "Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which is comprised of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.",
keywords = "Business informatics, topic modeling, word embeddings, LDA, seeded LDA",
author = "Lasse Lommel and Meike Riebeling and Burkhardt Funk and Christian Junginger",
year = "2019",
doi = "10.25819/ubsi/1016",
language = "English",
pages = "453--467",
editor = "Thomas Ludwig and Volkmar Pipek",
booktitle = "Human Practice. Digital Ecologies. Our Future",
publisher = "Universit{\"a}tsverlag Siegen",
address = "Germany",
note = "14. Internationale Tagung Wirtschaftsinformatik - WI 2019 ; Conference date: 24-02-2019 Through 27-02-2019",
url = "https://wi2019.de/, https://wi2019.de/call-for-papers/",

}

RIS

TY - CHAP

T1 - Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

AU - Lommel, Lasse

AU - Riebeling, Meike

AU - Funk, Burkhardt

AU - Junginger, Christian

N1 - Conference code: 14

PY - 2019

Y1 - 2019

AB - Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which is comprised of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.

KW - Business informatics

KW - topic modeling

KW - word embeddings

KW - LDA

KW - seeded LDA

UR - https://wi2019.de/tagungsband/

UR - https://wi2019.de/wp-content/uploads/Tagungsband_WI2019_reduziert.pdf

UR - https://www.universi.uni-siegen.de/katalog/einzelpublikationen/897618.html

U2 - 10.25819/ubsi/1016

DO - 10.25819/ubsi/1016

M3 - Article in conference proceedings

SP - 453

EP - 467

BT - Human Practice. Digital Ecologies. Our Future

A2 - Ludwig, Thomas

A2 - Pipek, Volkmar

PB - Universitätsverlag Siegen

CY - Siegen

T2 - 14. Internationale Tagung Wirtschaftsinformatik - WI 2019

Y2 - 24 February 2019 through 27 February 2019

ER -
