Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Research output: Contributions to collected editions/works; Article in conference proceedings; Research; peer-review

Standard

Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. / Lommel, Lasse; Riebeling, Meike; Funk, Burkhardt et al.
Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. ed. / Thomas Ludwig; Volkmar Pipek. Siegen: Universitätsverlag Siegen, 2019. p. 453-467.

Harvard

Lommel, L, Riebeling, M, Funk, B & Junginger, C 2019, Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. in T Ludwig & V Pipek (eds), Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Universitätsverlag Siegen, Siegen, pp. 453-467, 14. Internationale Tagung Wirtschaftsinformatik - WI 2019, Siegen, Germany, 24.02.19. https://doi.org/10.25819/ubsi/1016

APA

Lommel, L., Riebeling, M., Funk, B., & Junginger, C. (2019). Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. In T. Ludwig, & V. Pipek (Eds.), Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband (pp. 453-467). Universitätsverlag Siegen. https://doi.org/10.25819/ubsi/1016

Vancouver

Lommel L, Riebeling M, Funk B, Junginger C. Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. In Ludwig T, Pipek V, editors, Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Siegen: Universitätsverlag Siegen. 2019. p. 453-467. doi: 10.25819/ubsi/1016

Bibtex

@inbook{5413e328ea194031b724e97ab07c0d4d,
title = "Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics",
abstract = "Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which is comprised of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.",
keywords = "Business informatics, topic modeling, word embeddings, LDA, seeded LDA",
author = "Lasse Lommel and Meike Riebeling and Burkhardt Funk and Christian Junginger",
year = "2019",
doi = "10.25819/ubsi/1016",
language = "English",
pages = "453--467",
editor = "Thomas Ludwig and Volkmar Pipek",
booktitle = "Human Practice. Digital Ecologies. Our Future",
publisher = "Universit{\"a}tsverlag Siegen",
address = "Germany",
note = "14. Internationale Tagung Wirtschaftsinformatik - WI 2019 ; Conference date: 24-02-2019 Through 27-02-2019",
url = "https://wi2019.de/, https://wi2019.de/call-for-papers/",

}

RIS

TY - CHAP

T1 - Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

AU - Lommel, Lasse

AU - Riebeling, Meike

AU - Funk, Burkhardt

AU - Junginger, Christian

N1 - Conference code: 14

PY - 2019

Y1 - 2019

N2 - Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which is comprised of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.

KW - Business informatics

KW - topic modeling

KW - word embeddings

KW - LDA

KW - seeded LDA

UR - https://wi2019.de/tagungsband/

UR - https://wi2019.de/wp-content/uploads/Tagungsband_WI2019_reduziert.pdf

UR - https://www.universi.uni-siegen.de/katalog/einzelpublikationen/897618.html

U2 - 10.25819/ubsi/1016

DO - 10.25819/ubsi/1016

M3 - Article in conference proceedings

SP - 453

EP - 467

BT - Human Practice. Digital Ecologies. Our Future

A2 - Ludwig, Thomas

A2 - Pipek, Volkmar

PB - Universitätsverlag Siegen

CY - Siegen

T2 - 14. Internationale Tagung Wirtschaftsinformatik - WI 2019

Y2 - 24 February 2019 through 27 February 2019

ER -
