Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

Traditional unsupervised topic modeling approaches such as Latent Dirichlet Allocation (LDA) cannot classify documents into a predefined set of topics. Supervised methods, on the other hand, require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings that classifies documents into predefined topics. We evaluate the predictive performance of this approach and compare it to seeded LDA, using a real-world dataset from online advertising that consists of markedly short documents. Our results indicate that the two methods may complement one another well: ensemble learners built on them achieve remarkable sensitivity and precision scores.
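The abstract does not spell out how the embedding-based classifier works, so the following is only a minimal sketch of one common way to assign a short document to a predefined topic with word embeddings: average the word vectors of the document, average the vectors of a few seed words per topic, and pick the topic with the highest cosine similarity. This is not presented as the authors' method. The toy vectors, the seed words, and the names `TOY_VECTORS`, `TOPIC_SEEDS`, `mean_vector`, `cosine`, and `classify` are all hypothetical and chosen for illustration.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for this illustration.
# A real system would load pretrained embeddings instead.
TOY_VECTORS = {
    "car": [0.9, 0.1, 0.0],
    "engine": [0.8, 0.2, 0.1],
    "tyre": [0.85, 0.05, 0.1],
    "flight": [0.1, 0.9, 0.1],
    "hotel": [0.2, 0.8, 0.2],
    "beach": [0.1, 0.85, 0.3],
    "loan": [0.1, 0.1, 0.9],
    "bank": [0.2, 0.1, 0.8],
    "credit": [0.1, 0.2, 0.85],
}

# Hypothetical predefined topics, each described by a few seed words.
TOPIC_SEEDS = {
    "automotive": ["car", "engine"],
    "travel": ["flight", "hotel"],
    "finance": ["loan", "bank"],
}


def mean_vector(words, vectors):
    """Average the vectors of all known words; return None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(document, topic_seeds, vectors):
    """Return the topic whose seed centroid is most similar to the document."""
    doc_vec = mean_vector(document.lower().split(), vectors)
    if doc_vec is None:
        return None  # no word of the document has an embedding
    scores = {}
    for topic, seeds in topic_seeds.items():
        topic_vec = mean_vector(seeds, vectors)
        if topic_vec is not None:
            scores[topic] = cosine(doc_vec, topic_vec)
    return max(scores, key=scores.get) if scores else None


if __name__ == "__main__":
    for text in ["cheap tyre for your car", "beach hotel", "credit"]:
        print(text, "->", classify(text, TOPIC_SEEDS, TOY_VECTORS))
```

Because no labeled training documents are involved, a sketch like this stays unsupervised in the sense the abstract describes; very short documents simply contribute fewer word vectors to the average, and a document with no known words is left unclassified.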
Original language: English
Title of host publication: Human Practice. Digital Ecologies. Our Future : 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband
Editors: Thomas Ludwig, Volkmar Pipek
Number of pages: 15
Place of Publication: Siegen
Publisher: Universitätsverlag Siegen
Publication date: 2019
Pages: 453-467
ISBN (electronic): 978-3-96182-063-4
Publication status: Published - 2019
Event: 14. Internationale Tagung Wirtschaftsinformatik - WI 2019: Human Practice. Digital Ecologies. Our Future. - Universität Siegen, Institut für Wirtschaftsinformatik, Siegen, Germany
Duration: 24.02.2019 to 27.02.2019
Conference number: 14
https://wi2019.de/
https://wi2019.de/call-for-papers/
