Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Authors

Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which consists of markedly short documents. Our results indicate that the two methods may complement one another well, leading to remarkable sensitivity and precision scores for ensemble learners trained thereupon.
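The abstract does not spell out the algorithm, so the following is only a rough sketch of one common way to classify short texts into predefined topics with word embeddings, not the authors' actual method. It assumes each topic is described by a few seed words, represents a document and each topic as the mean of its word vectors, and picks the topic with the highest cosine similarity. The vectors in `TOY_VECTORS`, the topic names in `TOPIC_SEEDS`, and the helper functions are all hypothetical and invented for illustration; a real setup would load pretrained embeddings instead.

```python
import numpy as np

# Hypothetical 3-dimensional word vectors, made up for this illustration.
TOY_VECTORS = {
    "car": np.array([0.9, 0.1, 0.0]),
    "engine": np.array([0.8, 0.2, 0.1]),
    "tire": np.array([0.85, 0.15, 0.05]),
    "flight": np.array([0.1, 0.9, 0.0]),
    "hotel": np.array([0.0, 0.8, 0.2]),
    "beach": np.array([0.05, 0.85, 0.3]),
}

# Hypothetical predefined topics, each described by a few seed words.
TOPIC_SEEDS = {
    "automotive": ["car", "engine"],
    "travel": ["flight", "hotel"],
}


def mean_vector(words, vectors):
    """Average the vectors of all known words; return None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    return np.mean(known, axis=0)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def classify(document, topic_seeds, vectors):
    """Return the topic whose seed centroid is most similar to the document."""
    doc_vec = mean_vector(document.lower().split(), vectors)
    if doc_vec is None:
        return None  # no word of the document has an embedding
    scores = {}
    for topic, seeds in topic_seeds.items():
        topic_vec = mean_vector(seeds, vectors)
        if topic_vec is not None:
            scores[topic] = cosine(doc_vec, topic_vec)
    return max(scores, key=scores.get) if scores else None


label_a = classify("cheap tire deals", TOPIC_SEEDS, TOY_VECTORS)
label_b = classify("beach hotel", TOPIC_SEEDS, TOY_VECTORS)
label_c = classify("xyz", TOPIC_SEEDS, TOY_VECTORS)
```

No labeled training data is needed in this sketch, which is what makes such an approach unsupervised. Very short documents can contain no known word at all, which is why `classify` returns `None` in that case rather than guessing.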
Original language: English
Title of host publication: Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband
Editors: Thomas Ludwig, Volkmar Pipek
Number of pages: 15
Place of publication: Siegen
Publisher: Universitätsverlag Siegen
Publication date: 2019
Pages: 453-467
ISBN (electronic): 978-3-96182-063-4
Publication status: Published - 2019
Event: 14. Internationale Tagung Wirtschaftsinformatik - WI 2019: Human Practice. Digital Ecologies. Our Future. - Universität Siegen, Institut für Wirtschaftsinformatik, Siegen, Germany
Duration: 24.02.2019 - 27.02.2019
Conference number: 14
https://wi2019.de/
https://wi2019.de/call-for-papers/