Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Publication: Contributions to collected works > Articles in conference proceedings > Research > Peer-reviewed

Authors

Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which consists of markedly short documents. Our results indicate that the two methods may complement one another well: ensemble learners built on both achieve remarkable sensitivity and precision scores.
Original language: English
Title: Human Practice. Digital Ecologies. Our Future : 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband
Editors: Thomas Ludwig, Volkmar Pipek
Number of pages: 15
Place of publication: Siegen
Publisher: Universitätsverlag Siegen
Publication date: 2019
Pages: 453-467
ISBN (electronic): 978-3-96182-063-4
Publication status: Published - 2019
Event: 14. Internationale Tagung Wirtschaftsinformatik - WI 2019: Human Practice. Digital Ecologies. Our Future. - Universität Siegen, Institut für Wirtschaftsinformatik, Siegen, Germany
Duration: 24.02.2019 to 27.02.2019
Conference number: 14
https://wi2019.de/
https://wi2019.de/call-for-papers/
