Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which consists of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.
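
The abstract does not spell out how word embeddings are turned into topic assignments, so the following is only a minimal sketch of one common way such a classifier might work, not the authors' method. It assumes each predefined topic is described by a few seed words, represents a document and each topic as the mean of its word vectors, and assigns the topic with the highest cosine similarity. The vectors in `EMBEDDINGS`, the topics in `TOPIC_SEEDS`, and the function names are all hypothetical and invented for illustration; a real setup would load pretrained embeddings instead.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load pretrained embeddings instead.
EMBEDDINGS = {
    "car": (0.9, 0.1, 0.0),
    "engine": (0.8, 0.2, 0.1),
    "tyre": (0.85, 0.05, 0.1),
    "flight": (0.1, 0.9, 0.1),
    "hotel": (0.2, 0.8, 0.2),
    "beach": (0.1, 0.85, 0.3),
    "cheap": (0.3, 0.3, 0.3),
}

# Hypothetical predefined topics, each described by a few seed words.
TOPIC_SEEDS = {
    "automotive": ["car", "engine"],
    "travel": ["flight", "hotel"],
}


def mean_vector(words):
    """Average the vectors of all known words; None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        return None
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(document):
    """Return the topic whose seed centroid is closest to the document vector."""
    doc_vec = mean_vector(document.lower().split())
    if doc_vec is None:
        return None  # no known words, so no assignment is possible
    scores = {
        topic: cosine(doc_vec, mean_vector(seeds))
        for topic, seeds in TOPIC_SEEDS.items()
    }
    return max(scores, key=scores.get)


print(classify("cheap tyre"))
print(classify("beach hotel"))
```

Because no labeled training data is involved, the only supervision in this sketch is the choice of seed words per topic. Very short documents, such as the advertising texts the abstract mentions, may contain no known words at all, which is why `classify` returns `None` in that case rather than guessing.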
Original language: English
Title of host publication: Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband
Editors: Thomas Ludwig, Volkmar Pipek
Number of pages: 15
Place of publication: Siegen
Publisher: Universitätsverlag Siegen
Publication date: 2019
Pages: 453-467
ISBN (electronic): 978-3-96182-063-4
Publication status: Published - 2019
Event: 14. Internationale Tagung Wirtschaftsinformatik - WI 2019: Human Practice. Digital Ecologies. Our Future. - Universität Siegen, Institut für Wirtschaftsinformatik, Siegen, Germany
Duration: 24.02.2019 - 27.02.2019
Conference number: 14
https://wi2019.de/
https://wi2019.de/call-for-papers/
