Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which consists of markedly short documents. Our results indicate that the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.
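The abstract does not spell out how the embedding-based classifier works, so the following is only a minimal sketch of one common way to classify short texts into predefined topics with word embeddings. It is not the authors' method. It assumes each topic is described by a few seed words, each document is represented by the mean of its word vectors, and the topic with the highest cosine similarity wins. The toy vectors, the topic names, and the helpers `mean_vector`, `cosine`, and `classify` are all hypothetical and exist only for illustration.

```python
import numpy as np

# Toy 3-dimensional word vectors, invented for illustration only.
# A real setup would load pretrained embeddings instead.
EMBEDDINGS = {
    "car":    np.array([0.90, 0.10, 0.00]),
    "engine": np.array([0.80, 0.20, 0.10]),
    "tyre":   np.array([0.85, 0.05, 0.10]),
    "flight": np.array([0.10, 0.90, 0.10]),
    "hotel":  np.array([0.00, 0.80, 0.20]),
    "beach":  np.array([0.10, 0.85, 0.15]),
    "loan":   np.array([0.10, 0.10, 0.90]),
    "bank":   np.array([0.00, 0.20, 0.90]),
    "credit": np.array([0.10, 0.00, 0.95]),
}

# Hypothetical predefined topics, each described by a few seed words.
TOPIC_SEEDS = {
    "automotive": ["car", "engine"],
    "travel": ["flight", "hotel"],
    "finance": ["loan", "bank"],
}


def mean_vector(words):
    """Average the vectors of all known words; return None if none are known."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        return None
    return np.mean(vectors, axis=0)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# One vector per topic: the mean of its seed-word vectors.
TOPIC_VECTORS = {topic: mean_vector(seeds) for topic, seeds in TOPIC_SEEDS.items()}


def classify(document):
    """Return (best_topic, scores), or (None, {}) if no word is in the vocabulary."""
    doc_vec = mean_vector(document.lower().split())
    if doc_vec is None:
        return None, {}
    scores = {topic: cosine(doc_vec, vec) for topic, vec in TOPIC_VECTORS.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    for text in ["cheap tyre", "beach hotel", "credit", "something else"]:
        topic, _ = classify(text)
        print(f"{text!r} -> {topic}")
```

Very short documents often contain only one or two in-vocabulary words, which is why the sketch returns `None` instead of guessing when no word is known. Combining such similarity scores with the output of a second model, such as seeded LDA, is one way an ensemble could be built, but the details of that step are not given here.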
Original language: English
Title of host publication: Human Practice. Digital Ecologies. Our Future : 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband
Editors: Thomas Ludwig, Volkmar Pipek
Number of pages: 15
Place of Publication: Siegen
Publisher: Universitätsverlag Siegen
Publication date: 2019
Pages: 453-467
ISBN (electronic): 978-3-96182-063-4
Publication status: Published - 2019
Event: 14. Internationale Tagung Wirtschaftsinformatik - WI 2019: Human Practice. Digital Ecologies. Our Future. - Universität Siegen, Institut für Wirtschaftsinformatik, Siegen, Germany
Duration: 24.02.2019 to 27.02.2019
Conference number: 14
https://wi2019.de/
https://wi2019.de/call-for-papers/
