Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Standard

Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. / Lommel, Lasse; Riebeling, Meike; Funk, Burkhardt et al.
Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. ed. / Thomas Ludwig; Volkmar Pipek. Siegen: Universitätsverlag Siegen, 2019. p. 453-467.

Harvard

Lommel, L, Riebeling, M, Funk, B & Junginger, C 2019, Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. in T Ludwig & V Pipek (eds), Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Universitätsverlag Siegen, Siegen, pp. 453-467, 14. Internationale Tagung Wirtschaftsinformatik - WI 2019, Siegen, Germany, 24.02.19. https://doi.org/10.25819/ubsi/1016

APA

Lommel, L., Riebeling, M., Funk, B., & Junginger, C. (2019). Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. In T. Ludwig, & V. Pipek (Eds.), Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband (pp. 453-467). Universitätsverlag Siegen. https://doi.org/10.25819/ubsi/1016

Vancouver

Lommel L, Riebeling M, Funk B, Junginger C. Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. In Ludwig T, Pipek V, editors, Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Siegen: Universitätsverlag Siegen. 2019. p. 453-467. doi: 10.25819/ubsi/1016

Bibtex

@inbook{5413e328ea194031b724e97ab07c0d4d,
title = "Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics",
abstract = "Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which is comprised of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.",
keywords = "Business informatics, topic modeling, word embeddings, LDA, seeded LDA",
author = "Lasse Lommel and Meike Riebeling and Burkhardt Funk and Christian Junginger",
year = "2019",
doi = "10.25819/ubsi/1016",
language = "English",
pages = "453--467",
editor = "Thomas Ludwig and Volkmar Pipek",
booktitle = "Human Practice. Digital Ecologies. Our Future",
publisher = "Universit{\"a}tsverlag Siegen",
address = "Germany",
note = "14. Internationale Tagung Wirtschaftsinformatik - WI 2019 ; Conference date: 24-02-2019 Through 27-02-2019",
url = "https://wi2019.de/, https://wi2019.de/call-for-papers/",

}

RIS

TY - CHAP

T1 - Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

AU - Lommel, Lasse

AU - Riebeling, Meike

AU - Funk, Burkhardt

AU - Junginger, Christian

N1 - Conference code: 14

PY - 2019

Y1 - 2019

AB - Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which is comprised of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.

KW - Business informatics

KW - topic modeling

KW - word embeddings

KW - LDA

KW - seeded LDA

UR - https://wi2019.de/tagungsband/

UR - https://wi2019.de/wp-content/uploads/Tagungsband_WI2019_reduziert.pdf

UR - https://www.universi.uni-siegen.de/katalog/einzelpublikationen/897618.html

U2 - 10.25819/ubsi/1016

DO - 10.25819/ubsi/1016

M3 - Article in conference proceedings

SP - 453

EP - 467

BT - Human Practice. Digital Ecologies. Our Future

A2 - Ludwig, Thomas

A2 - Pipek, Volkmar

PB - Universitätsverlag Siegen

CY - Siegen

T2 - 14. Internationale Tagung Wirtschaftsinformatik - WI 2019

Y2 - 24 February 2019 through 27 February 2019

ER -
