Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

Publication: Contributions to collected editions › Articles in conference proceedings › Research › Peer-reviewed

Standard

Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. / Lommel, Lasse; Riebeling, Meike; Funk, Burkhardt et al.
Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Ed. / Thomas Ludwig; Volkmar Pipek. Siegen: Universitätsverlag Siegen, 2019. pp. 453-467.

Harvard

Lommel, L, Riebeling, M, Funk, B & Junginger, C 2019, Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. in T Ludwig & V Pipek (eds), Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Universitätsverlag Siegen, Siegen, pp. 453-467, 14. Internationale Tagung Wirtschaftsinformatik - WI 2019, Siegen, Germany, 24 February 2019. https://doi.org/10.25819/ubsi/1016

APA

Lommel, L., Riebeling, M., Funk, B., & Junginger, C. (2019). Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. In T. Ludwig, & V. Pipek (Eds.), Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband (pp. 453-467). Universitätsverlag Siegen. https://doi.org/10.25819/ubsi/1016

Vancouver

Lommel L, Riebeling M, Funk B, Junginger C. Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics. In Ludwig T, Pipek V, editors, Human Practice. Digital Ecologies. Our Future: 14. Internationale Tagung Wirtschaftsinformatik (WI 2019), Tagungsband. Siegen: Universitätsverlag Siegen. 2019. p. 453-467. doi: 10.25819/ubsi/1016

Bibtex

@inbook{5413e328ea194031b724e97ab07c0d4d,
title = "Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics",
abstract = "Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which is comprised of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.",
keywords = "Business informatics, topic modeling, word embeddings, LDA, seeded LDA",
author = "Lasse Lommel and Meike Riebeling and Burkhardt Funk and Christian Junginger",
year = "2019",
doi = "10.25819/ubsi/1016",
language = "English",
pages = "453--467",
editor = "Thomas Ludwig and Volkmar Pipek",
booktitle = "Human Practice. Digital Ecologies. Our Future",
publisher = "Universit{\"a}tsverlag Siegen",
address = "Germany",
note = "14. Internationale Tagung Wirtschaftsinformatik - WI 2019 ; Conference date: 24-02-2019 Through 27-02-2019",
url = "https://wi2019.de/, https://wi2019.de/call-for-papers/",

}

RIS

TY - CHAP

T1 - Topic Embeddings – A New Approach to Classify Very Short Documents Based on Predefined Topics

AU - Lommel, Lasse

AU - Riebeling, Meike

AU - Funk, Burkhardt

AU - Junginger, Christian

N1 - Conference code: 14

PY - 2019

Y1 - 2019

AB - Traditional unsupervised topic modeling approaches like Latent Dirichlet Allocation (LDA) lack the ability to classify documents into a predefined set of topics. On the other hand, supervised methods require significant amounts of labeled data to perform well on such tasks. We develop a new unsupervised method based on word embeddings to classify documents into predefined topics. We evaluate the predictive performance of this novel approach and compare it to seeded LDA. We use a real-world dataset from online advertising, which is comprised of markedly short documents. Our results indicate the two methods may complement one another well, leading to remarkable sensitivity and precision scores of ensemble learners trained thereupon.

KW - Business informatics

KW - topic modeling

KW - word embeddings

KW - LDA

KW - seeded LDA

UR - https://wi2019.de/tagungsband/

UR - https://wi2019.de/wp-content/uploads/Tagungsband_WI2019_reduziert.pdf

UR - https://www.universi.uni-siegen.de/katalog/einzelpublikationen/897618.html

U2 - 10.25819/ubsi/1016

DO - 10.25819/ubsi/1016

M3 - Article in conference proceedings

SP - 453

EP - 467

BT - Human Practice. Digital Ecologies. Our Future

A2 - Ludwig, Thomas

A2 - Pipek, Volkmar

PB - Universitätsverlag Siegen

CY - Siegen

T2 - 14. Internationale Tagung Wirtschaftsinformatik - WI 2019

Y2 - 24 February 2019 through 27 February 2019

ER -
