Exploring the Use of the Pronoun I in German Academic Texts with Machine Learning

Publication: Contributions to collected editions > Articles in conference proceedings > Research > peer-reviewed


The use of the pronoun ich (‘I’) in academic language is a source of constant debate and a frequent cause of insecurity for students. We explore manually annotated instances of I from a German learner corpus. Using machine learning techniques, we investigate to what extent different types of I usage (author I vs. narrator I) can be distinguished automatically. We additionally inspect which context words are good indicators of one type or the other. Automatic classification is not straightforward and its results are not perfect, but they would greatly facilitate manual annotation. The distinctive words are in line with previous research and indicate that the author I is a more homogeneous class.
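The two steps named in the abstract, classifying each occurrence of ich from its context and then inspecting which context words point to one class, can be illustrated with a minimal sketch. It is not the method used in the paper: the abstract does not name a classifier or feature set, so a small naive Bayes model over a three-token context window is assumed here. The sentences in `TRAIN` are invented for illustration and do not come from the learner corpus, and all function names are hypothetical.

```python
import math
from collections import Counter

# Invented toy sentences (NOT from the learner corpus); labels are illustrative.
TRAIN = [
    ("im folgenden werde ich die methode erläutern", "author"),
    ("in dieser arbeit untersuche ich die ergebnisse", "author"),
    ("abschließend fasse ich die arbeit zusammen", "author"),
    ("als kind habe ich gerne bücher gelesen", "narrator"),
    ("gestern habe ich meine freunde getroffen", "narrator"),
    ("in der schule hatte ich gerne sport", "narrator"),
]


def context_words(sentence, window=3):
    """Return the tokens within `window` positions of each 'ich'."""
    tokens = sentence.lower().split()
    words = []
    for i, token in enumerate(tokens):
        if token == "ich":
            words += tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
    return words


def train(examples):
    """Count context words per class; return (counts, priors, vocabulary)."""
    counts = {"author": Counter(), "narrator": Counter()}
    priors = Counter()
    for sentence, label in examples:
        counts[label].update(context_words(sentence))
        priors[label] += 1
    vocab = set(counts["author"]) | set(counts["narrator"])
    return counts, priors, vocab


def log_prob(word, label, counts, vocab):
    """Log P(word | label) with add-one smoothing."""
    total = sum(counts[label].values())
    return math.log((counts[label][word] + 1) / (total + len(vocab)))


def classify(sentence, model):
    """Pick the class with the higher naive Bayes score."""
    counts, priors, vocab = model
    n_examples = sum(priors.values())
    scores = {}
    for label in counts:
        score = math.log(priors[label] / n_examples)
        for word in context_words(sentence):
            if word in vocab:  # words unseen in training are ignored
                score += log_prob(word, label, counts, vocab)
        scores[label] = score
    return max(scores, key=scores.get)


def indicators(model, label, other, k=3):
    """Rank context words by log-odds in favour of `label` over `other`."""
    counts, _, vocab = model
    ratio = {
        w: log_prob(w, label, counts, vocab) - log_prob(w, other, counts, vocab)
        for w in vocab
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(ratio, key=lambda w: (-ratio[w], w))[:k]


model = train(TRAIN)
print(classify("in diesem kapitel untersuche ich die methode", model))
print(indicators(model, "author", "narrator"))
```

The log-odds ranking in `indicators` is one simple way to surface distinctive context words. With real annotated data, a held-out test set would be needed before drawing any conclusion about how well the two classes separate.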
Translated title (German): Erforschung der Verwendung des Pronomen Ich in deutschen akademischen Texten mit maschinellem Lernen
Title of host publication: Informatik 2020 - Back to the future: 50. Jahrestagung der Gesellschaft für Informatik vom 28. September - 2. Oktober 2020, virtual
Editors: Ralf Reussner, Anne Koziolek, Robert Heinrich
Number of pages: 7
Publisher: Gesellschaft für Informatik e.V.
ISBN (electronic): 978-3-88579-701-2
Publication status: Published - 2020
Event: 50. Jahrestagung der Gesellschaft für Informatik - GI 2020: Informatik 2020 - Back to the future - Karlsruher Institut für Technologie, Karlsruhe, Germany
Duration: 28.09.2020 - 02.10.2020
Conference number: 50

Related projects