Exploring the Use of the Pronoun I in German Academic Texts with Machine Learning
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Standard
Informatik 2020 - Back to the future: 50. Jahrestagung der Gesellschaft für Informatik vom 28. September - 2. Oktober 2020, virtual. ed. / Ralf H. Reussner; Anne Koziolek; Robert Heinrich. Bonn: Gesellschaft für Informatik e.V., 2020. p. 1327-1333 (Lecture Notes in Informatics (LNI) – Proceedings; Vol. P307).
RIS
TY - CHAP
T1 - Exploring the Use of the Pronoun I in German Academic Texts with Machine Learning
AU - Andresen, Melanie
AU - Knorr, Dagmar
N1 - Conference code: 50
PY - 2020
Y1 - 2020
AB - The use of the pronoun ich (‘I’) in academic language is a source of constant debate and a frequent cause of insecurity for students. We explore manually annotated instances of I from a German learner corpus. Using machine learning techniques, we investigate to what extent it is possible to automatically distinguish between different types of I usage (author I vs. narrator I). We additionally inspect which context words are good indicators of one type or the other. The results show that automatic classification is not straightforward: it is not perfect, but would greatly facilitate manual annotation. The distinctive words are in line with previous research and indicate that the author I is a more homogeneous class.
KW - Language Studies
KW - Corpus linguistics
KW - annotation
KW - Academic language
KW - German
KW - machine learning
KW - classification
UR - http://www.scopus.com/inward/record.url?scp=85127357898&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/74ed2ea8-4157-36bd-9677-24ac22c76c5e/
U2 - 10.18420/inf2020_124
DO - 10.18420/inf2020_124
M3 - Article in conference proceedings
T3 - Lecture Notes in Informatics (LNI) – Proceedings
SP - 1327
EP - 1333
BT - Informatik 2020 - Back to the future
A2 - Reussner, Ralf H.
A2 - Koziolek, Anne
A2 - Heinrich, Robert
PB - Gesellschaft für Informatik e.V.
CY - Bonn
T2 - 50th Annual Conference of the German Informatics Society - INFORMATIK 2020
Y2 - 28 September 2020 through 2 October 2020
ER -