Crowdsourcing Swiss Dialect Transcriptions for Assessing Factors in Writing Variations
Publication: Contributions to collected editions › Articles in conference proceedings › Research › peer-reviewed
Authors
In this paper, we systematically analyze writing variations of Swiss German in two existing corpora with Standard German glosses: a corpus of 10,000 short text messages and a corpus of transcribed oral history recordings (90,000 tokens). We show that neither resource is sufficient for assessing factors in users' writing variations, and we describe a data collection project involving a citizen science community to solve this problem. Laymen will independently and redundantly transcribe 1,200 short samples (15-20 seconds) of Swiss German audio material according to their own best practice.
Original language | English |
---|---|
Title | Proceedings of the 13th Conference on Natural Language Processing (KONVENS): Bochum, Germany, September 19–21, 2016 |
Editors | Stefanie Dipper, Friedrich Neubarth, Heike Zinsmeister |
Number of pages | 6 |
Place of publication | Bochum |
Publisher | Ruhr-Universität Bochum |
Publication date | 01.09.2016 |
Pages | 62-67 |
Publication status | Published - 01.09.2016 |
Published externally | Yes |
Event | 13th Conference on Natural Language Processing (KONVENS) - Linguistics Department / Ruhr-Universität Bochum, Bochum, Germany. Duration: 19.09.2016 → 21.09.2016. https://www.linguistics.rub.de/konvens16/ |
- Computer science