Crowdsourcing Swiss Dialect Transcriptions for Assessing Factors in Writing Variations

Publication: Contributions to collected works; Papers in conference proceedings; Research; Peer-reviewed

Authors

  • Simon Clematide
  • Karina Frick
  • Noëmi Aeppli
  • Jean-Philippe Goldmann
In this paper, we systematically analyze writing variations of Swiss German in two existing corpora with standard German glosses, a corpus of 10,000 short text messages and a corpus of transcribed oral history recordings (90,000 tokens). We show that neither resource is sufficient for assessing factors in writing variations of users and describe a data collection project involving a citizen science community for solving this problem. Laymen will independently and redundantly transcribe 1,200 short samples (15-20 seconds) of audio material in Swiss German according to their own best practice.
Original language: English
Title: Proceedings of the 13th Conference on Natural Language Processing (KONVENS): Bochum, Germany, September 19–21, 2016
Editors: Stefanie Dipper, Friedrich Neubarth, Heike Zinsmeister
Number of pages: 6
Place of publication: Bochum
Publisher: Ruhr-Universität Bochum
Publication date: 01.09.2016
Pages: 62-67
Publication status: Published - 01.09.2016
Externally published: Yes
Event: 13th Conference on Natural Language Processing (KONVENS) - Linguistics Department / Ruhr-Universität Bochum, Bochum, Germany
Duration: 19.09.2016 to 21.09.2016
https://www.linguistics.rub.de/konvens16/
