Crowdsourcing Swiss Dialect Transcriptions for Assessing Factors in Writing Variations
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Standard
Proceedings of the 13th Conference on Natural Language Processing (KONVENS): Bochum, Germany, September 19–21, 2016. ed. / Stefanie Dipper; Friedrich Neubarth; Heike Zinsmeister. Bochum: Ruhr-Universität Bochum, 2016. p. 62-67 (Bochumer Linguistische Arbeitsberichte; Vol. 16).
RIS
TY - CHAP
T1 - Crowdsourcing Swiss Dialect Transcriptions for Assessing Factors in Writing Variations
AU - Clematide, Simon
AU - Frick, Karina
AU - Aeppli, Noëmi
AU - Goldmann, Jean-Philippe
PY - 2016/9/1
Y1 - 2016/9/1
AB - In this paper, we systematically analyze writing variations of Swiss German in two existing corpora with standard German glosses, a corpus of 10,000 short text messages and a corpus of transcribed oral history recordings (90,000 tokens). We show that neither resource is sufficient for assessing factors in writing variations of users and describe a data collection project involving a citizen science community for solving this problem. Laymen will independently and redundantly transcribe 1,200 short samples (15-20 seconds) of audio material in Swiss German according to their own best practice.
KW - Informatics
UR - http://www.scopus.com/inward/record.url?scp=85182736066&partnerID=8YFLogxK
M3 - Article in conference proceedings
T3 - Bochumer Linguistische Arbeitsberichte
SP - 62
EP - 67
BT - Proceedings of the 13th Conference on Natural Language Processing (KONVENS)
A2 - Dipper, Stefanie
A2 - Neubarth, Friedrich
A2 - Zinsmeister, Heike
PB - Ruhr-Universität Bochum
CY - Bochum
T2 - 13th Conference on Natural Language Processing (KONVENS)
Y2 - 19 September 2016 through 21 September 2016
ER -