Measuring Gender Bias in German Language Generation
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Most existing methods for measuring social bias in natural language generation are designed for English language models. In this work, we developed a German regard classifier based on a newly crowd-sourced dataset. Our model matches the test set accuracy of the original English version. With the classifier, we measured binary gender bias in two large language models. The results indicate a positive bias toward female subjects for a German version of GPT-2 and similar tendencies for GPT-3. Yet, upon qualitative analysis, we found that positive regard partly corresponds to sexist stereotypes. Our findings suggest that the regard classifier should not be used as the sole measure but should instead be combined with more qualitative analyses.
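The comparison described in the abstract can be illustrated with a minimal sketch. It assumes a regard classifier has already assigned one label per generated text and that the labels are `negative`, `neutral`, and `positive`; the label set, the -1/0/+1 scoring, and all function names are hypothetical choices for illustration and are not taken from the paper. The input lists are toy placeholders, not results.

```python
from collections import Counter

# Assumed label set; the abstract does not list the classifier's classes.
LABELS = ("negative", "neutral", "positive")
# Illustrative numeric scale for averaging; not the paper's metric.
SCORES = {"negative": -1, "neutral": 0, "positive": 1}


def regard_distribution(labels):
    """Return the share of each regard label in a list of predicted labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {label: counts[label] / len(labels) for label in LABELS}


def mean_regard(labels):
    """Average the predicted labels on the -1/0/+1 scale defined above."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(SCORES[label] for label in labels) / len(labels)


def regard_gap(female_labels, male_labels):
    """Positive values mean higher average regard for female-subject texts."""
    return mean_regard(female_labels) - mean_regard(male_labels)


# Toy placeholder predictions for texts generated from female- and
# male-subject prompts; these are NOT results from the paper.
female = ["positive", "positive", "neutral", "negative"]
male = ["positive", "neutral", "negative", "negative"]

print(regard_distribution(female))
print(regard_distribution(male))
print(regard_gap(female, male))
```

A single gap value like this hides what the positively rated texts actually say, which is why the abstract recommends pairing the classifier with qualitative analysis.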
Original language | English |
---|---|
Title of host publication | INFORMATIK 2022 - Informatik in den Naturwissenschaften |
Editors | Daniel Demmler, Daniel Krupka, Hannes Federrath |
Number of pages | 18 |
Place of Publication | Bonn |
Publisher | Gesellschaft für Informatik e.V. |
Publication date | 2022 |
Pages | 1257-1274 |
ISBN (electronic) | 978-3-88579-720-3 |
Publication status | Published - 2022 |
Externally published | Yes |
Event | 52. Jahrestagung der Gesellschaft für Informatik - INFORMATIK 2022: Informatik in den Naturwissenschaften; UHH Gebäude in der Edmund-Siemers-Allee 1, Hamburg, Germany; Duration: 26.09.2022 → 30.09.2022; Conference number: 52; https://informatik2022.gi.de/ |
Bibliographical note
This work presents and extends the results of Angelie Kraft's Master's thesis at Universität Hamburg and inovex GmbH. For any additional research and experimentation, we acknowledge the financial support from the Federal Ministry for Economic Affairs and Energy of Germany in the project CoyPu (project number 01MK21007[G]).
Publisher Copyright:
© 2022 Gesellschaft für Informatik (GI). All rights reserved.
- gender bias, German, GPT-2, GPT-3, natural language generation, regard, stereotypes
- Business informatics
- Informatics