Measuring Gender Bias in German Language Generation

Angelie Kraft; Hans Peter Zorn; Pascal Fecht; Judith Simon; Chris Biemann; Ricardo Usbeck

doi:10.18420/inf2022_108

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Most existing methods to measure social bias in natural language generation are specified for English language models. In this work, we developed a German regard classifier based on a newly crowd-sourced dataset. Our model meets the test set accuracy of the original English version. With the classifier, we measured binary gender bias in two large language models. The results indicate a positive bias toward female subjects for a German version of GPT-2 and similar tendencies for GPT-3. Yet, upon qualitative analysis, we found that positive regard partly corresponds to sexist stereotypes. Our findings suggest that the regard classifier should not be used as a single measure but, instead, combined with more qualitative analyses.

Original language	English
Title of host publication	INFORMATIK 2022 - Informatik in den Naturwissenschaften
Editors	Daniel Demmler, Daniel Krupka, Hannes Federrath
Number of pages	18
Place of Publication	Bonn
Publisher	Gesellschaft für Informatik e.V.
Publication date	2022
Pages	1257-1274
ISBN (electronic)	978-3-88579-720-3
DOIs	https://doi.org/10.18420/inf2022_108
Publication status	Published - 2022
Externally published	Yes
Event	52. Jahrestagung der Gesellschaft für Informatik - INFORMATIK 2022: Informatik in den Naturwissenschaften
Location	UHH building at Edmund-Siemers-Allee 1, Hamburg, Germany
Duration	26.09.2022 → 30.09.2022
Conference number	52
Website	https://informatik2022.gi.de/

Bibliographical note

This work presents and extends the results of Angelie Kraft's Master's thesis at Universität Hamburg and inovex GmbH. For the additional research and experimentation, we acknowledge financial support from the Federal Ministry for Economic Affairs and Energy of Germany in the project CoyPu (project number 01MK21007[G]).

Publisher Copyright:
© 2022 Gesellschaft für Informatik (GI). All rights reserved.

Research areas

gender bias, German, GPT-2, GPT-3, natural language generation, regard, stereotypes
Business informatics
Informatics

Sustainable Development Goals

SDG 5 - Gender Equality
