Knowledge-Enhanced Language Models Are Not Bias-Proof: Situated Knowledge and Epistemic Injustice in AI

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

Knowledge-Enhanced Language Models Are Not Bias-Proof: Situated Knowledge and Epistemic Injustice in AI. / Kraft, Angelie; Soulier, Eloïse.
2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024. Association for Computing Machinery, Inc, 2024. p. 1433-1445 (2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024).


Harvard

Kraft, A & Soulier, E 2024, Knowledge-Enhanced Language Models Are Not Bias-Proof: Situated Knowledge and Epistemic Injustice in AI. in 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024. 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024, Association for Computing Machinery, Inc, pp. 1433-1445, ACM Conference on Fairness, Accountability, and Transparency - FAccT 2024, Rio de Janeiro, Brazil, 03.06.24. https://doi.org/10.1145/3630106.3658981

APA

Kraft, A., & Soulier, E. (2024). Knowledge-Enhanced Language Models Are Not Bias-Proof: Situated Knowledge and Epistemic Injustice in AI. In 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024 (pp. 1433-1445). (2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024). Association for Computing Machinery, Inc. https://doi.org/10.1145/3630106.3658981

Vancouver

Kraft A, Soulier E. Knowledge-Enhanced Language Models Are Not Bias-Proof: Situated Knowledge and Epistemic Injustice in AI. In 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024. Association for Computing Machinery, Inc. 2024. p. 1433-1445. (2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024). doi: 10.1145/3630106.3658981

Bibtex

@inbook{c47f262e4a5147ff9eb2c0aef1bca5d3,
title = "Knowledge-Enhanced Language Models Are Not Bias-Proof: Situated Knowledge and Epistemic Injustice in AI",
abstract = "The factual inaccuracies ({"}hallucinations{"}) of large language models have recently inspired more research on knowledge-enhanced language modeling approaches. These are often assumed to enhance the overall trustworthiness and objectivity of language models. Meanwhile, the issue of bias is usually only mentioned as a limitation of statistical representations. This dissociation of knowledge-enhancement and bias is in line with previous research on AI engineers' assumptions about knowledge, which indicates that knowledge is commonly understood as objective and value-neutral by this community. We argue that claims and practices by actors of the field still reflect this underlying conception of knowledge. We contrast this assumption with literature from social and, in particular, feminist epistemology, which argues that the idea of a universal disembodied knower is blind to the reality of knowledge practices and seriously challenges claims of {"}objective{"} or {"}neutral{"} knowledge. Knowledge enhancement techniques commonly use Wikidata and Wikipedia as their sources for knowledge, due to their large scales, public accessibility, and assumed trustworthiness. In this work, they serve as a case study for the influence of the social setting and the identity of knowers on epistemic processes. Indeed, the communities behind Wikidata and Wikipedia are known to be male-dominated and many instances of hostile behavior have been reported in the past decade. In effect, the contents of these knowledge bases are highly biased. It is therefore doubtful that these knowledge bases would contribute to bias reduction. In fact, our empirical evaluations of RoBERTa, KEPLER, and CoLAKE demonstrate that knowledge enhancement may not live up to the hopes of increased objectivity. In our study, the average probability for stereotypical associations was preserved on two out of three metrics and performance-related gender gaps on knowledge-driven tasks were also preserved.
We build on these results and critical literature to argue that the label of {"}knowledge{"} and the commonly held beliefs about it can obscure the harm that is still done to marginalized groups. Knowledge enhancement is at risk of perpetuating epistemic injustice, and AI engineers' understanding of knowledge as objective per se conceals this injustice. Finally, to get closer to trustworthy language models, we need to rethink knowledge in AI and aim for an agenda of diversification and scrutiny from outgroup members.",
keywords = "bias, epistemology, fairness, feminism, knowledge enhancement, knowledge graphs, language models, natural language processing, representation, Informatics",
author = "Angelie Kraft and Elo{\"i}se Soulier",
note = "Publisher Copyright: {\textcopyright} 2024 Owner/Author.; ACM Conference on Fairness, Accountability, and Transparency - FAccT 2024, FAccT 2024 ; Conference date: 03-06-2024 Through 06-06-2024",
year = "2024",
month = jun,
day = "3",
doi = "10.1145/3630106.3658981",
language = "English",
isbn = "9798400704505",
series = "2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024",
publisher = "Association for Computing Machinery, Inc",
pages = "1433--1445",
booktitle = "2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024",
address = "United States",
url = "https://facctconference.org/2024/",
}

RIS

TY - CHAP

T1 - Knowledge-Enhanced Language Models Are Not Bias-Proof

T2 - ACM Conference on Fairness, Accountability, and Transparency - FAccT 2024

AU - Kraft, Angelie

AU - Soulier, Eloïse

N1 - Publisher Copyright: © 2024 Owner/Author.

PY - 2024/6/3

Y1 - 2024/6/3

N2 - The factual inaccuracies ("hallucinations") of large language models have recently inspired more research on knowledge-enhanced language modeling approaches. These are often assumed to enhance the overall trustworthiness and objectivity of language models. Meanwhile, the issue of bias is usually only mentioned as a limitation of statistical representations. This dissociation of knowledge-enhancement and bias is in line with previous research on AI engineers' assumptions about knowledge, which indicates that knowledge is commonly understood as objective and value-neutral by this community. We argue that claims and practices by actors of the field still reflect this underlying conception of knowledge. We contrast this assumption with literature from social and, in particular, feminist epistemology, which argues that the idea of a universal disembodied knower is blind to the reality of knowledge practices and seriously challenges claims of "objective" or "neutral" knowledge. Knowledge enhancement techniques commonly use Wikidata and Wikipedia as their sources for knowledge, due to their large scales, public accessibility, and assumed trustworthiness. In this work, they serve as a case study for the influence of the social setting and the identity of knowers on epistemic processes. Indeed, the communities behind Wikidata and Wikipedia are known to be male-dominated and many instances of hostile behavior have been reported in the past decade. In effect, the contents of these knowledge bases are highly biased. It is therefore doubtful that these knowledge bases would contribute to bias reduction. In fact, our empirical evaluations of RoBERTa, KEPLER, and CoLAKE demonstrate that knowledge enhancement may not live up to the hopes of increased objectivity. In our study, the average probability for stereotypical associations was preserved on two out of three metrics and performance-related gender gaps on knowledge-driven tasks were also preserved.
We build on these results and critical literature to argue that the label of "knowledge" and the commonly held beliefs about it can obscure the harm that is still done to marginalized groups. Knowledge enhancement is at risk of perpetuating epistemic injustice, and AI engineers' understanding of knowledge as objective per se conceals this injustice. Finally, to get closer to trustworthy language models, we need to rethink knowledge in AI and aim for an agenda of diversification and scrutiny from outgroup members.

AB - The factual inaccuracies ("hallucinations") of large language models have recently inspired more research on knowledge-enhanced language modeling approaches. These are often assumed to enhance the overall trustworthiness and objectivity of language models. Meanwhile, the issue of bias is usually only mentioned as a limitation of statistical representations. This dissociation of knowledge-enhancement and bias is in line with previous research on AI engineers' assumptions about knowledge, which indicates that knowledge is commonly understood as objective and value-neutral by this community. We argue that claims and practices by actors of the field still reflect this underlying conception of knowledge. We contrast this assumption with literature from social and, in particular, feminist epistemology, which argues that the idea of a universal disembodied knower is blind to the reality of knowledge practices and seriously challenges claims of "objective" or "neutral" knowledge. Knowledge enhancement techniques commonly use Wikidata and Wikipedia as their sources for knowledge, due to their large scales, public accessibility, and assumed trustworthiness. In this work, they serve as a case study for the influence of the social setting and the identity of knowers on epistemic processes. Indeed, the communities behind Wikidata and Wikipedia are known to be male-dominated and many instances of hostile behavior have been reported in the past decade. In effect, the contents of these knowledge bases are highly biased. It is therefore doubtful that these knowledge bases would contribute to bias reduction. In fact, our empirical evaluations of RoBERTa, KEPLER, and CoLAKE demonstrate that knowledge enhancement may not live up to the hopes of increased objectivity. In our study, the average probability for stereotypical associations was preserved on two out of three metrics and performance-related gender gaps on knowledge-driven tasks were also preserved.
We build on these results and critical literature to argue that the label of "knowledge" and the commonly held beliefs about it can obscure the harm that is still done to marginalized groups. Knowledge enhancement is at risk of perpetuating epistemic injustice, and AI engineers' understanding of knowledge as objective per se conceals this injustice. Finally, to get closer to trustworthy language models, we need to rethink knowledge in AI and aim for an agenda of diversification and scrutiny from outgroup members.

KW - bias

KW - epistemology

KW - fairness

KW - feminism

KW - knowledge enhancement

KW - knowledge graphs

KW - language models

KW - natural language processing

KW - representation

KW - Informatics

UR - http://www.scopus.com/inward/record.url?scp=85196640886&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/4139faa8-8124-30a9-9153-a28a00bcf95b/

U2 - 10.1145/3630106.3658981

DO - 10.1145/3630106.3658981

M3 - Article in conference proceedings

AN - SCOPUS:85196640886

SN - 9798400704505

T3 - 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024

SP - 1433

EP - 1445

BT - 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024

PB - Association for Computing Machinery, Inc

Y2 - 3 June 2024 through 6 June 2024

ER -
