Knowledge-Enhanced Language Models Are Not Bias-Proof: Situated Knowledge and Epistemic Injustice in AI

Publication: Contributions to collected editions / Articles in conference proceedings / Research / Peer-reviewed

The factual inaccuracies ("hallucinations") of large language models have recently inspired more research on knowledge-enhanced language modeling approaches. These are often assumed to enhance the overall trustworthiness and objectivity of language models. Meanwhile, the issue of bias is usually only mentioned as a limitation of statistical representations. This dissociation of knowledge enhancement and bias is in line with previous research on AI engineers' assumptions about knowledge, which indicates that knowledge is commonly understood as objective and value-neutral by this community. We argue that claims and practices by actors in the field still reflect this underlying conception of knowledge. We contrast this assumption with literature from social and, in particular, feminist epistemology, which argues that the idea of a universal disembodied knower is blind to the reality of knowledge practices and seriously challenges claims of "objective" or "neutral" knowledge. Knowledge enhancement techniques commonly use Wikidata and Wikipedia as their sources for knowledge, due to their large scale, public accessibility, and assumed trustworthiness. In this work, they serve as a case study for the influence of the social setting and the identity of knowers on epistemic processes. Indeed, the communities behind Wikidata and Wikipedia are known to be male-dominated, and many instances of hostile behavior have been reported in the past decade. In effect, the contents of these knowledge bases are highly biased. It is therefore doubtful that these knowledge bases would contribute to bias reduction. In fact, our empirical evaluations of RoBERTa, KEPLER, and CoLAKE demonstrate that knowledge enhancement may not live up to the hopes of increased objectivity. In our study, the average probability for stereotypical associations was preserved on two out of three metrics, and performance-related gender gaps on knowledge-driven tasks were also preserved.
We build on these results and critical literature to argue that the label of "knowledge" and the commonly held beliefs about it can obscure the harm that is still done to marginalized groups. Knowledge enhancement is at risk of perpetuating epistemic injustice, and AI engineers' understanding of knowledge as objective per se conceals this injustice. Finally, to get closer to trustworthy language models, we need to rethink knowledge in AI and aim for an agenda of diversification and scrutiny from outgroup members.

Original language: English
Title: 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2024
Number of pages: 13
Publisher: Association for Computing Machinery, Inc
Publication date: 03.06.2024
Pages: 1433-1445
ISBN (print): 9798400704505
ISBN (electronic): 979-8-4007-0450-5
Publication status: Published - 03.06.2024
Event: ACM Conference on Fairness, Accountability, and Transparency - FAccT 2024 - Rio de Janeiro, Brazil
Duration: 03.06.2024 to 06.06.2024
https://facctconference.org/2024/

Bibliographic note

Publisher Copyright:
© 2024 Owner/Author.
