The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing

Publication: Contributions to collected editions › Articles in conference proceedings › Research › Peer-reviewed

Authors

In this work, we analyse the role of output vocabulary for text-to-text (T2T) models on the task of SPARQL semantic parsing. We perform experiments within the context of knowledge graph question answering (KGQA), where the task is to convert questions in natural language to the SPARQL query language. We observe that the query vocabulary is distinct from human vocabulary. Language models (LMs) are predominantly trained for human language tasks; hence, if the query vocabulary is replaced with a vocabulary more attuned to the LM tokenizer, the performance of models may improve. We carry out carefully selected vocabulary substitutions on the queries and find absolute gains in the range of 17% on the GrailQA dataset.
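The idea of a vocabulary substitution can be illustrated with a minimal sketch. The mapping below is a hypothetical one chosen purely for illustration (it is not the substitution scheme used in the paper), and the example query and its `:hasType` / `:Person` identifiers are made up. The sketch assumes query tokens are separated by whitespace and that none of the replacement words already occurs in the query, so that the substitution can be reversed after the model has produced its output.

```python
# Hypothetical substitution table: SPARQL token -> plain-English word.
# These pairs are illustrative assumptions, not the mapping used in the paper.
FORWARD = {
    "SELECT": "choose",
    "DISTINCT": "unique",
    "WHERE": "given",
    "{": "begin",
    "}": "end",
    ".": "then",
}
# Inverse table, used to restore the original SPARQL tokens.
BACKWARD = {word: token for token, word in FORWARD.items()}


def substitute(query: str, table: dict) -> str:
    """Replace each whitespace-separated token that appears in the table."""
    return " ".join(table.get(tok, tok) for tok in query.split())


# Made-up example query with whitespace-separated tokens.
query = "SELECT DISTINCT ?x WHERE { ?x :hasType :Person . }"

encoded = substitute(query, FORWARD)     # form a T2T model would be trained to emit
decoded = substitute(encoded, BACKWARD)  # restored to executable SPARQL

print(encoded)
print(decoded)
```

In such a setup the model would be trained on the encoded targets, and its predictions would be mapped back with the inverse table before being run against the knowledge graph.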
Original language: English
Title: Findings of the Association for Computational Linguistics: ACL 2023 : July 9-14, 2023
Editors: Anna Rogers, Jordan L. Boyd-Graber, Naoaki Okazaki
Number of pages: 10
Place of publication: Stroudsburg
Publisher: Association for Computational Linguistics (ACL)
Publication date: 01.07.2023
Pages: 12219-12228
ISBN (electronic): 978-1-959429-62-3
DOIs
Publication status: Published - 01.07.2023
Externally published: Yes
Event: 61st Annual Meeting of the Association for Computational Linguistics - Toronto, Canada
Duration: 09.07.2023 - 14.07.2023
Conference number: 61
https://2023.aclweb.org

Bibliographic note

Publisher Copyright:
© 2023 Association for Computational Linguistics.
