Modern Baselines for SPARQL Semantic Parsing

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

In this work, we focus on the task of generating SPARQL queries from natural language questions, which can then be executed on Knowledge Graphs (KGs). We assume that gold entities and relations have been provided, and the remaining task is to arrange them in the right order, along with SPARQL vocabulary and input tokens, to produce the correct SPARQL query. Pre-trained Language Models (PLMs) have not been explored in depth on this task so far, so we experiment with BART, T5 and PGNs (Pointer Generator Networks) with BERT embeddings, looking for new baselines in the PLM era for this task, on the DBpedia and Wikidata KGs. We show that T5 requires special input tokenisation, but produces state-of-the-art performance on the LC-QuAD 1.0 and LC-QuAD 2.0 datasets, and outperforms task-specific models from previous works. Moreover, the methods enable semantic parsing for questions where a part of the input needs to be copied to the output query, thus enabling a new paradigm in KG semantic parsing.
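The task setup described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' actual input format: the `[SEP]` separator, the function names `build_model_input` and `uses_only_gold_terms`, and the example question are hypothetical. The sketch only shows the shape of the problem: the model receives the question together with gold entity and relation identifiers, and must emit a query that places those identifiers in the right positions among SPARQL keywords.

```python
from typing import Sequence

SEP = " [SEP] "  # hypothetical separator, not taken from the paper


def build_model_input(question: str,
                      entities: Sequence[str],
                      relations: Sequence[str]) -> str:
    """Concatenate the question with its gold entity and relation IDs."""
    return SEP.join([question.strip(), " ".join(entities), " ".join(relations)])


def uses_only_gold_terms(query: str,
                         entities: Sequence[str],
                         relations: Sequence[str]) -> bool:
    """Return True if every prefixed KG term in the query was given as gold input."""
    allowed = set(entities) | set(relations)
    kg_terms = [tok for tok in query.split() if tok.startswith(("wd:", "wdt:"))]
    return all(tok in allowed for tok in kg_terms)


# Illustrative Wikidata example: Q42 is Douglas Adams, P50 is "author".
question = "What did Douglas Adams write?"
entities = ["wd:Q42"]
relations = ["wdt:P50"]

model_input = build_model_input(question, entities, relations)

# A plausible target: the entity must land in the object position.
target_query = "SELECT ?work WHERE { ?work wdt:P50 wd:Q42 }"
```

In this framing the difficulty lies in ordering and structure rather than in entity linking: swapping the subject and object positions in `target_query` would use the same gold terms but return different results.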

Original language: English
Title of host publication: SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
Editors: Enrique Amigo, Pablo Castells, Julio Gonzalo
Number of pages: 6
Publisher: Association for Computing Machinery, Inc
Publication date: 07.07.2022
Pages: 2260-2265
ISBN (electronic): 978-1-4503-8732-3
DOIs
Publication status: Published - 07.07.2022
Externally published: Yes
Event: 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR 2022 - Online + Círculo de Bellas Artes (Circle of Beaux Arts), Madrid, Spain
Duration: 11.07.2022 - 15.07.2022
Conference number: 45
https://sigir.org/sigir2022/

Bibliographical note

Publisher Copyright:
© 2022 ACM.
