Modern Baselines for SPARQL Semantic Parsing
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Standard
SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. ed. / Enrique Amigo; Pablo Castells; Julio Gonzalo. Association for Computing Machinery, Inc, 2022. p. 2260-2265 (SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval).
RIS
TY - CHAP
T1 - Modern Baselines for SPARQL Semantic Parsing
AU - Banerjee, Debayan
AU - Nair, Pranav Ajit
AU - Kaur, Jivat Neet
AU - Usbeck, Ricardo
AU - Biemann, Chris
N1 - Conference code: 45
PY - 2022/7/6
Y1 - 2022/7/6
N2 - In this work, we focus on the task of generating SPARQL queries from natural language questions, which can then be executed on Knowledge Graphs (KGs). We assume that gold entities and relations have been provided, and the remaining task is to arrange them in the right order, along with SPARQL vocabulary and input tokens, to produce the correct SPARQL query. Pre-trained Language Models (PLMs) have not been explored in depth on this task so far, so we experiment with BART, T5 and PGNs (Pointer Generator Networks) with BERT embeddings, looking for new baselines in the PLM era for this task, on the DBpedia and Wikidata KGs. We show that T5 requires special input tokenisation, but produces state-of-the-art performance on the LC-QuAD 1.0 and LC-QuAD 2.0 datasets, and outperforms task-specific models from previous works. Moreover, the methods enable semantic parsing for questions where a part of the input needs to be copied to the output query, thus enabling a new paradigm in KG semantic parsing.
AB - In this work, we focus on the task of generating SPARQL queries from natural language questions, which can then be executed on Knowledge Graphs (KGs). We assume that gold entities and relations have been provided, and the remaining task is to arrange them in the right order, along with SPARQL vocabulary and input tokens, to produce the correct SPARQL query. Pre-trained Language Models (PLMs) have not been explored in depth on this task so far, so we experiment with BART, T5 and PGNs (Pointer Generator Networks) with BERT embeddings, looking for new baselines in the PLM era for this task, on the DBpedia and Wikidata KGs. We show that T5 requires special input tokenisation, but produces state-of-the-art performance on the LC-QuAD 1.0 and LC-QuAD 2.0 datasets, and outperforms task-specific models from previous works. Moreover, the methods enable semantic parsing for questions where a part of the input needs to be copied to the output query, thus enabling a new paradigm in KG semantic parsing.
KW - knowledge graph
KW - question answering
KW - semantic parsing
KW - sparql
KW - Business informatics
KW - Informatics
UR - http://www.scopus.com/inward/record.url?scp=85135063193&partnerID=8YFLogxK
U2 - 10.1145/3477495.3531841
DO - 10.1145/3477495.3531841
M3 - Article in conference proceedings
AN - SCOPUS:85135063193
T3 - SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 2260
EP - 2265
BT - SIGIR 2022 - Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
A2 - Amigo, Enrique
A2 - Castells, Pablo
A2 - Gonzalo, Julio
PB - Association for Computing Machinery, Inc
T2 - 45th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR 2022
Y2 - 11 July 2022 through 15 July 2022
ER -
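
The task setup described in the abstract — taking a natural-language question together with gold entities and relations, and arranging them with SPARQL vocabulary into the correct query — can be illustrated with a minimal sketch. The separator tokens (`[ENT]`, `[REL]`), the `linearize` helper, and the example question are illustrative assumptions, not the paper's actual input format or data.

```python
# Hypothetical illustration of the semantic-parsing setup from the abstract:
# a seq2seq model (e.g. T5 or BART) receives the question plus the gold
# entities and relations, and must emit the SPARQL query. The linearization
# scheme below is an assumption for demonstration only.

def linearize(question: str, entities: list[str], relations: list[str]) -> str:
    """Join the question, gold entities, and gold relations into a single
    input string for a sequence-to-sequence model."""
    ent_part = " ".join(entities)
    rel_part = " ".join(relations)
    return f"{question} [ENT] {ent_part} [REL] {rel_part}"

# Example DBpedia-style question with its gold annotations (hypothetical).
question = "Who is the mayor of Berlin?"
entities = ["dbr:Berlin"]
relations = ["dbo:mayor"]

model_input = linearize(question, entities, relations)
print(model_input)
# The model's target arranges those items with SPARQL vocabulary:
target_sparql = "SELECT ?x WHERE { dbr:Berlin dbo:mayor ?x }"
print(target_sparql)
```

The PGN variant discussed in the paper additionally allows tokens from this input (e.g. a literal appearing in the question) to be copied verbatim into the output query, which is what enables parsing questions whose answers require input-copying.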