QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Authors
The ability to offer the same experience to different user groups (i.e., accessibility) is one of the most important characteristics of Web-based systems. The same is true for Knowledge Graph Question Answering (KGQA) systems, which provide access to Semantic Web data via a natural language interface. While following our research agenda on the multilingual aspect of accessibility of KGQA systems, we identified several ongoing challenges. One of them is the lack of multilingual KGQA benchmarks. In this work, we extend one of the most popular KGQA benchmarks, QALD-9, by introducing high-quality translations of its questions into 8 languages, provided by native speakers, and by transferring the SPARQL queries of QALD-9 from DBpedia to Wikidata, so that the usability and relevance of the dataset are strongly increased. Five of the languages (Armenian, Ukrainian, Lithuanian, Bashkir and Belarusian) have, to the best of our knowledge, never been considered in the KGQA research community before. The latter two are considered 'endangered' by UNESCO. We call the extended dataset QALD-9-plus and have made it available online (Figshare: https://doi.org/10.6084/m9.figshare.16864273; GitHub: https://github.com/Perevalov/qald-9-plus).
Original language | English |
---|---|
Title of host publication | Proceedings - 16th IEEE International Conference on Semantic Computing, ICSC 2022 |
Number of pages | 6 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Publication date | 2022 |
Pages | 229-234 |
ISBN (print) | 978-1-6654-3419-5 |
ISBN (electronic) | 978-1-6654-3418-8 |
DOIs | |
Publication status | Published - 2022 |
Externally published | Yes |
Event | 16th IEEE International Conference on Semantic Computing, ICSC 2022 - Virtual, Online, United States. Duration: 26.01.2022 → 28.01.2022. http://pa.icar.cnr.it/scsn22/ |
Bibliographical note
Publisher Copyright:
© 2022 IEEE.
- multilingual question answering, question answering dataset, question answering over knowledge graphs
- Informatics
- Business informatics