QALD-10 — The 10th Challenge on Question Answering over Linked Data: Shifting from DBpedia to Wikidata as a KG for KGQA

Research output: Journal contributions > Journal articles > Research > peer-review

Standard

QALD-10 — The 10th Challenge on Question Answering over Linked Data: Shifting from DBpedia to Wikidata as a KG for KGQA. / Usbeck, Ricardo; Yan, Xi; Perevalov, Aleksandr et al.
In: Semantic Web, Vol. 15, No. 6, 2023, p. 2193-2207.

Harvard

Usbeck, R, Yan, X, Perevalov, A, Jiang, L, Schulz, J, Kraft, A, Möller, C, Huang, J, Reineke, J, Ngomo, A-CN, Saleem, M & Both, A 2023, 'QALD-10 — The 10th Challenge on Question Answering over Linked Data: Shifting from DBpedia to Wikidata as a KG for KGQA', Semantic Web, vol. 15, no. 6, pp. 2193-2207. https://doi.org/10.3233/SW-233471

APA

Usbeck, R., Yan, X., Perevalov, A., Jiang, L., Schulz, J., Kraft, A., Möller, C., Huang, J., Reineke, J., Ngomo, A.-C. N., Saleem, M., & Both, A. (2023). QALD-10 — The 10th Challenge on Question Answering over Linked Data: Shifting from DBpedia to Wikidata as a KG for KGQA. Semantic Web, 15(6), 2193-2207. https://doi.org/10.3233/SW-233471

Bibtex

@article{717c33cb84e64001a853e5366c53693f,
title = "QALD-10 — The 10th Challenge on Question Answering over Linked Data: Shifting from DBpedia to Wikidata as a KG for KGQA",
abstract = "Knowledge Graph Question Answering (KGQA) has gained attention from both industry and academia over the past decade. Researchers proposed a substantial amount of benchmarking datasets with different properties, pushing the development in this field forward. Many of these benchmarks depend on Freebase, DBpedia, or Wikidata. However, KGQA benchmarks that depend on Freebase and DBpedia are gradually less studied and used, because Freebase is defunct and DBpedia lacks the structural validity of Wikidata. Therefore, research is gravitating toward Wikidata-based benchmarks. That is, new KGQA benchmarks are created on the basis of Wikidata and existing ones are migrated. We present a new, multilingual, complex KGQA benchmarking dataset as the 10th part of the Question Answering over Linked Data (QALD) benchmark series. This corpus formerly depended on DBpedia. Since QALD serves as a base for many machine-generated benchmarks, we increased the size and adjusted the benchmark to Wikidata and its ranking mechanism of properties. These measures foster novel KGQA developments by more demanding benchmarks. Creating a benchmark from scratch or migrating it from DBpedia to Wikidata is non-trivial due to the complexity of the Wikidata knowledge graph, mapping issues between different languages, and the ranking mechanism of properties using qualifiers. We present our creation strategy and the challenges we faced that will assist other researchers in their future work. Our case study, in the form of a conference challenge, is accompanied by an in-depth analysis of the created benchmark.",
keywords = "Informatics",
author = "Ricardo Usbeck and Xi Yan and Aleksandr Perevalov and Longquan Jiang and Julius Schulz and Angelie Kraft and Cedric M{\"o}ller and Junbo Huang and Jan Reineke and Ngomo, {Axel-Cyrille Ngonga} and Muhammad Saleem and Andreas Both",
year = "2023",
doi = "10.3233/SW-233471",
language = "English",
volume = "15",
pages = "2193--2207",
journal = "Semantic Web",
issn = "1570-0844",
publisher = "SAGE Publications Inc.",
number = "6",

}

RIS

TY - JOUR

T1 - QALD-10 — The 10th Challenge on Question Answering over Linked Data

T2 - Shifting from DBpedia to Wikidata as a KG for KGQA

AU - Usbeck, Ricardo

AU - Yan, Xi

AU - Perevalov, Aleksandr

AU - Jiang, Longquan

AU - Schulz, Julius

AU - Kraft, Angelie

AU - Möller, Cedric

AU - Huang, Junbo

AU - Reineke, Jan

AU - Ngomo, Axel-Cyrille Ngonga

AU - Saleem, Muhammad

AU - Both, Andreas

PY - 2023

Y1 - 2023

N2 - Knowledge Graph Question Answering (KGQA) has gained attention from both industry and academia over the past decade. Researchers proposed a substantial amount of benchmarking datasets with different properties, pushing the development in this field forward. Many of these benchmarks depend on Freebase, DBpedia, or Wikidata. However, KGQA benchmarks that depend on Freebase and DBpedia are gradually less studied and used, because Freebase is defunct and DBpedia lacks the structural validity of Wikidata. Therefore, research is gravitating toward Wikidata-based benchmarks. That is, new KGQA benchmarks are created on the basis of Wikidata and existing ones are migrated. We present a new, multilingual, complex KGQA benchmarking dataset as the 10th part of the Question Answering over Linked Data (QALD) benchmark series. This corpus formerly depended on DBpedia. Since QALD serves as a base for many machine-generated benchmarks, we increased the size and adjusted the benchmark to Wikidata and its ranking mechanism of properties. These measures foster novel KGQA developments by more demanding benchmarks. Creating a benchmark from scratch or migrating it from DBpedia to Wikidata is non-trivial due to the complexity of the Wikidata knowledge graph, mapping issues between different languages, and the ranking mechanism of properties using qualifiers. We present our creation strategy and the challenges we faced that will assist other researchers in their future work. Our case study, in the form of a conference challenge, is accompanied by an in-depth analysis of the created benchmark.

KW - Informatics

U2 - 10.3233/SW-233471

DO - 10.3233/SW-233471

M3 - Journal articles

VL - 15

SP - 2193

EP - 2207

JO - Semantic Web

JF - Semantic Web

SN - 1570-0844

IS - 6

ER -
