Benchmarking question answering systems

Research output: Journal contributions › Journal articles › Research › peer-review

Authors

  • Ricardo Usbeck
  • Michael Röder
  • Michael Hoffmann
  • Felix Conrads
  • Jonathan Huthmann
  • Axel Cyrille Ngonga-Ngomo
  • Christian Demmler
  • Christina Unger

The necessity of making the Semantic Web more accessible for lay users, alongside the uptake of interactive systems and smart assistants for the Web, has spawned a new generation of RDF-based question answering systems. However, fair evaluation of these systems remains a challenge due to the different types of answers that they provide. Hence, repeating currently published experiments or even benchmarking on the same datasets remains a complex and time-consuming task. We present a novel online benchmarking platform for question answering (QA) that relies on the FAIR principles to support the fine-grained evaluation of question answering systems. We detail how the platform addresses the fair benchmarking of question answering systems through the rewriting of URIs and URLs. In addition, we implement different evaluation metrics, measures, datasets and pre-implemented systems as well as methods to work with novel formats for interactive and non-interactive benchmarking of question answering systems. Our analysis of current frameworks shows that most of them are tailored towards particular datasets and challenges but do not provide generic models. In addition, while most frameworks perform well in the annotation of entities and properties, the generation of SPARQL queries from annotated text remains a challenge.

Original language: English
Journal: Semantic Web
Volume: 10
Issue number: 2
Pages (from-to): 293-304
Number of pages: 12
ISSN: 1570-0844
Publication status: Published - 2019
Externally published: Yes

Bibliographical note

The authors gratefully acknowledge financial support from the German Federal Ministry of Education and Research within Eurostars, a joint programme of EUREKA and the European Community under the project E!9367 DIESEL and E!9725 QAMEL as well as the European Union's H2020 research and innovation action HOBBIT (GA 688227). We thank the QANARY team for inspiring discussions. Furthermore, we want to thank Jin-Dong Kim for his thoughts on the novel QA format. We also want to acknowledge that this project has been supported by the BMVI projects LIMBO (project no. 19F2029C) and OPAL (project no. 19F20284) as well as by the German Federal Ministry of Education and Research (BMBF) within 'KMU-innovativ: Forschung für die zivile Sicherheit' in particular 'Forschung für die zivile Sicherheit' and the project SOLIDE (no. 13N14456).

Publisher Copyright:
© 2019 - IOS Press and the authors. All rights reserved.
