Benchmarking question answering systems
Publication: Contributions to journals › Journal articles › Research › peer-reviewed
Standard
in: Semantic Web, Vol. 10, No. 2, 2019, pp. 293-304.
RIS
TY - JOUR
T1 - Benchmarking question answering systems
AU - Usbeck, Ricardo
AU - Röder, Michael
AU - Hoffmann, Michael
AU - Conrads, Felix
AU - Huthmann, Jonathan
AU - Ngonga Ngomo, Axel-Cyrille
AU - Demmler, Christian
AU - Unger, Christina
N1 - The authors gratefully acknowledge financial support from the German Federal Ministry of Education and Research within Eurostars, a joint programme of EUREKA and the European Community, under the projects E!9367 DIESEL and E!9725 QAMEL, as well as the European Union's H2020 research and innovation action HOBBIT (GA 688227). We thank the QANARY team for inspiring discussions. Furthermore, we want to thank Jin-Dong Kim for his thoughts on the novel QA format. We also want to acknowledge that this project has been supported by the BMVI projects LIMBO (project no. 19F2029C) and OPAL (project no. 19F20284) as well as by the German Federal Ministry of Education and Research (BMBF) within 'KMU-innovativ: Forschung für die zivile Sicherheit', in particular the project SOLIDE (no. 13N14456). Publisher Copyright: © 2019 - IOS Press and the authors. All rights reserved.
PY - 2019
Y1 - 2019
N2 - The necessity of making the Semantic Web more accessible for lay users, alongside the uptake of interactive systems and smart assistants for the Web, has spawned a new generation of RDF-based question answering systems. However, fair evaluation of these systems remains a challenge due to the different types of answers that they provide. Hence, repeating currently published experiments or even benchmarking on the same datasets remains a complex and time-consuming task. We present a novel online benchmarking platform for question answering (QA) that relies on the FAIR principles to support the fine-grained evaluation of question answering systems. We detail how the platform addresses the fair benchmarking of question answering systems through the rewriting of URIs and URLs. In addition, we implement different evaluation metrics, measures, datasets and pre-implemented systems as well as methods to work with novel formats for interactive and non-interactive benchmarking of question answering systems. Our analysis of current frameworks shows that most are tailored towards particular datasets and challenges but do not provide generic models. In addition, while most frameworks perform well in the annotation of entities and properties, the generation of SPARQL queries from annotated text remains a challenge.
AB - The necessity of making the Semantic Web more accessible for lay users, alongside the uptake of interactive systems and smart assistants for the Web, has spawned a new generation of RDF-based question answering systems. However, fair evaluation of these systems remains a challenge due to the different types of answers that they provide. Hence, repeating currently published experiments or even benchmarking on the same datasets remains a complex and time-consuming task. We present a novel online benchmarking platform for question answering (QA) that relies on the FAIR principles to support the fine-grained evaluation of question answering systems. We detail how the platform addresses the fair benchmarking of question answering systems through the rewriting of URIs and URLs. In addition, we implement different evaluation metrics, measures, datasets and pre-implemented systems as well as methods to work with novel formats for interactive and non-interactive benchmarking of question answering systems. Our analysis of current frameworks shows that most are tailored towards particular datasets and challenges but do not provide generic models. In addition, while most frameworks perform well in the annotation of entities and properties, the generation of SPARQL queries from annotated text remains a challenge.
KW - Benchmarking
KW - Factoid question answering
KW - Repeatable open research
KW - Informatics
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=85060906490&partnerID=8YFLogxK
U2 - 10.3233/SW-180312
DO - 10.3233/SW-180312
M3 - Journal articles
AN - SCOPUS:85060906490
VL - 10
SP - 293
EP - 304
JO - Semantic Web
JF - Semantic Web
SN - 1570-0844
IS - 2
ER -