QALD-9-ES: A Spanish Dataset for Question Answering Systems

Publication: Contributions to collected editions/works, Article in conference proceedings, Research, peer-reviewed

Authors

Knowledge Graph Question Answering (KGQA) systems enable access to semantic information for any user who can compose a question in natural language. KGQA systems are now a core component of many industrial applications, including chatbots and conversational search applications. Although people around the world speak many different languages, the languages covered by KGQA systems and their resources are mainly limited to English. To deploy KGQA systems worldwide, we need to expand the current KGQA resources to languages other than English. Given the recent popularity of large-scale language models, we believe that providing quality resources is key to the development of future pipelines. These resources include the datasets used to train and test KGQA systems. Among the few multilingual KGQA datasets available, only one covers Spanish, namely QALD-9. We reviewed the Spanish translations in the QALD-9 dataset and confirmed several issues that may affect the quality of KGQA systems. We therefore created new Spanish translations for this dataset and reviewed them manually with the help of native speakers. The result is a set of newly created, high-quality translations for QALD-9; we call this extension QALD-9-ES. We merged these translations into the QALD-9-plus dataset, which provides trustworthy native translations for QALD-9 in nine languages, with the aim of creating one complete source of high-quality translations. We compared the new translations with the original QALD-9 ones using language-agnostic quantitative text analysis measures and found improvements for the new translations. Finally, we compared both sets of translations in the GERBIL QA benchmark framework with a KGQA system that supports Spanish. Although the question-answering scores improved only slightly, we believe that improving the quality of the existing translations will result in better KGQA systems and therefore increase the applicability of KGQA to the Spanish language domain.
Original language: English
Title: Knowledge Graphs: Semantics, Machine Learning, and Languages: Proceedings of the 19th International Conference on Semantic Systems, 20-22 September 2023, Leipzig, Germany
Editors: Maribel Acosta, Silvio Peroni, Sahar Vahdati, Anna Lisa Gentile, Tassilo Pellegrini, Jan-Christoph Kalo
Number of pages: 15
Place of publication: Amsterdam
Publisher: IOS Press BV
Publication date: 11 September 2023
Pages: 38-52
ISBN (print): 978-1-64368-424-6
ISBN (electronic): 978-1-64368-425-3
Publication status: Published - 11 September 2023
Externally published: Yes
Event: 19th International Conference on Semantic Systems - HYPERION Hotel Leipzig, Leipzig, Germany
Duration: 20 September 2023 to 22 September 2023
Conference number: 19
https://2023-eu.semantics.cc
