Automating SPARQL Query Translations between DBpedia and Wikidata

Research output: Contributions to collected editions/works › Article in conference proceedings › Research

Authors

Purpose:

This paper investigates whether state-of-the-art Large Language Models (LLMs) can automatically translate SPARQL queries between popular Knowledge Graph (KG) schemas. We focus on translations between the DBpedia and Wikidata KGs, and then extend the evaluation to the DBLP and OpenAlex KGs. The study addresses a notable gap in KG interoperability research by evaluating LLM performance on SPARQL-to-SPARQL translation.
Methodology:

Two benchmarks are assembled: the first aligns 100 DBpedia–Wikidata queries from the QALD-9-Plus dataset; the second contains 100 DBLP queries aligned to OpenAlex, testing generalizability beyond encyclopaedic KGs. Three open LLMs (Llama-3-8B, DeepSeek-R1-Distill-Llama-70B, and Mistral-Large-Instruct-2407) are selected based on their sizes and architectures and tested with zero-shot, few-shot, and two chain-of-thought prompting variants. Outputs are compared with gold-standard answers, and the resulting errors are systematically categorized.
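The comparison against gold-standard answers can be illustrated with a minimal sketch. It assumes that a translated query counts as correct when its result set exactly matches the gold answer set, ignoring order and duplicates; the abstract does not state the exact matching criterion, and the function names and sample URIs below are hypothetical.

```python
from __future__ import annotations

from typing import Iterable, Sequence


def normalize(answers: Iterable[str]) -> frozenset[str]:
    """Strip whitespace and drop empty strings; order and duplicates are ignored."""
    return frozenset(a.strip() for a in answers if a.strip())


def is_correct(predicted: Iterable[str], gold: Iterable[str]) -> bool:
    """Assumed criterion: exact match between predicted and gold answer sets."""
    return normalize(predicted) == normalize(gold)


def accuracy(pairs: Sequence[tuple[Iterable[str], Iterable[str]]]) -> float:
    """Fraction of (predicted, gold) pairs judged correct; 0.0 for an empty benchmark."""
    if not pairs:
        return 0.0
    hits = sum(is_correct(predicted, gold) for predicted, gold in pairs)
    return hits / len(pairs)


# Illustrative data only, not taken from the benchmarks.
SAMPLE_PAIRS = [
    (["http://dbpedia.org/resource/Berlin"], ["http://dbpedia.org/resource/Berlin "]),
    (["a", "b"], ["b", "a"]),   # same set, different order
    (["x"], ["y"]),             # wrong answer
    ([], ["z"]),                # translated query returned nothing
]
SAMPLE_ACCURACY = accuracy(SAMPLE_PAIRS)
```

A stricter or more lenient criterion (for example, partial credit via precision and recall) would only require replacing `is_correct`.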
Findings:

Performance varies markedly across models and prompting strategies, and translation from Wikidata to DBpedia works far better than translation from DBpedia to Wikidata. The largest model, Mistral-Large-Instruct-2407, achieved the highest accuracy, reaching 86% on the Wikidata → DBpedia task with a chain-of-thought approach. The DBLP → OpenAlex generalization task achieved similar results with a few-shot setup, underscoring the critical role of in-context examples.
Value:

This study demonstrates a viable and scalable pathway toward KG interoperability: LLMs, combined with structured prompting and explicit schema-mapping tables, translate queries across heterogeneous KGs. The method's strong performance on both general-purpose KGs and a specialized scholarly domain suggests that it could reduce the manual effort required for cross-KG data integration and analysis.
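As a rough illustration of structured prompting with an explicit schema-mapping table, the sketch below assembles a zero-shot or few-shot translation prompt. The prompt wording, the function names, and the two-row mapping (Wikidata `wdt:P50` to DBpedia `dbo:author`, `wdt:P19` to `dbo:birthPlace`) are assumptions chosen for illustration; they are not the prompts or tables used in the paper.

```python
from __future__ import annotations

from typing import Sequence


def format_mapping_table(mapping: dict[str, str]) -> str:
    """Render the schema mapping as one 'source -> target' row per line."""
    return "\n".join(f"{src} -> {tgt}" for src, tgt in sorted(mapping.items()))


def build_prompt(
    source_kg: str,
    target_kg: str,
    query: str,
    mapping: dict[str, str],
    examples: Sequence[tuple[str, str]] = (),
) -> str:
    """Build a translation prompt; an empty `examples` gives a zero-shot prompt."""
    parts = [
        f"Translate the following SPARQL query from {source_kg} to {target_kg}.",
        "Schema mapping:\n" + format_mapping_table(mapping),
    ]
    for src_query, tgt_query in examples:  # in-context (few-shot) examples
        parts.append(f"Example {source_kg} query:\n{src_query}")
        parts.append(f"Example {target_kg} query:\n{tgt_query}")
    parts.append(f"{source_kg} query:\n{query}")
    parts.append(f"{target_kg} query:")
    return "\n\n".join(parts)


# Hypothetical mapping and example pair, for illustration only.
MAPPING = {"wdt:P50": "dbo:author", "wdt:P19": "dbo:birthPlace"}
EXAMPLES = [
    (
        "SELECT ?author WHERE { ?book wdt:P50 ?author }",
        "SELECT ?author WHERE { ?book dbo:author ?author }",
    )
]
PROMPT = build_prompt(
    "Wikidata",
    "DBpedia",
    "SELECT ?place WHERE { ?person wdt:P19 ?place }",
    MAPPING,
    EXAMPLES,
)
```

The returned string would then be sent to whichever LLM is under test; that call is omitted here because it depends on the serving setup.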
Original language: English
Title of host publication: Linking Meaning: Semantic Technologies Shaping the Future of AI : Cover 74617 Proceedings of the 21st International Conference on Semantic Systems, 3-5 September 2025, Vienna, Austria
Editors: Blerina Spahiu, Sahar Vahdati, Angelo Salatino, Tassilo Pellegrini, Giray Havur
Number of pages: 18
Publisher: IOS Press BV
Publication date: 14.07.2025
Pages: 176-193
ISBN (electronic): 978-1-64368-616-5
Publication status: Published - 14.07.2025
