HySQA: Hybrid Scholarly Question Answering

Research output: Contributions to collected editions/works, Article in conference proceedings, Research, peer-review

Authors

Purpose:
The heterogeneity of scholarly information in knowledge graphs (KGs) and unstructured textual sources poses challenges in building robust Scholarly Question Answering (SQA) systems. Existing datasets and models typically address a narrow spectrum, focusing exclusively on KGs or unstructured sources and limiting evaluation to simple factoid questions. This gap leaves current systems unable to answer complex, hybrid scholarly questions that require integrating evidence from multiple heterogeneous data sources.

Methodology:
We introduce HySQA (Hybrid Scholarly Question Answering), a large-scale benchmark dataset of hybrid questions over scholarly knowledge graphs (SKGs) and Wikipedia text. Its complex questions require traversing facts across structured and unstructured sources. We also develop a baseline model that adaptively decomposes each question into sub-questions, identifies the answer source for each sub-question, retrieves relevant information from SKGs and Wikipedia, and generates an answer using a hybrid augmented answer generation framework.
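The decompose, route, retrieve and generate flow described above can be illustrated with a toy sketch. This is not the authors' implementation: the in-memory stores, the regex-based decomposition rule, the `#1` placeholder convention and all function names are hypothetical stand-ins chosen only to make the control flow concrete.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical toy stores standing in for a scholarly KG and Wikipedia text.
TOY_KG: Dict[Tuple[str, str], str] = {("Paper X", "author"): "Author A"}
TOY_TEXT: Dict[str, str] = {"Author A": "Author A works at Institute B."}


@dataclass
class SubQuestion:
    source: str    # "kg" or "text": where this step should be answered
    relation: str  # KG relation to follow (unused for text lookups)
    entity: str    # entity name, or "#1" meaning "answer of the previous step"


def decompose(question: str) -> List[SubQuestion]:
    """Rule-based stand-in for decomposition (assumption: one fixed template)."""
    match = re.fullmatch(r"Where does the author of (.+) work\?", question)
    if match is None:
        return []  # the toy rule does not cover this question
    return [
        SubQuestion(source="kg", relation="author", entity=match.group(1)),
        SubQuestion(source="text", relation="", entity="#1"),
    ]


def retrieve(sub: SubQuestion, previous: str) -> str:
    """Route the sub-question to its source and look up the evidence."""
    entity = previous if sub.entity == "#1" else sub.entity
    if sub.source == "kg":
        return TOY_KG.get((entity, sub.relation), "")
    return TOY_TEXT.get(entity, "")


def answer(question: str) -> str:
    """Chain the steps; the last piece of evidence stands in for generation."""
    previous = ""
    for sub in decompose(question):
        previous = retrieve(sub, previous)
    return previous


print(answer("Where does the author of Paper X work?"))
```

The first sub-question is resolved against the structured store and its answer is substituted into the second, which is resolved against the text store; a real system would replace each stand-in with learned components.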

Findings:
The experimental results show that integrating static and adaptive decomposition methods is more effective than static decomposition alone.

Value:
HySQA provides the community with resources for evaluating advances in scholarly QA research.
Original language: English
Title of host publication: Linking Meaning: Semantic Technologies Shaping the Future of AI: Proceedings of the 21st International Conference on Semantic Systems, 3-5 September 2025, Vienna, Austria
Editors: Blerina Spahiu, Sahar Vahdati, Angelo Salatino, Tassilo Pellegrini, Giray Havur
Number of pages: 17
Place of Publication: Amsterdam
Publisher: IOS Press BV
Publication date: 26.08.2025
Pages: 247-263
ISBN (electronic): 978-1-64368-616-5
Publication status: Published - 26.08.2025
Event: 21st International Conference on Semantic Systems: Linking Meaning: Semantic Technologies Shaping the Future of AI - Wien, Austria
Duration: 03.09.2025 - 05.09.2025
Conference number: 21

    Research areas

  • Business informatics - Scholarly hybrid questions, Scholarly Question Answering, Hybrid Question Answering, Complex Question Answering
