HySQA: Hybrid Scholarly Question Answering
Publication: Contributions to collected works › Chapter › peer-reviewed
Standard
Proceedings of the 21st International Conference on Semantic Systems, 3-5 September 2025, Vienna, Austria. Vol. 62 2025. p. 247 (Studies on the Semantic Web).
RIS
TY - CHAP
T1 - HySQA: Hybrid Scholarly Question Answering
AU - Taffa, Tilahun
AU - Banerjee, Debayan
AU - Assabie, Yaregal
AU - Usbeck, Ricardo
PY - 2025/8/26
Y1 - 2025/8/26
N2 - Purpose: The heterogeneity of scholarly information in knowledge graphs (KGs) and unstructured textual sources poses challenges in building robust Scholarly Question Answering (SQA) systems. Existing datasets and models typically address a narrow spectrum, focusing exclusively on KGs or unstructured sources and limiting evaluation to simple factoid questions. This gap leaves current systems unable to answer complex, hybrid scholarly questions that require integrating evidence from multiple heterogeneous data sources. Methodology: We introduce HySQA (Hybrid Scholarly Question Answering), a large-scale benchmarking dataset containing hybrid questions over scholarly KGs and Wikipedia text. HySQA contains complex questions that need to traverse facts across structured and unstructured sources. We also develop a baseline model that adaptively decomposes each question into sub-questions, identifies their answer sources, retrieves relevant information from SKGs and Wikipedia, and generates an answer using a hybrid augmented answer generation framework. Findings: The experimental results show that integrating static and adaptive decomposition methods is more effective than static decomposition alone. Value: Introducing HySQA provides the community with resources for evaluating the advancements in scholarly QA research.
AB - Purpose: The heterogeneity of scholarly information in knowledge graphs (KGs) and unstructured textual sources poses challenges in building robust Scholarly Question Answering (SQA) systems. Existing datasets and models typically address a narrow spectrum, focusing exclusively on KGs or unstructured sources and limiting evaluation to simple factoid questions. This gap leaves current systems unable to answer complex, hybrid scholarly questions that require integrating evidence from multiple heterogeneous data sources. Methodology: We introduce HySQA (Hybrid Scholarly Question Answering), a large-scale benchmarking dataset containing hybrid questions over scholarly KGs and Wikipedia text. HySQA contains complex questions that need to traverse facts across structured and unstructured sources. We also develop a baseline model that adaptively decomposes each question into sub-questions, identifies their answer sources, retrieves relevant information from SKGs and Wikipedia, and generates an answer using a hybrid augmented answer generation framework. Findings: The experimental results show that integrating static and adaptive decomposition methods is more effective than static decomposition alone. Value: Introducing HySQA provides the community with resources for evaluating the advancements in scholarly QA research.
M3 - Chapter
VL - 62
T3 - Studies on the Semantic Web
SP - 247
BT - Proceedings of the 21st International Conference on Semantic Systems, 3-5 September 2025, Vienna, Austria
ER -