Construct relation extraction from scientific papers: Is it automatable yet?

Research output: Contributions to collected editions/works › Published abstract in conference proceedings › Research › peer-review

Authors

The process of identifying relevant prior research articles is crucial for theoretical advancements, but often requires significant human effort. This study examines the feasibility of using large language models (LLMs) to support this task by extracting tested hypotheses, which consist of related constructs, moderators or mediators, path coefficients, and p-values, from empirical studies using structural equation modeling (SEM). We combine state-of-the-art LLMs with a variety of post-processing measures to improve the relation extraction quality. An extensive evaluation yields recall scores of up to 79.2% in construct entity extraction, 58.4% in construct-mediator/moderator-construct extraction, and 39.3% in extracting the full tested hypotheses. We provide a manually annotated dataset of 72 SEM articles and 749 construct relations to facilitate future research. Our findings offer critical insights and suggest promising directions for advancing the field of automated construct relation extraction from scholarly documents.
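The evaluation described above scores extracted construct relations by recall against a manually annotated gold standard. A minimal sketch of such a recall computation, using entirely hypothetical construct names and extraction output (the paper's actual annotation scheme and matching criteria may differ):

```python
# Minimal sketch (hypothetical data) of recall scoring for extracted
# construct relations against a manually annotated gold standard.

def recall(predicted: set, gold: set) -> float:
    """Fraction of gold-standard items recovered by the extraction system."""
    if not gold:
        return 0.0
    return len(predicted & gold) / len(gold)

# Gold relations: (construct_1, moderator/mediator or None, construct_2)
gold_relations = {
    ("trust", None, "purchase intention"),
    ("perceived ease of use", None, "perceived usefulness"),
    ("social influence", "gender", "behavioral intention"),
}

# Relations a hypothetical LLM pipeline extracted from the same article
predicted_relations = {
    ("trust", None, "purchase intention"),
    ("social influence", "gender", "behavioral intention"),
}

print(f"relation recall: {recall(predicted_relations, gold_relations):.1%}")
```

Here a relation counts as recovered only on an exact tuple match; fuzzier matching of construct names would be a natural post-processing refinement.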
Original language: English
Title of host publication: Proceedings of the 58th Hawaii International Conference on System Sciences 2025
Pages: 4675–4684
Number of pages: 10
Publication date: 2025
ISBN (electronic): 978-0-9981331-8-8
Publication status: Published - 2025
Event: 58th Hawaii International Conference on System Sciences - HICSS 2025 - Hilton Waikoloa Village, Waikoloa, United States
Duration: 07.01.2025 – 10.01.2025
Conference number: 58