WikiEvents - A Novel Resource for NLP Downstream Tasks

Publication: Contributions to collected editions, Articles in conference proceedings, Research, peer-reviewed

Authors

Efficient Natural Language Processing (NLP) models require large amounts of training data, and creating that data manually is time-consuming. We present WikiEvents, an automatically curated dataset based on Wikipedia’s Current Events portal. WikiEvents is a novel knowledge graph that aims to provide data for various event-centric NLP tasks, such as event-related location extraction and entity linking. To this end, WikiEvents includes event summaries with linked entities and locations. It also provides spatial and temporal information about the extracted events to support analyses for various use cases. We leverage the NLP Interchange Format (NIF) ontology and a novel event-specific ontology, CoyPu. We evaluate the suitability of WikiEvents for NLP tasks by (1) training three BERT models on event-related location extraction with data queried from WikiEvents and (2) comparing WikiEvents to the existing entity linking dataset AIDA-YAGO2. We explore its capabilities for qualitative, event-related research by querying data from WikiEvents for multiple use cases and visualizing the results.

Original language: English
Title: ESWC 2023 Workshops and Tutorials Joint Proceedings: Joint Proceedings of the ESWC 2023 Workshops and Tutorials, Hersonissos, Greece, May 28-29, 2023
Editors: Mehwish Alam, Cassia Trojahn, Sven Hertling, Catia Pesquita, Christian Aebeloe, Hidir Aras, Amr Azzam, Juan Cano, John Domingue, Simon Gottschalk, Olaf Hartig, Katja Hose, Sabrina Kirrane, Pasquale Lisena, Francesco Osborne, Philipp Rohde, Luc Steels, Ruben Taelman, Aisling Third, Ilaria Tiddi, Rima Türker
Volume: 3443
Publisher: Sun Site Central Europe (RWTH Aachen University)
Publication date: 2023
Publication status: Published - 2023
Externally published: Yes
Event: Joint of the 20th European Semantic Web Conference - Workshops and Tutorials, ESWC-JP 2023 - Hersonissos, Greece
Duration: 28.05.2023 - 29.05.2023
https://2023.eswc-conferences.org/about/

Bibliographical note

Funding Information:
This research was supported by grants from NVIDIA and utilized 2 x NVIDIA RTX A5000 24GB GPUs. Furthermore, we acknowledge the financial support from the Federal Ministry for Economic Affairs and Energy of Germany in the project CoyPu (project number 01MK21007[G]).

Publisher Copyright:
© 2023 Copyright for this paper by its authors.