GERBIL – Benchmarking named entity recognition and linking consistently

Publication: Contribution to journal, journal article, research, peer-reviewed

Authors

The ability to compare systems from the same domain is of central importance for their introduction into complex applications. In the domains of named entity recognition and entity linking, the large number of systems and their orthogonal evaluation with respect to measures and datasets have led to an unclear landscape regarding the abilities and weaknesses of the different approaches. We present GERBIL, an improved platform for repeatable, storable and citable semantic annotation experiments, and the extensions made to it since its release. GERBIL has narrowed this evaluation gap by generating concise, archivable, human- and machine-readable experiments, analytics and diagnostics. The rationale behind our framework is to provide developers, end users and researchers with easy-to-use interfaces that allow for the agile, fine-grained and uniform evaluation of annotation tools on multiple datasets. By these means, we aim to ensure that both tool developers and end users can derive meaningful insights into the extension, integration and use of annotation applications. In particular, GERBIL provides comparable results to tool developers, simplifying the discovery of strengths and weaknesses of their implementations with respect to the state of the art. With the permanent experiment URIs provided by our framework, we ensure the reproducibility and archiving of evaluation results. Moreover, the framework generates data in a machine-processable format, allowing for the efficient querying and post-processing of evaluation results. Additionally, the tool diagnostics offered by GERBIL give insights into the areas where tools need further refinement, thus allowing developers to create an informed agenda for extensions and end users to identify the right tools for their purposes. Finally, we implemented additional types of experiments, including entity typing.
GERBIL aims to become a focal point for the state of the art, driving the research agenda of the community by presenting comparable, objective evaluation results. Furthermore, we tackle the central problem of the evaluation of entity linking, i.e., we answer the question of how an evaluation algorithm can compare two URIs to each other without being bound to a specific knowledge base. Our approach to this problem opens a way to address the deprecation of URIs in existing gold standards for named entity recognition and entity linking, a feature that is currently not supported by the state of the art. We derived the importance of this feature from usage and dataset requirements collected from the GERBIL user community, which has already carried out more than 24,000 single evaluations using our framework. Through the resulting updates, GERBIL now supports 8 tasks, 46 datasets and 20 systems.

Original language: English
Journal: Semantic Web
Volume: 9
Issue number: 5
ISSN: 1570-0844
DOIs
Publication status: Published - 2018
Externally published: Yes

Bibliographical note

Funding Information:
This work was supported by the German Federal Ministry of Education and Research under the project number 03WKCJ4D and the Eurostars projects DIESEL (E!9367) and QAMEL (E!9725) as well as the European Union’s H2020 research and innovation action HOBBIT under the Grant Agreement number 688227.

Publisher Copyright:
© 2018 IOS Press. All rights reserved.
