Surveying the FAIRness of Annotation Tools: Difficult to find, difficult to reuse

Ekaterina Borisova; Raia Abu Ahmad; Leyla Jael Garcia-Castro; Ricardo Usbeck; Georg Rehm

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Professorship for Information Systems, in particular Artificial Intelligence and Explainability

In the realm of Machine Learning and Deep Learning, there is a need for high-quality annotated data to train and evaluate supervised models. An extensive number of annotation tools have been developed to facilitate the data labelling process. However, finding the right tool is a demanding task involving thorough searching and testing. Hence, to effectively navigate the multitude of tools, it becomes essential to ensure their findability, accessibility, interoperability, and reusability (FAIR). This survey addresses the FAIRness of existing annotation software by evaluating 50 different tools against the FAIR principles for research software (FAIR4RS). The study indicates that while being accessible and interoperable, annotation tools are difficult to find and reuse. In addition, there is a need to establish community standards for annotation software development, documentation, and distribution.

Original language	English
Title of host publication	LAW 2024 - 18th Linguistic Annotation Workshop, Co-located with EACL 2024: Proceedings of the Workshop
Editors	Sophie Henning, Manfred Stede
Number of pages	17
Place of Publication	Stroudsburg
Publisher	Association for Computational Linguistics (ACL)
Publication date	01.03.2024
Pages	29-45
ISBN (electronic)	979-8-89176-073-8
Publication status	Published - 01.03.2024
Event	18th Linguistic Annotation Workshop - St. Julians, Malta
Duration	21.03.2024 → 22.03.2024
Conference number	18
Event URL	https://www.aclweb.org/portal/content/first-call-papers-18th-linguistic-annotation-workshop

Bibliographical note

Publisher Copyright:
© 2024 Association for Computational Linguistics.

Research areas

Business informatics
