Best Practices in AI and Data Science Models Evaluation

Publication: Contributions to collected editions › Papers in conference proceedings › Research › peer-reviewed


Evaluating Artificial Intelligence (AI) and data science models is crucial to ensuring their reliability, fairness, and applicability in real-world scenarios. This paper highlights best practices for model evaluation, emphasizing the importance of selecting appropriate metrics aligned with business or research goals. Key considerations include using robust validation strategies (e.g., cross-validation), monitoring for overfitting, and ensuring that data splits preserve class distributions. Fairness, interpretability, and reproducibility are essential, particularly in high-stakes domains such as healthcare or finance. Additionally, evaluating models across multiple datasets or demographic subgroups helps uncover biases and improve generalizability. Adopting standardized reporting practices and open-source benchmarks further strengthens the evaluation process. By adhering to these practices, practitioners can build more trustworthy and effective AI systems.
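
The paper itself is not reproduced here, but two of the practices named in the abstract translate directly into code: validation splits that preserve class distributions, and metric comparisons across demographic subgroups. The sketch below is a minimal illustration using scikit-learn; the synthetic dataset, the F1 metric, and the random group attribute are assumptions for demonstration, not taken from the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict, cross_val_score

# Synthetic, imbalanced data standing in for a real dataset (assumption).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
model = LogisticRegression(max_iter=1000)

# Stratified folds preserve the 90/10 class ratio in every split,
# so the validation metric is not distorted by unlucky partitions.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="f1")
print(f"F1 per fold: {np.round(scores, 3)}, mean: {scores.mean():.3f}")

# Subgroup evaluation on out-of-fold predictions: 'group' is a
# hypothetical demographic attribute; a large metric gap between
# groups would flag a potential fairness problem.
pred = cross_val_predict(model, X, y, cv=cv)
group = np.random.default_rng(0).integers(0, 2, size=len(y))
for g in np.unique(group):
    mask = group == g
    print(f"group {g}: F1 = {f1_score(y[mask], pred[mask]):.3f}")
```

The same pattern extends to any scoring function: replacing `scoring="f1"` with a metric matched to the actual business or research goal is exactly the metric-selection step the abstract emphasizes.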
Original language: English
Title of host publication: INFORMATIK 2025: The Wide Open - Offenheit von Source bis Science, 16.-19. September 2025, Potsdam
Editors: Ulrike Lucke, Stefan Stieglitz, Falk Uebernickel, Anna-Lena Lamprecht, Maike Klein
Number of pages: 9
Place of publication: Bonn
Publisher: Gesellschaft für Informatik, Bonn
Publication date: 2025
Pages: 1211-1219
DOIs
Publication status: Published - 2025
