Best Practices in AI and Data Science Models Evaluation

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

Best Practices in AI and Data Science Models Evaluation. / Banerjee, Debayan; Taffa, Tilahun Abedissa; Usbeck, Ricardo.
INFORMATIK 2025 : The Wide Open - Offenheit von Source bis Science, 16.-19. September 2025, Potsdam. ed. / Ulrike Lucke; Stefan Stieglitz; Falk Uebernickel; Anna-Lena Lamprecht; Maike Klein. Bonn: Gesellschaft für Informatik, 2025. p. 1211-1219 (Lecture Notes in Informatics; Vol. P366).


Harvard

Banerjee, D, Taffa, TA & Usbeck, R 2025, Best Practices in AI and Data Science Models Evaluation. in U Lucke, S Stieglitz, F Uebernickel, A-L Lamprecht & M Klein (eds), INFORMATIK 2025 : The Wide Open - Offenheit von Source bis Science, 16.-19. September 2025, Potsdam. Lecture Notes in Informatics, vol. P366, Gesellschaft für Informatik, Bonn, pp. 1211-1219. https://doi.org/10.18420/inf2025_105

APA

Banerjee, D., Taffa, T. A., & Usbeck, R. (2025). Best Practices in AI and Data Science Models Evaluation. In U. Lucke, S. Stieglitz, F. Uebernickel, A.-L. Lamprecht, & M. Klein (Eds.), INFORMATIK 2025 : The Wide Open - Offenheit von Source bis Science, 16.-19. September 2025, Potsdam (pp. 1211-1219). (Lecture Notes in Informatics; Vol. P366). Gesellschaft für Informatik. https://doi.org/10.18420/inf2025_105

Vancouver

Banerjee D, Taffa TA, Usbeck R. Best Practices in AI and Data Science Models Evaluation. In Lucke U, Stieglitz S, Uebernickel F, Lamprecht AL, Klein M, editors, INFORMATIK 2025 : The Wide Open - Offenheit von Source bis Science, 16.-19. September 2025, Potsdam. Bonn: Gesellschaft für Informatik; 2025. p. 1211-1219. (Lecture Notes in Informatics). doi: 10.18420/inf2025_105

Bibtex

@inbook{f31e3aed334e45eb8f9eb5a36f0778b7,
title = "Best Practices in AI and Data Science Models Evaluation",
abstract = "Evaluating Artificial Intelligence (AI) and data science models is crucial to ensure their reliability, fairness, and applicability in real-world scenarios. This paper highlights best practices for model evaluation, emphasizing the importance of selecting appropriate metrics aligned with business or research goals. Key considerations include using robust validation strategies (e.g., cross-validation), monitoring for overfitting, and ensuring data splits preserve class distributions. Fairness, interpretability, and reproducibility are essential, particularly in high-stakes domains like healthcare or finance. Additionally, evaluating models across multiple datasets or demographic subgroups helps uncover biases and improve generalizability. Adopting standardized reporting practices and open-source benchmarks further strengthens the evaluation process. By adhering to these practices, practitioners can build more trustworthy and effective AI systems.",
keywords = "Business informatics, AI, Data science, Best Practices, Machine learning, Evaluation",
author = "Debayan Banerjee and Taffa, {Tilahun Abedissa} and Ricardo Usbeck",
year = "2025",
doi = "10.18420/inf2025_105",
language = "English",
series = "Lecture Notes in Informatics",
publisher = "Gesellschaft f{\"u}r Informatik, Bonn",
pages = "1211--1219",
editor = "Ulrike Lucke and Stefan Stieglitz and Falk Uebernickel and Anna-Lena Lamprecht and Maike Klein",
booktitle = "INFORMATIK 2025",
}

RIS

TY - CHAP

T1 - Best Practices in AI and Data Science Models Evaluation

AU - Banerjee, Debayan

AU - Taffa, Tilahun Abedissa

AU - Usbeck, Ricardo

PY - 2025

Y1 - 2025

N2 - Evaluating Artificial Intelligence (AI) and data science models is crucial to ensure their reliability, fairness, and applicability in real-world scenarios. This paper highlights best practices for model evaluation, emphasizing the importance of selecting appropriate metrics aligned with business or research goals. Key considerations include using robust validation strategies (e.g., cross-validation), monitoring for overfitting, and ensuring data splits preserve class distributions. Fairness, interpretability, and reproducibility are essential, particularly in high-stakes domains like healthcare or finance. Additionally, evaluating models across multiple datasets or demographic subgroups helps uncover biases and improve generalizability. Adopting standardized reporting practices and open-source benchmarks further strengthens the evaluation process. By adhering to these practices, practitioners can build more trustworthy and effective AI systems.

AB - Evaluating Artificial Intelligence (AI) and data science models is crucial to ensure their reliability, fairness, and applicability in real-world scenarios. This paper highlights best practices for model evaluation, emphasizing the importance of selecting appropriate metrics aligned with business or research goals. Key considerations include using robust validation strategies (e.g., cross-validation), monitoring for overfitting, and ensuring data splits preserve class distributions. Fairness, interpretability, and reproducibility are essential, particularly in high-stakes domains like healthcare or finance. Additionally, evaluating models across multiple datasets or demographic subgroups helps uncover biases and improve generalizability. Adopting standardized reporting practices and open-source benchmarks further strengthens the evaluation process. By adhering to these practices, practitioners can build more trustworthy and effective AI systems.

KW - Business informatics

KW - AI

KW - Data science

KW - Best Practices

KW - Machine learning

KW - Evaluation

UR - https://dl.gi.de/items/1739d595-ebff-416d-b476-8a5344e0e9d6

UR - https://dl.gi.de/collections/910b20e6-455a-4929-a0cd-6f12210ce5f4

U2 - 10.18420/inf2025_105

DO - 10.18420/inf2025_105

M3 - Article in conference proceedings

T3 - Lecture Notes in Informatics

SP - 1211

EP - 1219

BT - INFORMATIK 2025

A2 - Lucke, Ulrike

A2 - Stieglitz, Stefan

A2 - Uebernickel, Falk

A2 - Lamprecht, Anna-Lena

A2 - Klein, Maike

PB - Gesellschaft für Informatik, Bonn

CY - Bonn

ER -
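The abstract above names two concrete practices: cross-validation and data splits that preserve class distributions. A minimal sketch of both, assuming scikit-learn; the model, dataset, and metric choices here are illustrative, not taken from the paper:

```python
# Stratified k-fold cross-validation: each fold keeps the same class
# proportions as the full dataset, one of the practices the abstract names.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Imbalanced toy data (roughly 90% / 10%) to show why stratification matters.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
# Under class imbalance, F1 is usually more informative than raw accuracy.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         scoring="f1", cv=cv)
print(f"F1 per fold: {scores.round(3)}, mean: {scores.mean():.3f}")
```

Reporting the per-fold scores rather than a single number also supports the reproducibility and standardized-reporting points made in the abstract.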