Estimation of minimal data sets sizes for machine learning predictions in digital mental health interventions

Publication: Contributions to journals › Journal articles › Research › Peer-reviewed

Authors

  • Kirsten Zantvoort
  • Barbara Nacke
  • Dennis Görlich
  • Silvan Hornstein
  • Corinna Jacobi
  • Burkhardt Funk

Artificial intelligence promises to revolutionize mental health care, but small dataset sizes and lack of robust methods raise concerns about result generalizability. To provide insights on minimal necessary data set sizes, we explore domain-specific learning curves for digital intervention dropout predictions based on 3654 users from a single study (ISRCTN13716228, 26/02/2016). Prediction performance is analyzed based on dataset size (N = 100–3654), feature groups (F = 2–129), and algorithm choice (from Naive Bayes to Neural Networks). The results substantiate the concern that small datasets (N ≤ 300) overestimate predictive power. For uninformative feature groups, in-sample prediction performance was negatively correlated with dataset size. Sophisticated models overfitted in small datasets but maximized holdout test results in larger datasets. While N = 500 mitigated overfitting, performance did not converge until N = 750–1500. Consequently, we propose minimum dataset sizes of N = 500–1000. As such, this study offers an empirical reference for researchers designing or interpreting AI studies on Digital Mental Health Intervention data.
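The abstract's observation that uninformative features yield inflated in-sample performance in small datasets can be illustrated with a small simulation. The sketch below is not the study's pipeline: it assumes purely synthetic Gaussian noise features with coin-flip labels, and uses a nearest-centroid classifier as a simple stand-in for the algorithms compared in the paper. All function names and parameters (`make_noise_data`, `fit_centroids`, `learning_curve`, `n_features=30`) are hypothetical choices made for illustration.

```python
import random


def make_noise_data(n, n_features, rng):
    """Return pure Gaussian noise features and coin-flip labels (no signal)."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n)]
    y = [rng.randint(0, 1) for _ in range(n)]
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid 'training': one mean vector per class."""
    n_features = len(X[0])
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        if rows:
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        else:  # degenerate sample that lacks this class entirely
            centroids[label] = [0.0] * n_features
    return centroids


def accuracy(centroids, X, y):
    """Share of rows whose closer centroid matches the label."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    hits = 0
    for x, t in zip(X, y):
        pred = 0 if sq_dist(x, centroids[0]) <= sq_dist(x, centroids[1]) else 1
        hits += pred == t
    return hits / len(y)


def learning_curve(sizes, n_features=30, n_test=500, n_repeats=5, seed=0):
    """Map each training size to (mean in-sample, mean holdout) accuracy."""
    rng = random.Random(seed)
    curve = {}
    for n in sizes:
        in_sample, holdout = 0.0, 0.0
        for _ in range(n_repeats):
            X_train, y_train = make_noise_data(n, n_features, rng)
            X_test, y_test = make_noise_data(n_test, n_features, rng)
            model = fit_centroids(X_train, y_train)
            in_sample += accuracy(model, X_train, y_train)
            holdout += accuracy(model, X_test, y_test)
        curve[n] = (in_sample / n_repeats, holdout / n_repeats)
    return curve


if __name__ == "__main__":
    # Sizes loosely echo the N range discussed in the abstract.
    for n, (train_acc, test_acc) in learning_curve([100, 300, 500, 1000, 3000]).items():
        print(f"N={n:5d}  in-sample={train_acc:.3f}  holdout={test_acc:.3f}")
```

Because the labels are independent of the features, holdout accuracy should hover around chance at every size, while in-sample accuracy should start well above chance and shrink toward it as N grows. The gap between the two columns is the overfitting that a learning curve makes visible.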

Original language: English
Article number: 361
Journal: npj Digital Medicine
Volume: 7
Issue number: 1
Number of pages: 10
Publication status: Published - 12.2024

Bibliographical note

Publisher Copyright:
© The Author(s) 2024.
