Finding the Best Match — a Case Study on the (Text‑) Feature and Model Choice in Digital Mental Health Interventions
Research output: Journal contributions › Journal articles › Research › peer-review
Standard
In: Journal of Healthcare Informatics Research, Vol. 7, No. 4, 00148, 12.2023, p. 447-479.
RIS
TY - JOUR
T1 - Finding the Best Match — a Case Study on the (Text‑) Feature and Model Choice in Digital Mental Health Interventions
AU - Zantvoort, Kirsten
AU - Scharfenberger, Jonas
AU - Boß, Leif
AU - Lehr, Dirk
AU - Funk, Burkhardt
N1 - Funding Information: Open Access funding enabled and organized by Projekt DEAL. The present study has been funded by Leuphana University. The original RCTs were funded by the European Union (project EFRE: CCI 2007DE161PR001). Publisher Copyright: © 2023, The Author(s).
PY - 2023/12
Y1 - 2023/12
N2 - With the need for psychological help long exceeding the supply, finding ways of scaling and better allocating mental health support is a necessity. This paper contributes by investigating how to best predict intervention dropout and failure to allow for a need-based adaptation of treatment. We systematically compare the predictive power of different text representation methods (metadata, TF-IDF, sentiment and topic analysis, and word embeddings) in combination with supplementary numerical inputs (socio-demographic, evaluation, and closed-question data). Additionally, we address the research gap of which ML model types, ranging from linear to sophisticated deep learning models, are best suited for different features and outcome variables. To this end, we analyze nearly 16,000 open-text answers from 849 German-speaking users in a Digital Mental Health Intervention (DMHI) for stress. Our research proves that, contrary to previous findings, there is great promise in using neural network approaches on DMHI text data. We propose a task-specific LSTM-based model architecture to tackle the challenge of long input sequences and thereby demonstrate the potential of word embeddings (AUC scores of up to 0.7) for predictions in DMHIs. Despite the relatively small data set, sequential deep learning models, on average, outperform simpler features such as metadata and bag-of-words approaches when predicting dropout. The conclusion is that user-generated text of the first two sessions carries predictive power regarding patients’ dropout and intervention failure risk. Furthermore, the match between the sophistication of features and models needs to be closely considered to optimize results, and additional nontext features increase prediction results.
KW - E-mental health
KW - Health care analytics
KW - Machine learning
KW - Natural language processing
KW - Precision psychiatry
UR - http://www.scopus.com/inward/record.url?scp=85171458668&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/c1ccce9b-85a7-3e08-9249-6abbcf3b5023/
U2 - 10.1007/s41666-023-00148-z
DO - 10.1007/s41666-023-00148-z
M3 - Journal articles
C2 - 37927375
VL - 7
SP - 447
EP - 479
JO - Journal of Healthcare Informatics Research
JF - Journal of Healthcare Informatics Research
SN - 2509-498X
IS - 4
M1 - 00148
ER -