Finding the Best Match — a Case Study on the (Text‑) Feature and Model Choice in Digital Mental Health Interventions

Kirsten Zantvoort; Jonas Scharfenberger; Leif Boß; Dirk Lehr; Burkhardt Funk

doi:10.1007/s41666-023-00148-z

Research output: Journal contributions › Journal articles › Research › peer-review

Standard

Finding the Best Match — a Case Study on the (Text‑) Feature and Model Choice in Digital Mental Health Interventions. / Zantvoort, Kirsten; Scharfenberger, Jonas; Boß, Leif et al.
In: Journal of Healthcare Informatics Research, Vol. 7, No. 4, 00148, 12.2023, p. 447-479.

Bibtex

@article{c90d7531bb5b4c94911b929e62aade9b,

title = "Finding the Best Match — a Case Study on the (Text‑) Feature and Model Choice in Digital Mental Health Interventions",

abstract = "With the need for psychological help long exceeding the supply, finding ways of scaling, and better allocating mental health support is a necessity. This paper contributes by investigating how to best predict intervention dropout and failure to allow for a need-based adaptation of treatment. We systematically compare the predictive power of different text representation methods (metadata, TF-IDF, sentiment and topic analysis, and word embeddings) in combination with supplementary numerical inputs (socio-demographic, evaluation, and closed-question data). Additionally, we address the research gap of which ML model types — ranging from linear to sophisticated deep learning models — are best suited for different features and outcome variables. To this end, we analyze nearly 16,000 open-text answers from 849 German-speaking users in a Digital Mental Health Intervention (DMHI) for stress. Our research proves that — contrary to previous findings — there is great promise in using neural network approaches on DMHI text data. We propose a task-specific LSTM-based model architecture to tackle the challenge of long input sequences and thereby demonstrate the potential of word embeddings (AUC scores of up to 0.7) for predictions in DMHIs. Despite the relatively small data set, sequential deep learning models, on average, outperform simpler features such as metadata and bag-of-words approaches when predicting dropout. The conclusion is that user-generated text of the first two sessions carries predictive power regarding patients{\textquoteright} dropout and intervention failure risk. Furthermore, the match between the sophistication of features and models needs to be closely considered to optimize results, and additional nontext features increase prediction results.",

keywords = "E-mental health, Health care analytics, Machine learning, Natural language processing, Precision psychiatry",

author = "Kirsten Zantvoort and Jonas Scharfenberger and Leif Bo{\ss} and Dirk Lehr and Burkhardt Funk",

note = "Funding Information: Open Access funding enabled and organized by Projekt DEAL. The present study has been funded by Leuphana University. The original RCTs were funded by the European Union (project EFRE: CCI 2007DE161PR001). Publisher Copyright: {\textcopyright} 2023, The Author(s).",

year = "2023",

month = dec,

doi = "10.1007/s41666-023-00148-z",

language = "English",

volume = "7",

pages = "447--479",

journal = "Journal of Healthcare Informatics Research",

issn = "2509-498X",

publisher = "Springer International Publishing",

number = "4",

}

RIS

TY - JOUR

T1 - Finding the Best Match — a Case Study on the (Text‑) Feature and Model Choice in Digital Mental Health Interventions

AU - Zantvoort, Kirsten

AU - Scharfenberger, Jonas

AU - Boß, Leif

AU - Lehr, Dirk

AU - Funk, Burkhardt

N1 - Funding Information: Open Access funding enabled and organized by Projekt DEAL. The present study has been funded by Leuphana University. The original RCTs were funded by the European Union (project EFRE: CCI 2007DE161PR001). Publisher Copyright: © 2023, The Author(s).

PY - 2023/12

Y1 - 2023/12

N2 - With the need for psychological help long exceeding the supply, finding ways of scaling, and better allocating mental health support is a necessity. This paper contributes by investigating how to best predict intervention dropout and failure to allow for a need-based adaptation of treatment. We systematically compare the predictive power of different text representation methods (metadata, TF-IDF, sentiment and topic analysis, and word embeddings) in combination with supplementary numerical inputs (socio-demographic, evaluation, and closed-question data). Additionally, we address the research gap of which ML model types — ranging from linear to sophisticated deep learning models — are best suited for different features and outcome variables. To this end, we analyze nearly 16,000 open-text answers from 849 German-speaking users in a Digital Mental Health Intervention (DMHI) for stress. Our research proves that — contrary to previous findings — there is great promise in using neural network approaches on DMHI text data. We propose a task-specific LSTM-based model architecture to tackle the challenge of long input sequences and thereby demonstrate the potential of word embeddings (AUC scores of up to 0.7) for predictions in DMHIs. Despite the relatively small data set, sequential deep learning models, on average, outperform simpler features such as metadata and bag-of-words approaches when predicting dropout. The conclusion is that user-generated text of the first two sessions carries predictive power regarding patients’ dropout and intervention failure risk. Furthermore, the match between the sophistication of features and models needs to be closely considered to optimize results, and additional nontext features increase prediction results.

AB - With the need for psychological help long exceeding the supply, finding ways of scaling, and better allocating mental health support is a necessity. This paper contributes by investigating how to best predict intervention dropout and failure to allow for a need-based adaptation of treatment. We systematically compare the predictive power of different text representation methods (metadata, TF-IDF, sentiment and topic analysis, and word embeddings) in combination with supplementary numerical inputs (socio-demographic, evaluation, and closed-question data). Additionally, we address the research gap of which ML model types — ranging from linear to sophisticated deep learning models — are best suited for different features and outcome variables. To this end, we analyze nearly 16,000 open-text answers from 849 German-speaking users in a Digital Mental Health Intervention (DMHI) for stress. Our research proves that — contrary to previous findings — there is great promise in using neural network approaches on DMHI text data. We propose a task-specific LSTM-based model architecture to tackle the challenge of long input sequences and thereby demonstrate the potential of word embeddings (AUC scores of up to 0.7) for predictions in DMHIs. Despite the relatively small data set, sequential deep learning models, on average, outperform simpler features such as metadata and bag-of-words approaches when predicting dropout. The conclusion is that user-generated text of the first two sessions carries predictive power regarding patients’ dropout and intervention failure risk. Furthermore, the match between the sophistication of features and models needs to be closely considered to optimize results, and additional nontext features increase prediction results.

KW - E-mental health

KW - Health care analytics

KW - Machine learning

KW - Natural language processing

KW - Precision psychiatry

UR - http://www.scopus.com/inward/record.url?scp=85171458668&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/c1ccce9b-85a7-3e08-9249-6abbcf3b5023/

U2 - 10.1007/s41666-023-00148-z

DO - 10.1007/s41666-023-00148-z

M3 - Journal articles

C2 - 37927375

VL - 7

SP - 447

EP - 479

JO - Journal of Healthcare Informatics Research

JF - Journal of Healthcare Informatics Research

SN - 2509-498X

IS - 4

M1 - 00148

ER -

Other publications by the same author(s)

Assessing the Cultural Fit of a Digital Sleep Intervention for Refugees in Germany: Qualitative Study

Blomenkamp, M., Kiesel, A., Baumeister, H., Lehr, D., Unterrainer, J., Sander, L. B. & Spanhel, K., 03.04.2025, In: JMIR Formative Research. 9, 15 p., e65412.

Research output: Journal contributions › Journal articles › Research › peer-review

Capitalizing on natural language processing (NLP) to automate the evaluation of coach implementation fidelity in guided digital cognitive-behavioral therapy (GdCBT)

Zainal, N. H., Eckhardt, R., Rackoff, G. N., Fitzsimmons-Craft, E. E., Rojas-Ashe, E., Barr Taylor, C., Funk, B., Eisenberg, D., Wilfley, D. E. & Newman, M. G., 02.04.2025, In: Psychological Medicine. 55, e106.

Research output: Journal contributions › Journal articles › Research › peer-review

Construct relation extraction from scientific papers: Is it automatable yet?

Funk, B. & Scharfenberger, J., 07.01.2025, Proceedings of the 58th Hawaii International Conference on System Sciences, HICSS 2025. Bui, T. X. (ed.). Honolulu: University of Hawaii at Manoa, p. 4675-4684 10 p. (Hawaii International Conference on System Sciences (HICSS); vol. 2025).

Research output: Contributions to collected editions/works › Published abstract in conference proceedings › Research › peer-review

Effectiveness of a gratitude app at reducing repetitive negative thinking as a transdiagnostic risk factor in the general population: Results from a pragmatic randomized controlled trial

Kalon, L. S., Freund, H., Rinn, A., Watkins, P. C., Zarski, A. C. & Lehr, D., 15.11.2025, In: Journal of Affective Disorders. 389, 11 p., 119664.

Research output: Journal contributions › Journal articles › Research › peer-review

Effectiveness of an integrated platform-based intervention for promoting psychosocial safety climate and mental health in nursing staff: A pragmatic cluster randomised controlled trial

Boß, L., Ross, J., Reis, D., Pischel, S., Mallwitz, T., Brückner, H., Tanner, G., Nissen, H., Kalon, L., Schümann, M., Lennefer, T., Janneck, M., Felfe, J., Ducki, A. & Lehr, D., 01.07.2025, In: International Journal of Nursing Studies. 167, 14 p., 105076.

Research output: Journal contributions › Journal articles › Research › peer-review

DOI

https://doi.org/10.1007/s41666-023-00148-z
Final published version
