A Survey on Mental Health Datasets and Resources

MCML Authors

Bolei Ma

→ Group Frauke Kreuter
Social Data Science and AI

Abstract

Computational approaches to mental health have become an increasingly important area of AI research, supported by a growing number of datasets. This survey presents a dataset-centric review of mental health resources from 2001 to 2025, focusing on how mental health states are defined, represented, and evaluated in NLP. We analyze datasets across modalities, organizing them by condition, data source, labeling strategy, task formulation, and evaluation practice. Finally, we identify recurring challenges and opportunities in existing resources to inform the development of more clinically meaningful and responsible datasets. Overall, we aim to provide a holistic review that clarifies the current state of the field and guides the development of future mental health resources and applications.
