What question did this study set out to answer?

This study aims to assess the role of linguistic markers alongside acoustic features in identifying depression in a clinical, Spanish-speaking population.

March 8, 2026

Open Access

Beyond acoustic features: Incorporating linguistic variables in automatic speech analysis for depression detection

Key Points

Evaluated 151 participants: 80 patients with major depressive disorder (MDD) or persistent depressive disorder (PDD) and 71 healthy controls.
Participants answered 11 open-ended questions about depressive symptoms and well-being via a web-based platform.
Extracted linguistic variables and acoustic variables spanning four categories: prosodic, cepstral, spectral, and Teager Energy Operator (TEO)-based.
Performed group comparisons and logistic regressions to assess the predictive value of the acoustic and linguistic features.
Used machine learning models to compare acoustic, linguistic, and ensemble classifiers.
Among acoustic features, TEO-based and cepstral features showed the strongest predictive power for depression.
Linguistic features, including greater verb use and smaller vocabulary size, remained strong predictors of depression after adjusting for covariates.
The linguistic model outperformed the acoustic model (AUC = 0.86 vs. 0.79).
The ensemble model combining both feature sets matched the linguistic model's AUC (0.86), with slightly better accuracy (0.84) and specificity (0.93).
Age-stratified ensemble analyses showed the best performance for individuals aged ≤45 years (AUC = 0.90).
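
The key points report accuracy, specificity, and AUC. As a reminder of what those numbers measure, here is a minimal sketch in plain Python. The labels and scores are made-up toy values, not data from the study, and the function names and the 0.5 decision threshold are illustrative assumptions rather than the authors' evaluation code.

```python
# Toy illustration of the reported metrics; NOT the study's data or code.
# Label 1 = depressed, label 0 = healthy control (assumed coding).

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    """Share of all participants classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def specificity(y_true, y_pred):
    """Share of healthy controls correctly classified as controls."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)

def auc(y_true, scores):
    """Probability that a random positive outscores a random negative
    (ties count as half), which equals the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model outputs for six participants.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

toy_accuracy = accuracy(y_true, y_pred)
toy_specificity = specificity(y_true, y_pred)
toy_auc = auc(y_true, scores)
```

AUC does not depend on the threshold, whereas accuracy and specificity do. That is one way two models can share an AUC and still differ in accuracy and specificity, as reported for the linguistic and ensemble models.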

Abstract

Most research on automatic speech analysis (ASA) has focused on acoustic features, while the potential of linguistic markers remains underexplored, particularly in clinically diagnosed, non-English-speaking populations. This study evaluated the integration of acoustic and linguistic markers for detecting depression in a Spanish-speaking clinical sample. The sample comprised 151 participants: 80 patients with major depressive disorder (MDD) or persistent depressive disorder (PDD) recruited from the Psychiatry Department of Vall d'Hebron University Hospital and 71 healthy controls. Participants answered 11 open-ended questions related to depressive symptoms and well-being via a web-based platform. Linguistic variables were extracted, along with acoustic variables spanning four categories: prosodic, cepstral, spectral, and Teager Energy Operator (TEO)-based features. Group comparisons and logistic regressions were performed to assess the predictive value of acoustic and linguistic features. Machine learning models were used to compare the performance of acoustic, linguistic, and ensemble classification models, the last of which combined both feature sets. Among acoustic features, TEO-based and cepstral features showed the strongest predictive power. Greater use of verbs, reduced use of nouns and past-tense verbs, smaller vocabulary size, and increased use of shorter words and sentences remained strong predictors of depression after adjusting for covariates. The linguistic model outperformed the acoustic model (AUC = 0.86 vs. 0.79), while the ensemble model achieved comparable overall performance (AUC = 0.86), with slightly improved accuracy (0.84) and specificity (0.93). Integrating linguistic features into automated speech analysis shows promise for depression detection. With further validation and refinement, brief speech-based assessments could support early depression detection in primary care.

• Combined acoustic and linguistic features for depression detection in a clinically diagnosed, Spanish-speaking sample.
• Among acoustic features, TEO-based and cepstral features showed the strongest predictive power.
• The linguistic model outperformed the acoustic model (AUC = 0.86 vs. 0.79).
• An ensemble model combining both acoustic and linguistic features showed the highest accuracy (0.84) and specificity (0.93).
• Age-stratified ensemble analyses revealed optimal performance for individuals aged ≤45 years (AUC = 0.90).
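
The abstract names TEO-based features without defining the operator. For orientation, the standard discrete Teager Energy Operator is ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below applies it to a synthetic sinusoid. It is a generic textbook illustration: the abstract does not say which TEO-derived features the authors computed or how, and the function name and test signal are assumptions made here.

```python
import math

def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns one value per interior sample, so the output is two samples
    shorter than the input.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Synthetic test signal (hypothetical values, not speech from the study):
# a pure sinusoid with amplitude A and angular frequency omega (rad/sample).
A = 2.0
omega = 0.3
signal = [A * math.cos(omega * n) for n in range(50)]

psi = teager_energy(signal)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = (A ** 2) * math.sin(omega) ** 2
```

Because the output grows with both amplitude and frequency, the operator tracks fast changes in signal energy. Any features built on top of it, such as per-band or per-frame statistics, would be a further design choice that the abstract does not describe.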

Authors

Patricia Laura Maran

Peru Gabirondo

Alexandra Vlaic

Journals

Journal of Affective Disorders

Institutions

Universitat Autònoma de Barcelona

Vall d'Hebron Hospital Universitari

Centro de Investigación Biomédica en Red de Salud Mental
