Affiliation:
1. School of Computing, Mohan Babu University, India
Abstract
Traditionally, depression has been diagnosed using Patient Health Questionnaire (PHQ) scores and patient interviews; however, the accuracy of these measures is quite low. This work proposes a hybrid model that integrates textual and audio features of patient responses. Behavioral traits of depressed patients are studied using the DAIC-WOZ database. The proposed method comprises three parts: a textual CNN model trained solely on textual features; an audio CNN model trained solely on audio features; and a hybrid model that combines textual and audio features using an LSTM network. The study also uses a Bi-LSTM model, an enhanced, bidirectional variant of the LSTM. The findings indicate that deep learning is a more effective method for detecting depression: the textual CNN model achieves 92% accuracy with a loss of 0.2, while the audio CNN model achieves 98% accuracy with a loss of 0.1. These results show that the audio CNN outperforms the textual CNN as a depression detection model.