Fairness in Mobile Phone–Based Mental Health Assessment Algorithms: Exploratory Study (Preprint)

Published: 2021-10-19

Author:

Park Jinkyung, Arunachalam Ramanathan, Silenzio Vincent, Singh Vivek K

Abstract

BACKGROUND

Approximately 1 in 5 American adults experience mental illness every year. Thus, mobile phone–based mental health prediction apps that use phone data and artificial intelligence techniques for mental health assessment have become increasingly important and are being rapidly developed. At the same time, multiple artificial intelligence–related technologies (eg, face recognition and search results) have recently been reported to be biased regarding age, gender, and race. This study moves this discussion to a new domain: phone-based mental health assessment algorithms. It is important to ensure that such algorithms do not contribute to gender disparities through biased predictions across gender groups.

OBJECTIVE

This research aimed to analyze the susceptibility of multiple commonly used machine learning approaches to gender bias in mobile mental health assessment and to explore the use of an algorithmic disparate impact remover (DIR) approach to reduce bias levels while maintaining high accuracy.

METHODS

First, we performed preprocessing and model training using the data set (N=55) obtained from a previous study. Accuracy levels and differences in accuracy across genders were computed using 5 different machine learning models. We selected the random forest model, which yielded the highest accuracy, for a more detailed audit and computed multiple metrics that are commonly used for fairness in the machine learning literature. Finally, we applied the DIR approach to reduce bias in the mental health assessment algorithm.
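
The gender audit described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `group_accuracy_gap` and the toy labels are hypothetical, and it computes only one of the possible fairness metrics (the absolute difference in accuracy between two groups).

```python
import numpy as np

def group_accuracy_gap(y_true, y_pred, group):
    """Return per-group accuracy and the absolute accuracy gap for two groups."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    labels = np.unique(group)
    if labels.size != 2:
        raise ValueError("expected exactly two groups")
    # Accuracy restricted to the rows that belong to each group
    acc = {
        str(g): float(np.mean(y_true[group == g] == y_pred[group == g]))
        for g in labels
    }
    a, b = acc.values()
    return acc, abs(a - b)

# Toy, made-up labels for illustration only (not the study's data)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
gender = ["f", "f", "f", "f", "m", "m", "m", "m"]

acc, gap = group_accuracy_gap(y_true, y_pred, gender)
print(acc, gap)
```

In this toy case the classifier is correct on 3 of 4 rows in one group and 2 of 4 in the other, so the gap is 0.25. A real audit would also test whether such a gap is statistically significant, for example over repeated train/test splits.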

RESULTS

The highest observed accuracy for the mental health assessment was 78.57%. Although this accuracy level raises optimism, the audit based on gender revealed that the performance of the algorithm was statistically significantly different between the male and female groups (eg, difference in accuracy across genders was 15.85%; P<.001). Similar trends were obtained for other fairness metrics. This disparity in performance decreased significantly after the application of the DIR approach, which adapts the data used for modeling (eg, the difference in accuracy across genders was 1.66%, and the reduction was statistically significant with P<.001).
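
The abstract does not give implementation details for the DIR step, so the following is only a simplified sketch of the general idea behind disparate impact removal: each feature value is replaced by the value at the same within-group quantile of a shared target distribution, so that the feature no longer separates the groups. The function name `repair_feature`, the `repair_level` parameter, and the toy data are assumptions made for illustration, not the study's code.

```python
import numpy as np

def repair_feature(x, group, repair_level=1.0):
    """Quantile-based repair of one numeric feature across groups.

    repair_level=0.0 returns the feature unchanged; repair_level=1.0 gives
    every group the same repaired distribution.
    """
    x = np.asarray(x, dtype=float)
    group = np.asarray(group)
    labels = np.unique(group)

    # Within-group quantile position of each value, in (0, 1).
    # Ties are broken by position, which is adequate for a sketch.
    q = np.empty_like(x)
    for g in labels:
        mask = group == g
        ranks = np.argsort(np.argsort(x[mask]))
        q[mask] = (ranks + 0.5) / mask.sum()

    # Target value at each quantile: the median, over groups, of the
    # groups' empirical quantile functions evaluated at q.
    target = np.median([np.quantile(x[group == g], q) for g in labels], axis=0)

    # Interpolate between the original and the fully repaired feature
    return (1.0 - repair_level) * x + repair_level * target

# Toy feature whose values differ sharply by group (made-up numbers)
x = np.array([1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0])
g = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

repaired = repair_feature(x, g, repair_level=1.0)
print(np.sort(repaired[g == "a"]), np.sort(repaired[g == "b"]))
```

After full repair both groups share the same set of feature values while the ordering of individuals within each group is preserved. A model would then be retrained on the repaired features and audited again for the accuracy gap.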

CONCLUSIONS

This study establishes the need for algorithmic auditing in phone-based mental health assessment algorithms and for the use of gender as a protected attribute when studying fairness in such settings. Such audits and remedial steps are the building blocks for the widespread adoption of fair and accurate mental health assessment algorithms in the future.

Publisher

JMIR Publications Inc.
