Affiliation:
1. Gianforte School of Computing, Montana State University, Bozeman, MT 59717, USA
Abstract
Machine learning algorithms have become common in everyday decision making, and decision-assistance systems are now ubiquitous. Consequently, research on preventing and mitigating potential bias and unfairness in the predictions made by these algorithms has increased in recent years. Most research on fairness and bias mitigation in machine learning treats each protected variable separately, but in reality, one person can belong to multiple protected categories. Hence, in this work, combining a set of protected variables into new columns that separate the data into many intersectional subcategories was examined. Because these subcategories tend to be extremely imbalanced, bias mitigation was approached as an imbalanced classification problem. Specifically, four new custom sampling methods were developed and investigated to sample these subcategories: protected-category oversampling, protected-category proportional sampling, protected-category Synthetic Minority Oversampling Technique (PC-SMOTE), and protected-category Adaptive Synthetic Sampling (PC-ADASYN). Each method modifies an existing sampling method by focusing its sampling on the new subcategories rather than on the class label. The impact of these sampling strategies was then evaluated with respect to both classification performance and fairness. Classification performance was measured using the accuracy and F1 score of trained univariate decision trees, and fairness was measured using the equalized odds difference and statistical parity. To evaluate fairness against performance, these measures were assessed as a function of decision tree depth. The results show that the proposed methods were able to identify optimal points at which fairness increased without decreasing performance, thus mitigating any potential performance–fairness tradeoff.
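To make the subgroup construction concrete, the minimal Python sketch below illustrates the simplest of the four strategies, protected-category oversampling: the protected columns are concatenated into a single intersectional subgroup label, and each minority subgroup is randomly duplicated up to the size of the largest subgroup. The function name pc_oversample, the column names, and the balance-to-maximum target are illustrative assumptions, not the authors' published implementation.

```python
import pandas as pd

def pc_oversample(df: pd.DataFrame, protected_cols, seed: int = 0) -> pd.DataFrame:
    """Illustrative protected-category oversampling (sketch, not the paper's code).

    Collapses the protected columns into one intersectional subgroup label,
    then randomly duplicates rows of every minority subgroup until each
    subgroup matches the size of the largest one.
    """
    # Combined subgroup label, e.g. "F|B" for sex=F, race=B.
    subgroup = df[protected_cols].astype(str).agg("|".join, axis=1)
    target = subgroup.value_counts().max()
    parts = []
    for _, part in df.groupby(subgroup):
        parts.append(part)
        extra = target - len(part)
        if extra > 0:
            # Resample with replacement from this subgroup to reach the target.
            parts.append(part.sample(n=extra, replace=True, random_state=seed))
    return pd.concat(parts, ignore_index=True)

# Toy example: "sex" and "race" form four intersectional subgroups.
df = pd.DataFrame({
    "sex":    ["F", "F", "M", "M", "M", "M"],
    "race":   ["B", "W", "B", "W", "W", "W"],
    "income": [1,   0,   1,   1,   0,   1],
})
balanced = pc_oversample(df, ["sex", "race"])
print(balanced.groupby(["sex", "race"]).size())  # every subgroup now has 3 rows
```

Note that, unlike conventional oversampling, the resampling target here is the intersectional subgroup rather than the class label; PC-SMOTE and PC-ADASYN would replace the row duplication with synthetic interpolation in the same subgroup-focused way.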