Affiliation:
1. University College London
2. King’s College London
Abstract
Software bias is an increasingly important operational concern for software engineers. We present a large-scale, comprehensive empirical study of 17 representative bias mitigation methods for Machine Learning (ML) classifiers, evaluated with 11 ML performance metrics (e.g., accuracy), 4 fairness metrics, and 20 types of fairness-performance tradeoff assessment, applied to 8 widely-adopted software decision tasks. This coverage is substantially broader than that of previous work on this important software property, spanning the largest number of bias mitigation methods, evaluation metrics, and fairness-performance tradeoff measures studied to date. We find that (1) the bias mitigation methods significantly decrease ML performance in 53% of the studied scenarios (ranging between 42% and 66% depending on the ML performance metric); (2) the bias mitigation methods significantly improve fairness, as measured by the 4 metrics, in 46% of all scenarios (ranging between 24% and 59% depending on the fairness metric); (3) the bias mitigation methods even lead to decreases in both fairness and ML performance in 25% of the scenarios; (4) the effectiveness of the bias mitigation methods depends on the task, the model, the choice of protected attributes, and the set of metrics used to assess fairness and ML performance; (5) no bias mitigation method achieves the best tradeoff in all scenarios. The best method we find outperforms the others in only 30% of the scenarios. Researchers and practitioners therefore need to choose the bias mitigation method best suited to their intended application scenario(s).
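To make the kind of group-fairness measurement described above concrete, the sketch below computes statistical parity difference (SPD), one commonly used fairness metric for binary classifiers: the difference between the positive-prediction rates of the unprivileged and privileged groups, with values near zero indicating parity. This is an illustrative sketch, not the paper's code; the function name and the 0/1 group encoding are assumptions.

```python
def statistical_parity_difference(preds, protected):
    """Statistical parity difference (SPD) for a binary classifier.

    preds:     0/1 predicted labels (1 = favorable outcome).
    protected: 0/1 protected-attribute values (1 = unprivileged group,
               0 = privileged group) -- an assumed encoding.

    SPD = P(pred = 1 | unprivileged) - P(pred = 1 | privileged).
    """
    unpriv = [p for p, a in zip(preds, protected) if a == 1]
    priv = [p for p, a in zip(preds, protected) if a == 0]
    rate = lambda group: sum(group) / len(group)
    return rate(unpriv) - rate(priv)

# Toy example: the unprivileged group receives the favorable label
# at rate 3/4, the privileged group at rate 1/4, so SPD = 0.5.
preds = [1, 0, 1, 1, 0, 1, 0, 0]
protected = [1, 1, 1, 1, 0, 0, 0, 0]
print(statistical_parity_difference(preds, protected))  # 0.5
```

A bias mitigation method that improves this metric moves SPD toward zero; the study's tradeoff assessments pair such fairness changes with ML performance changes (e.g., accuracy) on the same scenario.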
Funder
EPIC: Evolutionary Program Improvement Collaborators
UKRI Trustworthy Autonomous Systems Node in Verifiability
Publisher
Association for Computing Machinery (ACM)
References: 82 articles.
Cited by
26 articles.