Affiliation:
1. Department of Information Systems and Management Engineering,
Southern University of Science and Technology, Shenzhen, Guangdong, China.
2. Department of Decisions, Operations and Technology,
The Chinese University of Hong Kong, Hong Kong, China.
Abstract
In this study, we evaluate the “propose-review” framework for the mitigation of bias in machine classification. The framework considers Bob, who aims to protect sensitive dimensions from discrimination, and Alice, who sends proposals to Bob for using his data to construct a target classifier. The goal is to minimize discrimination in Bob’s protected dimension while preserving the desired separating capability of Alice’s classifier. The method does not assume predefined bias terms, does not anchor on specific fairness metrics, and is independent of Alice’s classifier choice. We consider that data attributes have different concentrations of the latent bias axes; assessing attributes’ concentrations in the ruled bias hyperspace helps identify bias-prone attributes and inform bias-mitigating data transforms. To this end, we assess attributes’ contribution to the separating capability of Bob’s conceptual classifier. We then compute the pairwise distances between attributes, and by applying multidimensional scaling to the distance matrix, we infer the axes of bias and establish a bias-attribute mapping. Bias mitigation is achieved by greedily applying appropriate data transforms to bias-prone attributes. The method works desirably across 21 classifiers and 7 datasets, bringing about substantial bias reduction under different choices of the protected dimension and the fairness metric. Compared to adversarial debiasing, the method better exploits the fairness-utility trade-off in machine classification.
Publisher
American Association for the Advancement of Science (AAAS)