A Gibbs Posterior Framework for Fair Clustering

Published: 2024-01-11 Issue: 1 Volume: 26 Page: 63
ISSN: 1099-4300
Container-title: Entropy
Language: en

Author:

Chakraborty Abhisek¹, Bhattacharya Anirban¹, Pati Debdeep¹

Affiliation:

1. Department of Statistics, Texas A&M University, College Station, TX 77843, USA

Abstract

The rise of machine learning-driven decision-making has sparked a growing emphasis on algorithmic fairness. Within the realm of clustering, the notion of balance is used as a criterion for attaining fairness: a clustering mechanism is characterized as fair when the resulting clusters maintain a consistent proportion of observations representing individuals from distinct groups delineated by protected attributes. Building on this idea, the literature has rapidly incorporated a myriad of extensions, devising fair versions of the existing frequentist clustering algorithms, e.g., k-means, k-medoids, etc., that aim at minimizing specific loss functions. These approaches lack uncertainty quantification associated with the optimal clustering configuration and only provide clustering boundaries without quantifying the probabilities associated with each observation belonging to the different clusters. In this article, we intend to offer a novel probabilistic formulation of the fair clustering problem that facilitates valid uncertainty quantification even under mild model misspecification, without incurring substantial computational overhead. Mixture model-based fair clustering frameworks facilitate automatic uncertainty quantification, but tend to be brittle under model misspecification and involve significant computational challenges. To circumvent such issues, we propose a generalized Bayesian fair clustering framework that inherently enjoys a decision-theoretic interpretation. Moreover, we devise efficient computational algorithms that crucially leverage techniques from the existing literature on optimal transport and clustering based on loss functions. The gain from the proposed technology is showcased via numerical experiments and real data examples.
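
To make the notion of balance concrete, the following is a minimal sketch in Python under stated assumptions; it is not the authors' implementation. It assumes exactly two protected groups and uses one common formalization, in which the balance of a cluster is the smaller of the two group-count ratios and the balance of a clustering is the minimum over its clusters. The function name `cluster_balance` and the toy labels are hypothetical, chosen only for illustration.

```python
from collections import Counter


def cluster_balance(labels, groups):
    """Balance of a clustering with two protected groups (illustrative sketch).

    labels: cluster assignment of each observation
    groups: protected-group membership of each observation
    Returns the minimum over clusters of min(a/b, b/a), where a and b are
    the counts of the two groups in that cluster. A value of 1.0 means every
    cluster is perfectly mixed; 0.0 means some cluster has only one group.
    """
    group_names = sorted(set(groups))
    if len(group_names) != 2:
        raise ValueError("this sketch assumes exactly two protected groups")
    g0, g1 = group_names

    # Count group membership inside each cluster.
    counts = {}
    for label, group in zip(labels, groups):
        counts.setdefault(label, Counter())[group] += 1

    balances = []
    for c in counts.values():
        a, b = c[g0], c[g1]
        # A cluster missing one group entirely has balance 0.
        balances.append(0.0 if a == 0 or b == 0 else min(a / b, b / a))
    return min(balances)


# Toy data (hypothetical): cluster 0 holds two of each group,
# cluster 1 holds three of group "a" and one of group "b".
labels = [0, 0, 0, 0, 1, 1, 1, 1]
groups = ["a", "b", "a", "b", "a", "a", "a", "b"]
balance = cluster_balance(labels, groups)
```

In this toy example the second cluster is the less balanced one, so it determines the overall value. A fair clustering method in the sense described above would favor assignments that keep this quantity high while still fitting the data well.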

Publisher

MDPI AG

Link

https://www.mdpi.com/1099-4300/26/1/63/pdf
