We propose a novel method for enforcing AI fairness with respect to protected or sensitive factors. The method uses a dual strategy performing training and representation alteration (TARA) to mitigate prominent causes of AI bias. It combines two components: representation alteration via adversarial independence, which suppresses the bias-inducing dependence of the learned representation on protected factors, and training-set alteration via intelligent augmentation, which addresses bias-causing data imbalance using generative models that give fine control over sensitive factors of underrepresented populations through domain adaptation and latent-space manipulation. In experiments on image analytics, TARA significantly or fully debiases baseline models while outperforming competing debiasing methods that use the same amount of information; for example, with (% overall accuracy, % accuracy gap) = (78.8, 0.5) versus the baseline method's (71.8, 10.5) for Eye-PACS, and (73.7, 11.8) versus (69.1, 21.7) for CelebA. Furthermore, recognizing certain limitations in current metrics used for assessing debiasing performance, we propose novel conjunctive debiasing metrics. Our experiments also demonstrate the ability of these novel metrics to assess the Pareto efficiency of the proposed methods.
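The abstract does not specify TARA's architecture, so the following is only a minimal sketch of the adversarial-independence idea it names, using the standard gradient-reversal formulation: an adversary head tries to recover the protected factor from the representation, while reversed gradients push the encoder to suppress that information. All layer sizes and the `DebiasedClassifier` name are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DebiasedClassifier(nn.Module):
    """Hypothetical encoder with a task head and an adversary head."""
    def __init__(self, in_dim, rep_dim, n_classes, n_protected, lam=1.0):
        super().__init__()
        self.lam = lam
        self.encoder = nn.Sequential(nn.Linear(in_dim, rep_dim), nn.ReLU())
        self.task_head = nn.Linear(rep_dim, n_classes)    # predicts the task label
        self.adv_head = nn.Linear(rep_dim, n_protected)   # tries to recover the protected factor

    def forward(self, x):
        z = self.encoder(x)
        y_logits = self.task_head(z)
        # Gradient reversal: the adversary minimizes its own loss, while the
        # reversed gradient drives the encoder toward independence from the
        # protected factor.
        a_logits = self.adv_head(GradReverse.apply(z, self.lam))
        return y_logits, a_logits

# One training step on dummy tensors (x = inputs, y = labels, a = protected factor):
model = DebiasedClassifier(in_dim=128, rep_dim=64, n_classes=2, n_protected=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()
x, y, a = torch.randn(32, 128), torch.randint(0, 2, (32,)), torch.randint(0, 2, (32,))
y_logits, a_logits = model(x)
loss = ce(y_logits, y) + ce(a_logits, a)
opt.zero_grad(); loss.backward(); opt.step()
```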
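For the second component, the abstract describes generative augmentation with fine control over sensitive factors via latent-space manipulation. A common way to realize this is to shift latents along a direction associated with the sensitive attribute before decoding; the sketch below assumes some pretrained `generator` and a precomputed `attr_direction` (e.g. from a linear probe), both of which are stand-ins rather than TARA's actual pipeline.

```python
import torch

def augment_underrepresented(generator, z_batch, attr_direction, alphas):
    """Synthesize augmentation samples by shifting latents along a direction
    tied to a sensitive factor, controlling how strongly it is expressed."""
    samples = []
    for alpha in alphas:
        # Move each latent along the attribute direction by `alpha`.
        samples.append(generator(z_batch + alpha * attr_direction))
    return torch.cat(samples, dim=0)

# Toy usage with a stand-in decoder (a real setup would use a trained GAN or VAE):
generator = torch.nn.Sequential(torch.nn.Linear(64, 3 * 32 * 32))
z = torch.randn(8, 64)
direction = torch.nn.functional.normalize(torch.randn(64), dim=0)
fake_images = augment_underrepresented(generator, z, direction, alphas=[-2.0, 0.0, 2.0])
print(fake_images.shape)  # torch.Size([24, 3072])
```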
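The reported (% overall accuracy, % accuracy gap) pairs can be computed as below. Note this reproduces only the two reported quantities; the paper's conjunctive metrics may combine them differently, so treat this as an illustrative sketch.

```python
import numpy as np

def accuracy_and_gap(y_true, y_pred, protected):
    """Return overall accuracy (%) and the max accuracy gap (%) across
    protected subgroups, mirroring the pairs quoted in the abstract."""
    y_true, y_pred, protected = map(np.asarray, (y_true, y_pred, protected))
    overall = 100.0 * np.mean(y_true == y_pred)
    group_acc = [100.0 * np.mean(y_true[protected == g] == y_pred[protected == g])
                 for g in np.unique(protected)]
    gap = max(group_acc) - min(group_acc)
    return overall, gap

# A well-debiased model should score high on the first number AND low on the
# second, e.g. (78.8, 0.5) rather than the baseline's (71.8, 10.5) on Eye-PACS.
```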
DOI: http://dx.doi.org/10.1162/neco_a_01468