The harmful impacts of algorithmic decision systems have recently come into focus, with many examples of machine learning (ML) models amplifying societal biases. In this paper, we propose adapting income inequality metrics from economics to complement existing model-level fairness metrics, which focus on intergroup differences in model performance. In particular, we evaluate their ability to measure disparities in the exposure that individuals receive in a production recommendation system, the Twitter algorithmic timeline. We define desirable criteria for metrics to be used in an operational setting by ML practitioners. We characterize engagements with content on Twitter using these metrics and use the results to evaluate the metrics against our criteria. We also show that these metrics can identify content suggestion algorithms that contribute more strongly to skewed outcomes between users. Overall, we conclude that these metrics can be a useful tool for auditing algorithms in production settings.
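To make the core idea concrete: a canonical income inequality metric from economics is the Gini coefficient, which scores a distribution from 0 (everyone receives the same amount) to nearly 1 (almost everything is concentrated on a single individual). The abstract does not name the specific metrics used, so the sketch below is a minimal Python illustration of the general approach rather than the paper's implementation, and the per-user exposure counts are invented for the example.

```python
import numpy as np

def gini(values) -> float:
    """Gini coefficient of a non-negative 1-D array of exposures.

    0.0 means perfect equality (every user receives the same exposure);
    values approaching 1.0 mean exposure is concentrated on few users.
    """
    v = np.sort(np.asarray(values, dtype=float))
    n = v.size
    if n == 0 or v.sum() == 0:
        return 0.0
    # Standard closed form over the ascending-sorted values v_1..v_n:
    # G = (2 * sum_i i * v_i) / (n * sum_i v_i) - (n + 1) / n
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * v) / (n * v.sum()) - (n + 1) / n)

# Hypothetical per-user impression counts under two candidate
# content suggestion algorithms (made-up numbers for illustration).
exposures_a = np.array([120, 115, 130, 110, 125])  # fairly even spread
exposures_b = np.array([5, 8, 10, 12, 565])        # concentrated on one user

print(f"Gini(A) = {gini(exposures_a):.3f}")  # ~0.03, near-equal exposure
print(f"Gini(B) = {gini(exposures_b):.3f}")  # ~0.75, highly skewed exposure
```

Comparing such scores across candidate ranking algorithms is one way an auditor could flag, as the abstract describes, which algorithms contribute more strongly to skewed outcomes between users.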
Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9403369 (PMC)
DOI: http://dx.doi.org/10.1016/j.patter.2022.100568