Unveiling the Blindspots: Examining Availability and Usage of Protected Attributes in Fairness Datasets
Abstract
This work examines the representation of protected attributes across tabular datasets used in algorithmic fairness research. Drawing from international human rights and anti-discrimination laws, we compile a set of protected attributes and investigate both their availability and usage in the literature. Our analysis reveals a significant underrepresentation of certain attributes in datasets that is exacerbated by a strong focus on race and sex in dataset usage. We identify a geographical bias towards the Global North, particularly North America, potentially limiting the applicability of fairness detection and mitigation strategies in less-represented regions. The study exposes critical blindspots in fairness research, highlighting the need for a more inclusive and representative approach to data collection and usage in the field. We propose a shift away from a narrow focus on a small number of datasets and advocate for initiatives aimed at sourcing more diverse and representative data.
inproceedings SFK24a
EWAF 2024
3rd European Workshop on Algorithmic Fairness. Mainz, Germany, Jul 01-03, 2024.
Authors
J. Simson • A. Fabris • C. Kern
BibTeXKey: SFK24a