
Can Generative AI-Based Data Balancing Mitigate Unfairness Issues in Machine Learning?


Abstract

Data imbalance in protected attributes can lead to machine learning models that perform better on the majority group than on the minority group, giving rise to unfairness issues. While baseline methods like undersampling or SMOTE can balance datasets, we investigate how generative artificial intelligence methods compare with respect to classical fairness metrics. Using synthetically generated data, we propose different balancing methods and investigate the behavior of classification models in thorough benchmark studies using the German credit and Berkeley admission data. While our experiments suggest that such methods may improve fairness metrics, further investigation is necessary to derive clear practical recommendations.
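
As a rough illustration of the baseline idea mentioned in the abstract, the sketch below balances a toy dataset by random undersampling on the protected attribute and computes statistical parity difference, one commonly used classical fairness metric. This is not the authors' code: the abstract does not name the specific metrics or implementations used, and the function names, the `group` field, and the toy rows are all hypothetical.

```python
import random
from collections import defaultdict


def undersample_by_group(rows, group_key, seed=0):
    """Randomly drop rows so every protected group keeps as many rows
    as the smallest group (plain random undersampling)."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)
    target = min(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(rng.sample(members, target))
    return balanced


def statistical_parity_difference(predictions, groups, privileged):
    """P(pred = 1 | unprivileged) - P(pred = 1 | privileged).
    A value of 0 means both groups receive positive predictions
    at the same rate."""
    priv = [p for p, g in zip(predictions, groups) if g == privileged]
    unpriv = [p for p, g in zip(predictions, groups) if g != privileged]
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)


# Toy data (hypothetical, not from the paper): 6 majority rows, 2 minority rows.
rows = [{"group": "a", "label": i % 2} for i in range(6)]
rows += [{"group": "b", "label": 1} for _ in range(2)]

balanced = undersample_by_group(rows, "group")
group_sizes = defaultdict(int)
for row in balanced:
    group_sizes[row["group"]] += 1

# Hypothetical model predictions for four individuals, two per group.
spd = statistical_parity_difference([1, 0, 1, 1], ["a", "a", "b", "b"], "a")
```

Undersampling discards majority-group rows, whereas SMOTE or a generative model would instead add synthetic minority-group rows; the fairness metric is computed in the same way in either case.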

inproceedings


EWAF 2024

3rd European Workshop on Algorithmic Fairness. Mainz, Germany, Jul 01-03, 2024.

Authors

B. Ronval • S. Nijssen • L. Bothmann


Research Area

 A1 | Statistical Foundations & Explainability

BibTeXKey: RNB24
