Extensive social science research has long documented how individual- and location-level factors influence refugee integration. A more recent, smaller line of computational research has introduced algorithmic tools that predict refugees' integration outcomes across potential resettlement locations to inform placement decisions. However, these tools, currently being piloted in several countries, raise major concerns. They rely on a narrow set of predictors, many of which are considered protected attributes under anti-discrimination law, while omitting significant findings from explanatory migration research that could improve model predictions and reliability. Against this background, we draw on a systematic review of empirical migration research and comprehensive refugee panel data from Germany to improve the algorithmic modeling of refugees' integration outcomes. Specifically, we develop models that integrate and test a wide range of migration research variables to predict the economic integration of refugees who arrived in Germany in 2016 and 2017. We compare our extended models to existing baselines from the algorithmic matching literature, evaluating both classification performance and fairness. Our results demonstrate that replacing proxy features with theory-driven variables that could be surveyed at arrival can considerably improve both accuracy and fairness without adversely impacting downstream allocation performance. We conclude that integrating insights from empirical migration research is essential for developing more reliable and robust algorithmic matching tools.
inproceedings
BibTeXKey: SNK25a