Dr. Susanne Dandl
This chapter examines interpretable machine learning, focusing on methods and tools that enhance the transparency and interpretability of machine learning models in the context of official statistics. We explore post hoc interpretation methods that provide insights into model behavior, such as loss-based feature importance methods, and give guidance on how to choose among them. Additionally, we discuss counterfactual and semi-factual explanations (SFEs), describing interpretable regional descriptors as an example of a method that generates SFEs. We also highlight innovative contributions such as the R package mlr3summary, which facilitates model diagnostics and performance evaluation for black-box machine learning models through resampling strategies; the package provides a unified, model-agnostic summary of machine learning models that aids model selection and comparison. Finally, we discuss the importance of ethical considerations and fairness in machine learning applications, emphasizing the need for methods that can address historical biases in data. The objective of this chapter is to provide practitioners and researchers with the practical tools and comprehensive knowledge required to implement interpretable and fair machine learning models in a variety of domains, especially official statistics.
BibTeXKey: DBB25 (article)