Machine learning (ML) models are often based on complex black-box architectures that are difficult to interpret. This interpretability problem can hinder the use of ML in fields like medicine, ecology, and insurance, and has boosted research in interpretable machine learning (IML). Here, we propose a novel approach for the functional decomposition of black-box predictions, which is a core concept of IML. This approach replaces the prediction function with a surrogate model consisting of simpler subfunctions, providing insights into the direction and strength of the main feature contributions and their interactions. Our method is based on a concept termed “stacked orthogonality”, which ensures that the main effects capture as much functional behavior as possible. To compute the subfunctions, we combine neural additive modeling with an efficient post-hoc orthogonalization procedure. Our method yielded plausible results in an analysis of stream biological condition in the Chesapeake Bay watershed (United States).
Article. BibTeX key: KRB+25.
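The decomposition described in the abstract has the familiar form of a functional decomposition of a prediction function f into an intercept, main effects, and interaction terms,

    f(x) \approx g_0 + \sum_{j} g_j(x_j) + \sum_{j<k} g_{jk}(x_j, x_k),

where the "stacked orthogonality" constraint is meant to push as much of the functional behavior as possible into the low-order terms. The sketch below is only an illustration of the additive-surrogate idea under simplified assumptions: it fits one small subnetwork per feature (a neural additive model with main effects only, no interaction terms) and applies a simple post-hoc mean-centering step so each main effect is orthogonal to the intercept. It is not the paper's stacked-orthogonality procedure, and the names FeatureNet, AdditiveSurrogate, and center_main_effects are hypothetical.

# Illustrative sketch only: a minimal neural additive surrogate whose
# per-feature subnetworks play the role of the "simpler subfunctions"
# described in the abstract. The centering step is a simplified stand-in
# for the paper's orthogonalization, not the stacked-orthogonality method.
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    """Small MLP modelling one main effect g_j(x_j)."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, xj):  # xj: (n, 1)
        return self.net(xj)

class AdditiveSurrogate(nn.Module):
    """Surrogate f(x) ~ g_0 + sum_j g_j(x_j), fitted to black-box outputs."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.effects = nn.ModuleList([FeatureNet(hidden) for _ in range(n_features)])
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):  # x: (n, p)
        parts = [f(x[:, [j]]) for j, f in enumerate(self.effects)]
        return self.bias + torch.stack(parts, dim=0).sum(dim=0)

def center_main_effects(model, x):
    """Post-hoc step: shift each g_j to have zero mean on the data and move
    the removed means into the intercept, so the main effects are orthogonal
    to the constant term (a much weaker condition than stacked orthogonality)."""
    with torch.no_grad():
        for j, f in enumerate(model.effects):
            mean_j = f(x[:, [j]]).mean()
            f.net[-1].bias -= mean_j
            model.bias += mean_j

# Usage (hypothetical): fit the surrogate to black-box predictions yhat,
# e.g. by minimizing ((surrogate(x) - yhat) ** 2).mean() with any optimizer,
# then call center_main_effects(surrogate, x) before plotting the effect curves.

Fitting such a surrogate to the black-box predictions yields main-effect curves that can be inspected directly; the paper's method additionally models interaction terms and uses an efficient post-hoc orthogonalization so that the main effects absorb as much of the functional behavior as possible.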