
Probabilistic Framework for Robustness of Counterfactual Explanations Under Data Shifts


Abstract

Counterfactual explanations (CEs) are a powerful method for interpreting machine learning models, but CEs can become invalid when the model is updated due to distribution shifts in the underlying data. Existing approaches to robust CEs often impose explicit bounds on model parameters to ensure stability, but such bounds can be difficult to estimate and overly restrictive in practice. In this work, we propose a data shift-driven probabilistic framework for robust counterfactual explanations, with plausible data shifts modeled via a Wasserstein ball. We formalize a linearized Wasserstein perturbation scheme that captures realistic distributional changes and enables Monte Carlo estimation of CE robustness probabilities with domain-specific data shift tolerances. Theoretical analysis reveals that our framework is equivalent in spirit to model parameter bounding approaches but offers greater flexibility and avoids the need to estimate maximal model parameter shifts. Experiments on real-world datasets demonstrate that the proposed method maintains high robustness of CEs under plausible distribution shifts, outperforming conventional parameter-bounding techniques in both validity and proximity costs.
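
To make the estimation idea concrete, below is a minimal sketch (not the authors' implementation) of Monte Carlo estimation of a CE's robustness probability: the training data is repeatedly perturbed, a model is refit on each perturbed sample, and the fraction of refits under which the counterfactual keeps its target label is recorded. The Gaussian perturbation with scale `shift_scale`, the synthetic data, and the helper `robustness_probability` are illustrative assumptions standing in for the paper's linearized Wasserstein-ball shift model.

```python
# Minimal sketch (illustrative assumptions, not the paper's method):
# Monte Carlo estimate of the probability that a counterfactual explanation
# stays valid when the model is refit on plausibly shifted data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic training data (illustrative only).
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Hypothetical counterfactual point, assumed to achieve the target class 1.
x_cf = np.array([[0.8, 0.6]])
target_class = 1


def robustness_probability(X, y, x_cf, target_class,
                           shift_scale=0.1, n_samples=200):
    """Fraction of sampled data shifts under which the CE keeps its target label."""
    valid = 0
    for _ in range(n_samples):
        # Gaussian perturbation of the training data: a simple proxy for a
        # plausible shift with bounded transport cost (Wasserstein-ball stand-in).
        X_shift = X + rng.normal(scale=shift_scale, size=X.shape)
        shifted_model = LogisticRegression().fit(X_shift, y)
        valid += int(shifted_model.predict(x_cf)[0] == target_class)
    return valid / n_samples


print(f"Estimated robustness probability: "
      f"{robustness_probability(X, y, x_cf, target_class):.2f}")
```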

inproceedings ZKC+25


ReliableML @NeurIPS 2025

Workshop on Reliable ML from Unreliable Data at the 39th Conference on Neural Information Processing Systems. San Diego, CA, USA, Nov 30-Dec 07, 2025. To be published. Preprint available.

Authors

X. Zhao • L. Krieger • Z. Cao • A. Bangun • H. Scharr • I. Assent

Research Area

A3 | Computational Models

BibTeXKey: ZKC+25
