This study investigates the evolving dynamics of commonly used feature attribution (FA) values during the training of neural networks. As models transition from high to low uncertainty, we show that the significance assigned to input features also changes, in line with the general learning theory of deep neural networks. During model training, we compute FA scores with Layer-wise Relevance Propagation (LRP) and Gradient-weighted Class Activation Mapping (Grad-CAM), chosen for their computational efficiency. We summarize the attribution scores by the sum of their absolute values and by their entropy, and analyze these summary scores in relation to the models’ generalization capabilities. The analysis identifies a consistent trend: FA magnitudes increase while their entropy decreases over the course of training, regardless of whether the model generalizes, suggesting that this behavior is independent of overfitting. This research offers a distinctive view of the application of FA methods in explainable artificial intelligence (XAI) and raises intriguing questions about their behavior across varying model architectures and datasets, with implications for future work combining XAI and uncertainty estimation in machine learning.
inproceedings
BibTeXKey: TMH+23
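The abstract's two summary statistics can be illustrated with a short sketch. The snippet below is a minimal, hypothetical implementation, not the paper's code: it assumes the absolute attributions are normalized into a probability distribution before Shannon entropy is taken, and the function name `fa_summary` is illustrative.

```python
import numpy as np

def fa_summary(attr_map: np.ndarray, eps: float = 1e-12):
    """Summarize a feature-attribution map (e.g., from LRP or Grad-CAM)
    by (a) the sum of absolute attribution values and (b) the Shannon
    entropy of the normalized absolute attributions.

    Note: the exact normalization used in the paper is an assumption;
    here |a_i| / sum_j |a_j| is treated as a probability distribution.
    """
    abs_scores = np.abs(attr_map).ravel()
    total = abs_scores.sum()                    # sum of |FA| scores
    p = abs_scores / (total + eps)              # normalize to a distribution
    entropy = -np.sum(p * np.log(p + eps))      # Shannon entropy in nats
    return total, entropy

# Example: a dummy 7x7 Grad-CAM-style map at one training checkpoint
rng = np.random.default_rng(0)
cam = rng.random((7, 7))
total, entropy = fa_summary(cam)
print(f"sum|FA| = {total:.3f}, entropy = {entropy:.3f} nats")
```

Tracking these two numbers at successive checkpoints would reproduce the kind of training-time trajectory the abstract describes: rising attribution magnitude alongside falling entropy.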