March 29, 2026 | Open Access

Explainable AI under scrutiny: from empirical investigations to analytical insights

Key Points

The aim is to define and clarify explainability in machine learning, particularly focusing on feature importance.
Systematic analysis of explainability definitions and goals.
Dichotomization of feature importance into model-centric and data-centric importance.
Development of benchmarks for assessing explanation correctness based on statistical relationships between features and the prediction target.
Demonstration of the shortcomings of popular XAI methods in the presence of suppressor variables.
Introduction of the Statistical Association Property (SAP) for feature attribution methods.
Evidence that the explanation performance of complex ML models improves with more fine-tuned layers and with pre-training on in-domain datasets.

Abstract

Machine learning (ML) systems have become essential in numerous applications, ranging from the natural sciences to high-risk medical applications. As ML models grow increasingly complex, their decisions have become more opaque to humans, calling for transparency and human-understandability. This has led to the rise of Explainable Artificial Intelligence (XAI) research, which has put forward numerous approaches to these issues. However, the XAI research field lacks clear definitions of the problems it aims to solve, leading to ambiguity about the reliability and correctness of explanation methods, thus limiting their utility. The notion of correctness itself lacks clarity and hinges strongly on the explanation goals set out by researchers. In this work, we carry out a systematic analysis of how to define explainability, where the statistical relationship between features and the prediction target determines our primary explanation goal, which can be viewed as one of multiple equitable goals of explainability. In particular, the notion of feature importance forms the core of our analysis, and questions arise about how to define it while incorporating multiple views and how to reliably and consistently quantify its correctness. In this context, we discuss the shortcomings of widely adopted methods based on the research community’s primary goals: scientific discovery, algorithmic recourse, and validation of models and datasets. Suppressor variables, features that increase the prediction performance of ML models without being statistically relevant to the prediction target, constitute a key obstacle to contemporary XAI methods, as they challenge numerous existing goals and purposes of XAI. We utilize this concept to provide a counterexample to analyze the shortcomings of these definitions analytically.
By analyzing causal graphs, in particular colliders, we show that the misalignment between model weights and feature-target associations is a structural consequence of the data generation process. With this insight, we propose a dichotomization of feature importance into model-centric and data-centric importance, leading us to derive a data-driven definition of feature importance. We then leverage this definition to formalize a synthetic ground truth for feature importance, proposing a benchmark for the data-centric correctness assessment of explanation methods based on a linearly solvable classification problem. We refine this benchmark approach by proposing two application-grounded benchmarks and corresponding datasets for natural language processing and magnetic resonance imaging. Consequently, we define the Statistical Association Property (SAP) for feature attribution methods yielding high importance scores for features that have a genuine statistical relationship with the prediction target. In our theoretical analysis, paired with empirical analysis on synthetic data, we demonstrate that widely adopted XAI methods often fail to fulfill the SAP. Thus, XAI methods are unable to distinguish suppressor variables from other features, which can lead to misinterpretations. As a result, it is essential to incorporate the distributional dependency structure of a given prediction problem into the corresponding explanation task. In application-grounded studies, we found that the explanation performance of complex ML models, which are often used in transfer learning settings, improves with an increased number of fine-tuned layers and when ML models are pre-trained on in-domain datasets.
Starting from the ill-defined nature of XAI, we highlight the weaknesses of widely adopted XAI methods and propose the SAP, a novel necessary condition for feature importance that, unlike faithfulness, allows addressing explanation goals such as scientific discovery and model and data debugging from a data-centric point of view.
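
The suppressor-variable effect described in the abstract can be illustrated with a small toy construction. The sketch below is not taken from the study: it assumes a two-feature linear regression setting (rather than the classification benchmark mentioned above), and the function name `suppressor_demo` and all variable names are hypothetical. One feature carries the target signal mixed with noise, while the second feature contains only that noise. The fitted model relies on the second feature to cancel the noise, even though that feature has no statistical association with the target.

```python
import numpy as np


def suppressor_demo(n_samples=10_000, seed=0):
    """Toy suppressor example (illustrative assumption, not the study's benchmark)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)  # signal; also used as the prediction target
    d = rng.standard_normal(n_samples)  # distractor noise, independent of z

    x1 = z + d  # informative feature, contaminated by the distractor
    x2 = d      # suppressor: contains only the distractor
    X = np.column_stack([x1, x2])
    y = z

    # Least-squares weights of a linear model y ~ X @ w (model-centric view).
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Pearson correlation of each feature with the target (data-centric view).
    corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return weights, corr


weights, corr = suppressor_demo()
print("model weights:      ", weights)
print("feature-target corr:", corr)
```

In this construction the target equals `x1 - x2` exactly, so the model assigns the suppressor a weight of the same magnitude as the informative feature, while its correlation with the target is close to zero. An attribution method that mirrors the model weights would therefore rate the suppressor as important, which is the kind of misalignment between model-centric and data-centric importance that the abstract refers to.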

Authors

Rick Wilming
