Publication

X-Hacking: The Threat of Misguided AutoML

Authors: Rahul Sharma, Sumantrak Mukherjee, Andrea Sipka, Eyke Hüllermeier, Sebastian Vollmer, Sergey Redyuk, David Antony Selby

Published in: Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), July 13–19, 2025, Vancouver, BC, Canada. https://icml.cc/virtual/2025/poster/46106

Abstract

Explainable AI (XAI) and interpretable machine learning methods help to build trust in model predictions and derived insights, yet also present a perverse incentive for analysts to manipulate XAI metrics to support pre-specified conclusions. This paper introduces the concept of X-hacking, a form of p-hacking applied to XAI metrics such as SHAP values. We show how easily an automated machine learning pipeline can be adapted to exploit model multiplicity at scale: searching a set of ‘defensible’ models with similar predictive performance to find a desired explanation. We formulate the trade-off between explanation and accuracy as a multi-objective optimisation problem, and illustrate empirically on familiar real-world datasets that, on average, Bayesian optimisation accelerates X-hacking 3-fold relative to random sampling for features susceptible to it. We show that a dataset's vulnerability to X-hacking can be determined by the information redundancy among its features. Finally, we suggest possible methods for detection and prevention, and discuss ethical implications for the credibility and reproducibility of XAI.
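To make the idea concrete, below is a minimal sketch of the kind of search the abstract describes: train many models, keep the 'defensible' ones whose accuracy falls within a tolerance of the best, then report the one whose explanation best supports a pre-specified conclusion. This is an illustration, not the authors' implementation; it assumes a scikit-learn setting, uses permutation importance as a stand-in for the SHAP values the paper targets, and samples hyperparameters randomly rather than with Bayesian optimisation. The feature index, tolerance, and search space are all hypothetical.

```python
# Illustrative X-hacking loop: search "defensible" models (accuracy within a
# tolerance of the best found) for the one that most downplays a target feature.
# Permutation importance stands in for SHAP; all settings are hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

target_feature = 3   # feature whose importance the analyst wants to minimise
tolerance = 0.02     # accuracy gap defining the "defensible" set

candidates = []
for seed in range(30):  # random sampling over a small hyperparameter space
    rng = np.random.RandomState(seed)
    model = RandomForestClassifier(
        n_estimators=int(rng.choice([50, 100, 200])),
        max_depth=[3, 5, None][rng.randint(3)],
        random_state=seed,
    ).fit(X_tr, y_tr)
    acc = model.score(X_te, y_te)
    imp = permutation_importance(model, X_te, y_te, n_repeats=5, random_state=0)
    candidates.append((acc, imp.importances_mean[target_feature], model))

best_acc = max(acc for acc, _, _ in candidates)
defensible = [c for c in candidates if c[0] >= best_acc - tolerance]
# The "X-hacked" pick: similar accuracy, most convenient explanation.
acc, importance, chosen = min(defensible, key=lambda c: c[1])
print(f"accuracy={acc:.3f}, target-feature importance={importance:.3f}")
```

Replacing the random loop with a multi-objective Bayesian optimiser over (accuracy, explanation) is what the paper reports as accelerating this search roughly 3-fold for susceptible features.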
