Aggressively optimizing validation statistics can degrade interpretability of data-driven materials models.

Katherine Lei, Howie Joress, Nils Persson, Jason R Hattrick-Simpers, Brian DeCost

J Chem Phys

Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899, USA.

Published: August 2021

One of the key factors in enabling trust in artificial intelligence within the materials science community is the interpretability (or explainability) of the underlying models. By understanding which features were used to generate predictions, scientists can critically evaluate the credibility of those predictions and gain new insights. Here, we demonstrate that ignoring hyperparameters viewed as less impactful to overall model performance can degrade model explainability. Specifically, we demonstrate that random forest models trained with unconstrained maximum depth, in accordance with accepted best practices, can often report a randomly generated feature as one of the most important features when classifying an alloy as a high entropy alloy. We demonstrate that this is the case for impurity, permutation, and Shapley importance rankings, and that the latter two show no strong structure in terms of optimal hyperparameters. Furthermore, we demonstrate that, for impurity importance rankings, optimizing only the validation accuracy, as is also considered standard in the random forest community, yields models that prefer the random feature in generating their predictions. We show that by adopting a Pareto optimization strategy for model performance that balances validation statistics against the differences between the training and validation statistics, one obtains models that reject random features and thus balance predictive power and explainability.
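The Pareto selection idea described in the abstract can be illustrated with a minimal sketch. It assumes two objectives, maximizing validation accuracy and minimizing the gap between training and validation accuracy; the abstract does not give the exact objectives or selection rule, so this is one plausible reading rather than the authors' procedure. The `Candidate` type, the `dominates` and `pareto_front` helpers, and every number in `candidates` are hypothetical and invented for illustration; they are not results from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One hyperparameter setting with its scores (hypothetical container)."""
    max_depth: int
    train_acc: float
    val_acc: float

    @property
    def gap(self) -> float:
        # Difference between training and validation accuracy.
        return abs(self.train_acc - self.val_acc)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.val_acc >= b.val_acc and a.gap <= b.gap
    strictly_better = a.val_acc > b.val_acc or a.gap < b.gap
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Invented scores for illustration only (NOT data from the paper).
candidates = [
    Candidate(max_depth=2, train_acc=0.80, val_acc=0.78),
    Candidate(max_depth=4, train_acc=0.88, val_acc=0.84),
    Candidate(max_depth=8, train_acc=0.95, val_acc=0.86),
    Candidate(max_depth=16, train_acc=0.99, val_acc=0.85),
    Candidate(max_depth=32, train_acc=1.00, val_acc=0.86),
]

front = pareto_front(candidates)
print([c.max_depth for c in front])  # [2, 4, 8]
```

In this toy example the deepest model ties for the best validation accuracy, yet it is dominated because a shallower model reaches the same validation accuracy with a smaller train/validation gap. Selecting on validation accuracy alone could not tell those two models apart.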

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11165882
DOI: http://dx.doi.org/10.1063/5.0050885
