Exploring and Interacting with the Set of Good Sparse Generalized Additive Models.

Chudi Zhong, Zhi Chen, Jiachang Liu, Margo Seltzer, Cynthia Rudin

Adv Neural Inf Process Syst

Published: December 2023

In real applications, interaction between machine learning models and domain experts is critical; however, the classical machine learning paradigm that usually produces only a single model does not facilitate such interaction. Approximating and exploring the Rashomon set, i.e., the set of all near-optimal models, addresses this practical challenge by providing the user with a searchable space containing a diverse set of models from which domain experts can choose. We present algorithms to efficiently and accurately approximate the Rashomon set of sparse, generalized additive models with ellipsoids for fixed support sets and use these ellipsoids to approximate Rashomon sets for many different support sets. The approximated Rashomon set serves as a cornerstone to solve practical challenges such as (1) studying the variable importance for the model class; (2) finding models under user-specified constraints (monotonicity, direct editing); and (3) investigating sudden changes in the shape functions. Experiments demonstrate the fidelity of the approximated Rashomon set and its effectiveness in solving practical challenges.
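
As a rough illustration of why an ellipsoid makes the Rashomon set searchable, the sketch below tests whether a coefficient vector lies inside an ellipsoid for one fixed support set. The parameterization (a center vector and a positive-definite matrix Q, with membership defined by (w - center)^T Q (w - center) <= 1), the function name, and the numbers are assumptions for illustration only. This is not the authors' fitting algorithm, which the abstract does not describe.

```python
def in_ellipsoid(w, center, Q):
    """Return True if (w - center)^T Q (w - center) <= 1.

    w, center: lists of floats (model coefficients on a fixed support set).
    Q: list of rows of a symmetric positive-definite matrix (hypothetical values).
    """
    d = [wi - ci for wi, ci in zip(w, center)]
    # Matrix-vector product Q d, then the inner product d . (Q d).
    Qd = [sum(qij * dj for qij, dj in zip(row, d)) for row in Q]
    return sum(di * qdi for di, qdi in zip(d, Qd)) <= 1.0


# Hypothetical ellipsoid: axis-aligned, semi-axes 0.5 and 1.0.
center = [0.5, -1.0]
Q = [[4.0, 0.0],
     [0.0, 1.0]]

inside = in_ellipsoid([0.75, -1.0], center, Q)   # quadratic form = 0.25
outside = in_ellipsoid([1.5, -1.0], center, Q)   # quadratic form = 4.0
print(inside, outside)
```

Under this assumed representation, checking whether an edited model is still near-optimal reduces to evaluating a single quadratic form, with no retraining.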

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11018320

Publication Analysis

Top Keywords: rashomon set, sparse generalized, generalized additive, additive models, machine learning, models domain, domain experts, approximate rashomon, support sets, approximated rashomon

Similar Publications

Exploring and Interacting with the Set of Good Sparse Generalized Additive Models.

Adv Neural Inf Process Syst

December 2023

Duke University.

Chudi Zhong, Zhi Chen, Jiachang Liu, Margo Seltzer, Cynthia Rudin

A Path to Simpler Models Starts With Noise.

Adv Neural Inf Process Syst

December 2023

Department of Computer Science, Duke University.

Lesia Semenova, Harry Chen, Ronald Parr, Cynthia Rudin

The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular datasets in criminal justice, healthcare, lending, education, and in other areas, which has practical implications about whether simpler models can attain the same level of accuracy as more complex models. An open question is why Rashomon ratios often tend to be large.
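
The definition of the Rashomon ratio can be made concrete with a small sketch. It assumes a finite list of candidate models summarized by their losses and an additive tolerance `epsilon` over the best loss. Both are simplifying assumptions for illustration: real hypothesis spaces are typically far larger or continuous, and the tolerance may be defined differently. The function name and the loss values are hypothetical.

```python
def rashomon_ratio(losses, epsilon):
    """Fraction of models whose loss is within epsilon of the best loss.

    losses: one loss value per model in a (finite, hypothetical) hypothesis space.
    epsilon: additive tolerance defining "approximately equally well".
    """
    best = min(losses)
    in_rashomon_set = [loss for loss in losses if loss <= best + epsilon]
    return len(in_rashomon_set) / len(losses)


# Hypothetical losses for five candidate models.
losses = [0.20, 0.21, 0.25, 0.40, 0.30]
ratio = rashomon_ratio(losses, epsilon=0.02)
print(ratio)  # two of five models (0.20 and 0.21) are within 0.02 of the best
```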

The Rashomon Importance Distribution: Getting RID of Unstable, Single Model-based Variable Importance.

ArXiv

April 2024

Jon Donnelly, Srikar Katta, Cynthia Rudin, Edward P Browne

Quantifying variable importance is essential for answering high-stakes questions in fields like genetics, public policy, and medicine. Current methods generally calculate variable importance for a given model trained on a given dataset. However, for a given dataset, there may be many models that explain the target outcome equally well; without accounting for all possible explanations, different researchers may arrive at many conflicting yet equally valid conclusions given the same data.

Exploring the Whole Rashomon Set of Sparse Decision Trees.

Adv Neural Inf Process Syst

January 2022

Duke University.

Rui Xin, Chudi Zhong, Zhi Chen, Takuya Takagi, Margo Seltzer

In any given machine learning problem, there might be many models that explain the data almost equally well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that might have desirable properties beyond what could be expressed by a loss function. The Rashomon set is the set of all these almost-optimal models.

Reliable Model Selection without Reference Values by Utilizing Model Diversity with Prediction Similarity.

J Chem Inf Model

May 2021

Department of Chemistry, Idaho State University, Pocatello, Idaho 83209, United States.

Robert C Spiers, John H Kalivas

Predictive modeling (calibration or training) with various data formats, such as near-infrared (NIR) spectra and quantitative structure-activity relationship (QSAR) data, provides essential information if a proper model is selected. Similarly, with a general model selection approach, spectral model maintenance (updating) from original modeling conditions to new conditions can be performed for dynamic modeling. Fundamental modeling (partial least-squares (PLS) and others) and maintenance processes (domain adaptation or transfer learning and others) require selection of tuning parameter(s) values to isolate models that can accurately predict new samples or molecules, e.
