Unmasking the chameleons: A benchmark for out-of-distribution detection in medical tabular data.

Int J Med Inform

Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, the Netherlands; Institute of Logic, Language and Computation, University of Amsterdam, the Netherlands; Pacmed, Amsterdam, the Netherlands.

Published: December 2024

Background: Machine Learning (ML) models often struggle to generalize effectively to data that deviates from the training distribution. This raises significant concerns about the reliability of real-world healthcare systems that encounter such inputs, known as out-of-distribution (OOD) data. These concerns can be addressed by real-time detection of OOD inputs. While numerous OOD detection approaches have been suggested in other fields, especially in computer vision, it remains unclear whether similar methods effectively address the challenges posed by medical tabular data.

Objective: To answer this important question, we propose an extensive reproducible benchmark to compare different OOD detection methods in medical tabular data across a comprehensive suite of tests.

Method: To achieve this, we leverage four large public medical datasets, including eICU and MIMIC-IV, and consider various kinds of OOD cases within these datasets. For example, we examine OOD data originating from a dataset that is statistically different from the training set according to the membership model introduced by Debray et al. [1], as well as OOD data obtained by splitting a given dataset on the value of a distinguishing variable. To identify OOD instances, we explore 10 density-based methods that learn the marginal distribution of the data, alongside 17 post-hoc detectors that are applied on top of prediction models already trained on the data. The prediction models involve three distinct architectures, namely MLP, ResNet, and Transformer.
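
Two of the ingredients described above, carving an OOD group out of a single dataset by a distinguishing variable and scoring inputs with a density model fit only on in-distribution data, can be sketched as follows. This is a minimal illustration and not the benchmark's code: the field names, the toy records, the age threshold of 70, and the independent-Gaussian density estimator are all assumptions chosen for brevity, and the estimator is a simple stand-in rather than one of the 10 density-based methods evaluated in the paper.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Record = Dict[str, float]


def split_by_variable(
    records: Sequence[Record], is_ood: Callable[[Record], bool]
) -> Tuple[List[Record], List[Record]]:
    """Partition records into (in-distribution, OOD) with a predicate on one variable."""
    in_dist = [r for r in records if not is_ood(r)]
    ood = [r for r in records if is_ood(r)]
    return in_dist, ood


def fit_gaussian(
    records: Sequence[Record], features: Sequence[str]
) -> Dict[str, Tuple[float, float]]:
    """Fit an independent Gaussian per feature (hypothetical stand-in density model)."""
    params = {}
    for f in features:
        values = [r[f] for r in records]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        params[f] = (mean, max(var, 1e-6))  # floor the variance to avoid division by zero
    return params


def ood_score(record: Record, params: Dict[str, Tuple[float, float]]) -> float:
    """Negative log-likelihood under the fitted model; higher means more OOD-like."""
    nll = 0.0
    for f, (mean, var) in params.items():
        nll += 0.5 * math.log(2 * math.pi * var) + (record[f] - mean) ** 2 / (2 * var)
    return nll


# Toy records with hypothetical fields (not drawn from eICU or MIMIC-IV).
records = [
    {"age": 34, "heart_rate": 72, "creatinine": 0.9},
    {"age": 45, "heart_rate": 80, "creatinine": 1.0},
    {"age": 52, "heart_rate": 76, "creatinine": 1.1},
    {"age": 61, "heart_rate": 84, "creatinine": 0.8},
    {"age": 78, "heart_rate": 110, "creatinine": 2.4},
    {"age": 85, "heart_rate": 118, "creatinine": 2.9},
]

# Assumed split rule: patients older than 70 are held out as the OOD group.
in_dist, ood = split_by_variable(records, lambda r: r["age"] > 70)

# The density model sees in-distribution data only; the split variable itself
# is left out of the features here, which is a choice made for this sketch.
params = fit_gaussian(in_dist, ["heart_rate", "creatinine"])
in_scores = [ood_score(r, params) for r in in_dist]
ood_scores = [ood_score(r, params) for r in ood]
```

In this toy setting the held-out group differs strongly from the rest, so its scores are clearly higher. With a subtler split the two score lists would overlap, which is the regime where detection becomes hard.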

Main Results: In our experiments, when the membership model achieved an AUC of 0.98, indicating a clear distinction between the OOD data and the training set, the OOD detection methods achieved AUC values exceeding 0.95 in distinguishing OOD data. In contrast, in experiments with subtler changes in data distribution, such as selecting OOD data based on ethnicity and age characteristics, many OOD detection methods performed similarly to a random classifier, with AUC values close to 0.5. This suggests a possible correlation between separability, as indicated by the membership model, and OOD detection performance, as indicated by the AUC of the detection model, which warrants future research.
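
To make the AUC figures concrete: an OOD detector assigns each input a score, and the AUC equals the probability that a randomly chosen OOD sample receives a higher score than a randomly chosen in-distribution sample. The sketch below computes that quantity directly from pairwise comparisons. The function name `ood_auc` and the toy scores are illustrative assumptions, not values from the paper.

```python
from typing import Sequence


def ood_auc(in_scores: Sequence[float], ood_scores: Sequence[float]) -> float:
    """AUC as the fraction of (OOD, in-distribution) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for o in ood_scores:
        for i in in_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(ood_scores) * len(in_scores))


# Clearly separated scores: every OOD sample outranks every in-distribution sample.
separated = ood_auc([0.1, 0.2, 0.3, 0.4], [0.8, 0.9, 1.0, 1.1])  # 1.0

# Interleaved scores: the detector carries no ranking information.
interleaved = ood_auc([0.1, 0.3, 0.6, 0.8], [0.2, 0.4, 0.5, 0.7])  # 0.5
```

An AUC near 1.0 corresponds to the well-separated case reported above, while an AUC near 0.5 means the detector ranks OOD and in-distribution samples no better than chance.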

Source: http://dx.doi.org/10.1016/j.ijmedinf.2024.105762

Publication Analysis

Top Keywords

ood detection: 20
ood data: 16
medical tabular: 12
ood: 12
detection methods: 12
membership model: 12
data: 10
detection: 8
tabular data: 8
training set: 8

Similar Publications

Identifying transitional states is crucial for understanding protein conformational changes that underlie numerous biological processes. Markov state models (MSMs), built from Molecular Dynamics (MD) simulations, capture these dynamics through transitions among metastable conformational states, and have demonstrated success in studying protein conformational changes. However, MSMs face challenges in identifying transition states, as they partition MD conformations into discrete metastable states (or free energy minima), lacking description of transition states located at the free energy barriers.

Towards safe and reliable deep learning for lung nodule malignancy estimation using out-of-distribution detection.

Comput Biol Med

December 2024

Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands.

Artificial Intelligence (AI) models may fail or suffer from reduced performance when applied to unseen data that differs from the training data distribution, referred to as dataset shift. Automatic detection of out-of-distribution (OOD) data contributes to safe and reliable clinical implementation of AI models. In this study, we propose a recognized OOD detection method that utilizes the Mahalanobis distance (MD) and compare its performance to widely known classical methods.

In real-world scenarios, medical image segmentation models encounter input images that may deviate from the training images in various ways. These differences can arise from changes in image scanners and acquisition protocols, or even the images can come from a different modality or domain. When the model encounters these out-of-distribution (OOD) images, it can behave unpredictably.

Article Synopsis
  • Multi-layer aggregation is essential for effective out-of-distribution (OOD) detection in deep neural networks, particularly in real-time systems where efficiency matters.
  • A novel early stopping framework with multiple OOD detectors attached to intermediate layers allows for early detection and reduces computational costs by selecting the best layer based on the complexity of the OOD.
  • Experiments show that this approach can boost OOD detection efficiency by up to 99.1% while maintaining high accuracy, making it suitable for practical applications.