Distributional bias compromises leave-one-out cross-validation.

ArXiv

Program for Mathematical Genomics, Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA.

Published: June 2024

Cross-validation is a common method for estimating the predictive performance of machine learning models. In a data-scarce regime, where one typically wishes to maximize the number of instances used for training the model, an approach called 'leave-one-out cross-validation' is often used. In this design, a separate model is built for predicting each data instance after training on all other instances. Since this results in a single test data point available per model trained, predictions are aggregated across the entire dataset to calculate common rank-based performance metrics such as the area under the receiver operating characteristic or precision-recall curves. In this work, we demonstrate that this approach creates a negative correlation between the average label of each training fold and the label of its corresponding test instance, a phenomenon that we term distributional bias. As machine learning models tend to regress to the mean of their training data, this distributional bias tends to negatively impact performance evaluation and hyperparameter optimization. We show that this effect generalizes to leave-P-out cross-validation and persists across a wide range of modeling and evaluation approaches, and that it can lead to a bias against stronger regularization. To address this, we propose a generalizable rebalanced cross-validation approach that corrects for distributional bias. We demonstrate that our approach improves cross-validation performance evaluation in synthetic simulations and in several published leave-one-out analyses.
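The negative correlation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and it does not implement their rebalanced cross-validation. It assumes binary labels and a hypothetical no-information "model" that simply predicts the mean label of its training fold; the helper names `loo_train_means` and `auroc` are made up for this illustration.

```python
import numpy as np

def loo_train_means(y):
    """Mean training-fold label for each left-out instance (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    # Removing instance i from the data leaves (sum - y_i) spread over n - 1 labels.
    return (y.sum() - y) / (y.size - 1)

def auroc(y_true, scores):
    """AUROC via pairwise positive/negative comparisons; ties count as 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

# Toy dataset (an assumption for illustration): 25 negatives and 15 positives.
y = np.array([0] * 25 + [1] * 15)

# Each "model" ignores all features and predicts its training-fold mean label.
train_means = loo_train_means(y)

# Leaving out a positive lowers the training mean, leaving out a negative raises it,
# so the training mean is a decreasing linear function of the test label.
corr = np.corrcoef(train_means, y)[0, 1]

# Aggregating the leave-one-out predictions ranks every positive below every negative.
auc = auroc(y, train_means)

print(f"correlation between training mean and test label: {corr:.3f}")
print(f"aggregated AUROC of the mean-predicting baseline: {auc:.3f}")
```

In this toy setting a predictor with no access to features scores an aggregated AUROC of 0 rather than the 0.5 expected by chance, which is one way to see how a model that regresses toward its training mean can be penalized under leave-one-out evaluation.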

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11177965PMC

Similar Publications

Human behavior is strongly influenced by anticipation, but the underlying neural mechanisms are poorly understood. We obtained intracranial electroencephalography (iEEG) measurements in neurosurgical patients as they performed a simple sensory-motor task with variable (short or long) foreperiod delays that affected anticipation of the cue to respond. Participants showed two forms of anticipatory response biases, distinguished by more premature false alarms (FAs) or faster response times (RTs) on long-delay trials.

Reducing bias in source-free unsupervised domain adaptation for regression.

Neural Netw

January 2025

Luca Healthcare R&D, Shanghai, 200000, China.

Due to data privacy and storage concerns, Source-Free Unsupervised Domain Adaptation (SFUDA) focuses on improving an unlabelled target domain by leveraging a pre-trained source model without access to source data. While existing studies attempt to train target models by mitigating biases induced by noisy pseudo labels, they often lack theoretical guarantees for fully reducing biases and have predominantly addressed classification tasks rather than regression ones. To address these gaps, our analysis delves into the generalisation error bound of the target model, aiming to understand the intrinsic limitations of pseudo-label-based SFUDA methods.

Background And Objectives: Hypertensive Retinopathy (HR) is a retinal manifestation resulting from persistently elevated blood pressure. Severity grading of HR is essential for patient risk stratification, effective management, progression monitoring, timely intervention, and minimizing the risk of vision impairment. Computer-aided diagnosis and artificial intelligence (AI) systems play vital roles in the diagnosis and grading of HR.

Impact of a new image enhancement technology on the nuchal translucency thickness.

Arch Gynecol Obstet

January 2025

Department of Obstetrics and Gynaecology, University Hospital Tübingen, University of Tübingen, Calwerstrasse 7, 72076, Tübingen, Germany.

Objective: To examine the impact of a new image enhancement technique on the distribution of NT measurements.

Methods: In this retrospective study, nuchal translucency (NT) images that were taken with the GE Voluson E22 ultrasound machine between May and September 2024 were collected. One operator took manual and automated NT measurements in an NT image without radiant enhancement mode followed by automated measurements in NT images with minimum, medium and maximum radiant mode.

Climate change is compelling species to seek refuge at higher elevations and latitudes. While researchers commonly study these migrations using discontinuous elevational transects, this methodology may introduce significant biases into our understanding of species movement. These potential biases could lead to flawed biodiversity conservation policies if left unexamined.
