Ensemble methods such as bagging and random forests are ubiquitous in various fields, from finance to genomics. Despite their prevalence, the question of the efficient tuning of ensemble parameters has received relatively little attention. This paper introduces a cross-validation method, ECV (Extrapolated Cross-Validation), for tuning the ensemble and subsample sizes in randomized ensembles. Our method builds on two primary ingredients: initial estimators for small ensemble sizes using out-of-bag errors and a novel risk extrapolation technique that leverages the structure of prediction risk decomposition. By establishing uniform consistency of our risk extrapolation technique over ensemble and subsample sizes, we show that ECV yields δ-optimal (with respect to the oracle-tuned risk) ensembles for squared prediction risk. Our theory accommodates general predictors, only requires mild moment assumptions, and allows for high-dimensional regimes where the feature dimension grows with the sample size. As a practical case study, we employ ECV to predict surface protein abundances from gene expressions in single-cell multiomics using random forests under a computational constraint on the maximum ensemble size. Compared to sample-split and K-fold cross-validation, ECV achieves higher accuracy by avoiding sample splitting. Meanwhile, its computational cost is considerably lower owing to the use of the risk extrapolation technique.
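The risk extrapolation idea can be illustrated with a small sketch. It is not the paper's implementation; it only assumes the standard squared-error decomposition for an average of M exchangeable predictors, R_M = c + (R_1 - c)/M, which gives c = 2 R_2 - R_1 from the risks at M = 1 and M = 2. The function names `extrapolate_risk` and `smallest_near_optimal_size` are hypothetical, the inputs `r1` and `r2` stand in for estimates such as out-of-bag errors, and "within δ of the large-ensemble limit" is one simplified reading of δ-optimality.

```python
import numpy as np


def extrapolate_risk(r1, r2, sizes):
    """Extrapolate squared-error risk to ensemble sizes M.

    Assumes R_M = c + (R_1 - c) / M, where c = 2 * R_2 - R_1 is the
    implied risk as M grows without bound. r1 and r2 are risk estimates
    for ensembles of size 1 and 2 (for example, out-of-bag errors).
    """
    sizes = np.asarray(sizes, dtype=float)
    limit = 2.0 * r2 - r1  # implied large-ensemble risk
    return limit + (r1 - limit) / sizes


def smallest_near_optimal_size(r1, r2, delta, max_size):
    """Smallest M <= max_size whose extrapolated risk is within delta
    of the implied large-ensemble risk; falls back to max_size."""
    sizes = np.arange(1, max_size + 1)
    risks = extrapolate_risk(r1, r2, sizes)
    limit = 2.0 * r2 - r1
    within = risks <= limit + delta
    return int(sizes[within][0]) if within.any() else max_size


if __name__ == "__main__":
    # Illustrative numbers only, not results from the paper.
    print(extrapolate_risk(1.0, 0.75, [1, 2, 4, 8]))
    print(smallest_near_optimal_size(1.0, 0.75, delta=0.12, max_size=50))
```

Only two small ensembles need to be evaluated under this assumption; every larger ensemble size is then scored by the closed-form curve rather than by refitting.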
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11492369 | PMC |
| http://dx.doi.org/10.1080/10618600.2023.2288194 | DOI Listing |
BMJ Open
January 2025
Experimental and Clinical Research Center, Charité - Universitätsmedizin Berlin and Max Delbrück Center for Molecular Medicine, Berlin, Germany.
Introduction: Cardiovascular diseases (CVDs) present differently in women and men, influenced by host-microbiome interactions. The roles of sex hormones in CVD outcomes and gut microbiome in modifying these effects are poorly understood. The XCVD study examines gut microbiome mediation of sex hormone effects on CVD risk markers by observing transgender participants undergoing gender-affirming hormone therapy (GAHT), with findings expected to extrapolate to cisgender populations.
Knee
January 2025
Keck School of Medicine of USC, Department of Orthopaedic Surgery, Los Angeles, CA, USA.
Background: To present rates of reporting bias in systematic reviews and meta-analyses investigating meniscal root repair.
Methods: In this systematic review, PubMed, Scopus and Web of Science databases were queried for studies that investigated meniscal root tears treated with root repair. Included studies were systematic reviews and/or meta-analyses published in peer-reviewed journals in the English language with available full-texts.
Alzheimers Dement
December 2024
Amsterdam Neuroscience, Neurodegeneration, Amsterdam, Netherlands.
Background: Survival estimates for individuals with Alzheimer's disease (AD) are informative to understand the full disease trajectory. A previous meta-analysis estimated the mean survival of AD patients at 5.8 years from diagnosis, but precise estimates for atypical AD variants are scarce.
Toxicol Ind Health
January 2025
Cincinnati, OH, USA.
(E)-1,1,1,2,2,5,5,6,6,6-Decafluoro-3-hexene (HFO-153-10mczz-E) (CASRN 1256353-26-0) is a volatile liquid proposed for use as a new low global-warming potential dielectric fluid in cooling applications. Workplace exposure is expected to occur by inhalation. The substance has low acute inhalation toxicity as indicated by a 4-h inhalation LC50 value of approximately 8000 ppm.
Alzheimers Dement
December 2024
Artificial Intelligence in Biomedical Imaging Laboratory (AIBIL), Center for AI and Data Science for Integrated Diagnostics (AI2D), Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
Background: The Spatial Pattern of Abnormality for REcognition of Alzheimer's Disease (SPARE-AD) index (Davatzikos et al., 2009) is one such marker that robustly discriminates between early brain changes observed in cognitively normal aging (CN), mild cognitive impairment (MCI), and Alzheimer's Disease (AD) phenotypes. The adoption of such markers to the clinical setting combined with the ability to forecast their future trajectories would be of great value during clinical assessment, and could improve clinical trial design through targeted risk stratification.
Method: Subjects scanned using the same scanner with more than four longitudinal MRI acquisitions from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and Baltimore Longitudinal Study of Aging (BLSA) study cohorts were used for method development.