When evaluating the performance of clinical machine learning models, one must consider the deployment population. When the population of patients with observed labels is only a subset of the deployment population (label selection), standard model performance estimates on the observed population may be misleading. In this study we describe three classes of label selection and simulate five causally distinct scenarios to assess how particular selection mechanisms bias a suite of commonly reported performance metrics for binary machine learning models. Simulations reveal that when selection is affected by observed features, naive estimates of model discrimination may be misleading. When selection is affected by labels, naive estimates of calibration fail to reflect reality. We borrow traditional weighting estimators from the causal inference literature and find that, when selection probabilities are properly specified, they recover full-population estimates. We then tackle the real-world task of monitoring the performance of deployed machine learning models whose interactions with clinicians feed back into and affect the selection mechanism of the labels. We train three machine learning models to flag low-yield laboratory diagnostics and simulate their intended consequence of reducing wasteful laboratory utilization. We find that naive estimates of AUROC on the observed population undershoot actual performance by up to 20%. Such a disparity could be large enough to lead to the wrongful termination of a successful clinical decision support tool. We propose an altered deployment procedure, one that combines injected randomization with traditional weighted estimates, and find it recovers true model performance.
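The effect of feature-dependent label selection on discrimination, and its correction by weighting, can be illustrated with a small simulation. The sketch below is not the study's code: the data-generating process, the step-shaped selection rule, and the names `weighted_auroc` and `simulate` are all assumptions made for illustration. It uses inverse probability weighting with the true selection probabilities, which is the "properly specified" case described above.

```python
import math
import random


def weighted_auroc(scores, labels, weights=None):
    """AUROC where each positive/negative pair counts with weight w_pos * w_neg."""
    if weights is None:
        weights = [1.0] * len(scores)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    neg_below = 0.0   # weight of negatives scored strictly lower so far
    concordant = 0.0  # weighted count of correctly ordered pairs
    pos_total = 0.0
    neg_total = 0.0
    i = 0
    while i < len(order):
        # Collect one group of tied scores; ties count as half concordant.
        j = i
        tie_pos = 0.0
        tie_neg = 0.0
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            k = order[j]
            if labels[k] == 1:
                tie_pos += weights[k]
            else:
                tie_neg += weights[k]
            j += 1
        concordant += tie_pos * (neg_below + 0.5 * tie_neg)
        neg_below += tie_neg
        pos_total += tie_pos
        neg_total += tie_neg
        i = j
    return concordant / (pos_total * neg_total)


def simulate(n=20000, seed=0):
    """Hypothetical scenario: a label is observed far more often when x > 0."""
    rng = random.Random(seed)
    full_s, full_y = [], []
    obs_s, obs_y, obs_w = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                      # feature, also the model score
        p_y = 1.0 / (1.0 + math.exp(-2.0 * x))       # assumed outcome model
        y = 1 if rng.random() < p_y else 0
        p_sel = 0.9 if x > 0 else 0.1                # assumed selection probability
        full_s.append(x)
        full_y.append(y)
        if rng.random() < p_sel:                     # label observed
            obs_s.append(x)
            obs_y.append(y)
            obs_w.append(1.0 / p_sel)                # inverse probability weight
    return {
        "full": weighted_auroc(full_s, full_y),
        "naive": weighted_auroc(obs_s, obs_y),
        "ipw": weighted_auroc(obs_s, obs_y, obs_w),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this toy setup the naive estimate is computed only on patients with observed labels, while the weighted estimate reuses the same patients but up-weights those who were unlikely to be selected. The correction relies on the selection probabilities being known or correctly modeled; with misspecified probabilities the weighted estimate would not be expected to match the full-population value.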
Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283136 (PMC)