Electronic Health Records (EHRs) are increasingly used to develop machine learning models in predictive medicine. There has been limited research on utilizing machine learning methods to predict childhood obesity and related disparities in classifier performance among vulnerable patient subpopulations. In this work, classification models are developed to recognize pediatric obesity using temporal condition patterns obtained from patient EHR data in a U.S. study population. We trained four machine learning algorithms (Logistic Regression, Random Forest, Gradient Boosted Trees, and Neural Networks) to classify cases and controls as obesity positive or negative, and optimized hyperparameter settings through a bootstrapping methodology. To assess the classifiers for bias, we studied model performance by population subgroup, then used permutation analysis to identify the most predictive features for each model and the demographic characteristics of patients with these features. Mean AUC-ROC values were consistent across classifiers, ranging from 0.72 to 0.80. Some evidence of bias was identified, although it took the form of the models performing better for minority subgroups (African Americans and patients enrolled in Medicaid). Permutation analysis revealed that patients from vulnerable population subgroups were over-represented among patients with the most predictive diagnostic patterns. We hypothesize that our models performed better on under-represented groups because the features more strongly associated with obesity were more commonly observed among minority patients. These findings highlight the complex ways that bias may arise in machine learning models and can be incorporated into future research to develop a thorough analytical approach to identify and mitigate bias that may arise from features and within EHR datasets when developing more equitable models.
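The two bias checks described in the abstract (performance by population subgroup and permutation analysis of features) can be sketched in a few lines. The following is a minimal illustration, not the study's code: it uses only the Python standard library, and the function names (`auc_roc`, `subgroup_auc`, `permutation_importance`), the list-of-lists data layout, and the `predict` callable are assumptions made for this example.

```python
import random
from typing import Callable, Dict, List, Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auc(labels: Sequence[int], scores: Sequence[float],
                 groups: Sequence[str]) -> Dict[str, float]:
    """Compute AUC-ROC separately for each subgroup label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auc_roc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


def permutation_importance(predict: Callable[[List[float]], float],
                           rows: Sequence[Sequence[float]],
                           labels: Sequence[int],
                           n_features: int,
                           n_repeats: int = 20,
                           seed: int = 0) -> List[float]:
    """Mean drop in AUC-ROC when a single feature column is shuffled.
    A larger drop suggests the model relies more on that feature."""
    rng = random.Random(seed)
    baseline = auc_roc(labels, [predict(list(r)) for r in rows])
    drops = []
    for j in range(n_features):
        total = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled_scores = []
            for r, value in zip(rows, column):
                row = list(r)      # copy so the original data is untouched
                row[j] = value     # swap in the shuffled feature value
                shuffled_scores.append(predict(row))
            total += baseline - auc_roc(labels, shuffled_scores)
        drops.append(total / n_repeats)
    return drops
```

In this sketch, a gap between the per-subgroup AUC values returned by `subgroup_auc` is the kind of performance disparity the abstract reports, and the features with the largest drops from `permutation_importance` are the ones whose demographic distribution would then be examined.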
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11498669 | PMC |
http://dx.doi.org/10.1371/journal.pdig.0000642 | DOI Listing |
Comput Biol Med
January 2025
Emerging Technologies Research Lab (ETRL), College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia; Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran, 61441, Saudi Arabia.
Brain tumors (BT), both benign and malignant, have a substantial impact on human health and need precise and early detection for successful treatment. Analysing magnetic resonance imaging (MRI) images is a common method for BT diagnosis and segmentation, yet misdiagnoses yield ineffective medical responses, impacting patient survival rates. Recent technological advancements have popularized deep learning-based medical image analysis, leveraging transfer learning to reuse pre-trained models for various applications.
J Chem Theory Comput
January 2025
State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China.
Molecular docking is a crucial technique for elucidating protein-ligand interactions. Machine learning-based docking methods offer promising advantages over traditional approaches, with significant potential for further development. However, many current machine learning-based methods face challenges in ensuring the physical plausibility of generated docking poses.
J Chem Inf Model
January 2025
Department of Chemical Engineering, National Taiwan University, No. 1, Section 4, Roosevelt Road, Taipei 10617, Taiwan.
Accurately predicting activation energies is crucial for understanding chemical reactions and modeling complex reaction systems. However, the high computational cost of quantum chemistry methods often limits the feasibility of large-scale studies, leading to a scarcity of high-quality activation energy data. In this work, we explore and compare three innovative approaches (transfer learning, delta learning, and feature engineering) to enhance the accuracy of activation energy predictions using graph neural networks, specifically focusing on methods that incorporate low-cost, low-level computational data.
Expert Opin Drug Discov
January 2025
Department of Radiology, Molecular Imaging Innovations Institute (MI3), Weill Cornell Medicine, New York, NY, USA.
Introduction: Technological advancements in virtual screening (VS) have rapidly accelerated its application in drug discovery, as reflected by the exponential growth in VS-related publications. However, a significant gap remains between the volume of computational predictions and their experimental validation. This discrepancy has led to a rise in the number of unverified 'claimed' hits, which impedes drug discovery efforts.
J Clin Exp Neuropsychol
January 2025
Department of Psychology, University of Illinois at Urbana-Champaign, Champaign, IL, USA.
Introduction: Diagnostic evaluations for attention-deficit/hyperactivity disorder (ADHD) are becoming increasingly complicated by the number of adults who fabricate or exaggerate symptoms. Novel methods are needed to improve the assessment process required to detect these noncredible symptoms. The present study investigated whether unsupervised machine learning (ML) could serve as one such method, and detect noncredible symptom reporting in adults undergoing ADHD evaluations.