Background: Low birthweight (LBW) is a leading cause of neonatal mortality in the United States and a major causative factor of adverse health effects in newborns. Identifying high-risk patients early in prenatal care is crucial to preventing adverse outcomes. Previous studies have proposed various machine learning (ML) models for the LBW prediction task, but they were limited by small and imbalanced data sets. Some authors attempted to address this through different data rebalancing methods. However, most of the reported results did not reflect the models' actual performance in real-life scenarios. To date, few studies have successfully benchmarked the performance of ML models in maternal health; thus, it is critical to establish benchmarks that advance the use of ML and, in turn, improve birth outcomes.
Objective: This study aimed to establish several key benchmarking ML models to predict LBW and systematically apply different rebalancing optimization methods to a large-scale and extremely imbalanced all-payer hospital record data set that connects mother and baby data at a state level in the United States. We also performed feature importance analysis to identify the features that contribute most to the LBW classification task, which can aid in targeted intervention.
Methods: Our large data set consisted of 266,687 birth records across 6 years, and 8.63% (n=23,019) of records were labeled as LBW. To set up benchmarking ML models to predict LBW, we applied 7 classic ML models (ie, logistic regression, naive Bayes, random forest, extreme gradient boosting, adaptive boosting, multilayer perceptron, and sequential artificial neural network) while using 4 different data rebalancing methods: random undersampling, random oversampling, synthetic minority oversampling technique, and weight rebalancing. Owing to ethical considerations, as false-negative health care outcomes could be fatal, we primarily used recall to evaluate model performance, in addition to standard ML evaluation metrics. Recall is the proportion of actual LBW cases that the model correctly predicts as LBW. We further analyzed feature importance to explore the degree to which each feature contributed to ML model prediction among our best-performing models.
Results: We found that extreme gradient boosting achieved the highest recall score (0.70) using the weight rebalancing method. Our results showed that various data rebalancing methods improved the prediction performance of the LBW group substantially. From the feature importance analysis, maternal race, age, payment source, sum of predelivery emergency department and inpatient hospitalizations, predelivery disease profile, and different social vulnerability index components were important risk factors associated with LBW.
Conclusions: Our findings establish useful ML benchmarks to improve birth outcomes in the maternal health domain. They are informative for identifying the minority class (ie, LBW) in an extremely imbalanced data set, which may guide the development of personalized LBW early prevention, clinical interventions, and statewide maternal and infant health policy changes.
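The recall metric and two of the rebalancing ideas named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names (`recall`, `positive_class_weight`, `random_undersample`) and the toy labels are hypothetical, and the negative-to-positive ratio used for the class weight is a common heuristic assumed here, not a setting reported in the abstract. Only the two record counts come from the abstract.

```python
import random

TOTAL_RECORDS = 266_687  # birth records, from the abstract
LBW_RECORDS = 23_019     # records labeled LBW, from the abstract


def recall(y_true, y_pred):
    """Share of actual positives (LBW = 1) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def positive_class_weight(n_total, n_positive):
    """Negative-to-positive ratio (assumed heuristic for weight rebalancing)."""
    return (n_total - n_positive) / n_positive


def random_undersample(labels, seed=0):
    """Indices of all positives plus an equal-sized random sample of negatives."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    kept_neg = rng.sample(neg, min(len(pos), len(neg)))
    return sorted(pos + kept_neg)


prevalence = LBW_RECORDS / TOTAL_RECORDS
weight = positive_class_weight(TOTAL_RECORDS, LBW_RECORDS)

# Hypothetical toy labels, only to exercise the helpers.
toy_true = [1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 0, 0]
toy_recall = recall(toy_true, toy_pred)
toy_balanced = random_undersample(toy_true)

if __name__ == "__main__":
    print(f"LBW prevalence: {prevalence:.2%}")
    print(f"Negative-to-positive ratio: {weight:.2f}")
    print(f"Toy recall: {toy_recall:.2f}")
    print(f"Toy undersampled indices: {toy_balanced}")
```

With the counts from the abstract, the ratio shows that non-LBW records outnumber LBW records by roughly ten to one, which is why a model that ignores the imbalance can score well on accuracy while missing most LBW cases.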
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10267797 | PMC |
http://dx.doi.org/10.2196/44081 | DOI Listing |
J AAPOS
January 2025
Pediatric Vision Laboratory, Retina Foundation of the Southwest, Dallas, Texas; Optometry & Vision Science, University of Waterloo, Waterloo, Ontario, Canada.
Background: One rationale for dichoptic amblyopia therapy is that it may promote recovery of binocular function. Yet data on binocular outcomes in anisometropic amblyopia following dichoptic therapy are sparse. We report factors associated with pre- and post-treatment binocular function in anisometropic amblyopia, and examine binocular function in children who recover normal visual acuity compared to those with residual amblyopia.
Geriatr Nurs
January 2025
Ordine delle Professioni Infermieristiche di Bergamo, via Pietro Rovelli 45, Bergamo 24125, Italy.
Introduction/objective: The relationship between staffing levels and skill mix in nursing homes is poorly documented in Italy. This study aimed to investigate nursing staffing levels and skill mix in Northern Italian nursing homes.
Methods: A cross-sectional observational study was conducted using a questionnaire sent to several nursing homes.
Food Funct
January 2025
School of Life Sciences, Nanchang University, Nanchang 330031, China.
Systemic lupus erythematosus (SLE) is a chronic autoimmune disease often treated with glucocorticoids, which can lead to complications such as osteoporosis and an increased infection risk. Hence, identifying safe and effective treatment strategies is crucial. has shown promise in improving immune disorders.
Data Brief
December 2024
Department of Computer Science, University of Sheffield, UK.
This paper presents the Cadenza Woodwind Dataset. This publicly available data is synthesised audio for woodwind quartets including renderings of each instrument in isolation. The data was created to be used as training data within Cadenza's second open machine learning challenge (CAD2) for the task on rebalancing classical music ensembles.
J Pain Symptom Manage
December 2024
Dana-Farber Cancer Institute, Boston, MA, USA; Harvard Medical School, Boston, MA, USA.
Context: Timely and transparent serious illness conversations (SIC) between family caregivers and patients facilitate high-quality end-of-life care and patients' and family caregivers' mental wellbeing, but frequently do not occur, happen too late, or are incomplete. While social relations and roles shape communication, few studies explore their influence on patient-caregiver SICs.
Objectives: Investigate how the parent-child relation and roles shape SICs between cancer patients and their adult-child caregivers (ACC), the largest caregiver population.