Objective: This study leverages the diverse dataset of the All of Us Research Program (All of Us) to develop a predictive model for cardiovascular disease (CVD) in breast cancer (BC) survivors. Central to this effort is a robust data integration pipeline that synthesizes electronic health records (EHRs), patient surveys, and genomic data while upholding fairness across demographic variables.
Materials and Methods: We developed a universal data wrangling pipeline to process and merge the heterogeneous data sources of the All of Us dataset, address missingness and variance in the data, and align disparate data modalities into a coherent framework for analysis. Using a composite feature set comprising EHR, lifestyle, and social determinants of health (SDoH) data, we then employed Adaptive Lasso and Random Forest regression models to predict 6 CVD outcomes. The models were evaluated using the concordance index (c-index) and the time-dependent area under the receiver operating characteristic curve (AUROC) over a 10-year period.
Results: The Adaptive Lasso model showed consistent performance across most CVD outcomes, while the Random Forest model performed particularly well in predicting outcomes such as transient ischemic attack when incorporating the full multi-modal feature set. Feature importance analysis revealed age and previous coronary events as dominant predictors across CVD outcomes, with SDoH clustering labels highlighting the nuanced impact of social factors.
Discussion: The development of both a Cox-based predictive model and a Random Forest regression model demonstrates an extensive application of the All of Us dataset, integrating EHR and patient survey data to enhance precision medicine. The inclusion of SDoH clustering labels revealed the significant impact of sociobehavioral factors on patient outcomes, emphasizing the importance of comprehensive health determinants in predictive models. Despite these advancements, limitations include the exclusion of genetic data, broad categorization of CVD conditions, and the need for fairness analyses to ensure equitable model performance across diverse populations. Future work should refine clinical and social variable measurements, incorporate advanced imputation techniques, and explore additional predictive algorithms to enhance model precision and fairness.
Conclusion: This study demonstrates the viability of the diverse All of Us dataset for developing a multi-modality predictive model of CVD in BC survivors, supporting risk stratification in oncological survivorship. The data integration pipeline and subsequent predictive models establish a methodological foundation for future research into personalized healthcare.
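The c-index used for evaluation in the Methods can be illustrated with a minimal sketch. The abstract does not say which variant or software was used, so this assumes Harrell's concordance index; the function name `concordance_index` and the five-patient data are hypothetical and purely illustrative, not taken from the study.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's c-index (assumed variant): the share of comparable
    patient pairs in which the patient with the earlier observed event
    also has the higher predicted risk. Tied risks count as half."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot be the "earlier event" of a pair.
            continue
        for j in range(n):
            # Pair (i, j) is comparable when i's event precedes j's
            # event or censoring time; tied times are skipped.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; c-index is undefined")
    return concordant / comparable


# Hypothetical follow-up times in years, event flags (1 = CVD event,
# 0 = censored), and model risk scores for five made-up patients.
years = [2, 4, 6, 8, 10]
had_event = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.7, 0.3, 0.1]
c_index = concordance_index(years, had_event, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. In this toy data, 7 of the 8 comparable pairs are ordered correctly, giving 0.875.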
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631116 | PMC |
http://dx.doi.org/10.1093/jamia/ocae199 | DOI Listing |
Genet Med
December 2024
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN; Center for Digital Genomic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN; Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN. Electronic address:
Purpose: Studies of the value of genetic information for improving the performance of clinical risk prediction models have yielded variable conclusions. Many methodological decisions may contribute to these differing results. We performed multiple modeling experiments integrating clinical and demographic data from electronic health records (EHR) with genetic data to understand which decisions may affect performance.
Sci Rep
December 2024
School of Mechanical and Electrical Engineering, North University of China, Taiyuan, 030051, Shanxi, China.
Due to the sensitivity of the shaped charge jet to standoff and the complexity of its impact under lateral disturbances, this study aims to investigate the dynamic impact evolution of the jet influenced by standoff and lateral disturbances. A finite element model for the dynamic impact of shaped charge jets was established. Dynamic impact experiments were designed and conducted to validate the effectiveness of the numerical simulations.
Sci Rep
December 2024
School of Automation Science and Electrical Engineering, Beihang University, Beijing, 100191, China.
A novel adaptive model-based motion control method for multi-UAV communication relay is proposed, which aims to improve network connectivity and communication performance among a fleet of unmanned ground vehicles. The method addresses the challenge of relay UAV motion control by jointly considering unknown multi-user mobility, environmental effects on channel characteristics, unavailable angle-of-arrival data for received signals, and coordination among multiple UAVs. The method consists of two parts: (1) Network connectivity is constructed and a communication performance index is defined using the minimum spanning tree from graph theory, which considers both the communication links between ground nodes and UAVs and the communication links between ground nodes.
Sci Rep
December 2024
School of Civil Engineering, Southeast University, Nanjing, 211189, China.
Collapsible loess soils, known for their significant volume reduction upon wetting, pose critical challenges in geotechnical engineering. Estimating wetting-induced settlement is crucial for foundation design and for determining the negative skin friction on piles. In this paper, a new method is proposed to estimate wetting-induced collapse from the wetting soil-water characteristic curve (SWCC) and the index properties of loess soils.
Behav Res Methods
December 2024
ETSI de Telecomunicación, Universidad Politécnica de Madrid, Avenida Complutense, 30, 28040, Madrid, Spain.
This study investigates the potential of large language models (LLMs) to estimate the familiarity of words and multi-word expressions (MWEs). We validated LLM estimates for isolated words using existing human familiarity ratings and found strong correlations. LLM familiarity estimates performed even better in predicting lexical decision and naming performance in megastudies than the best available word frequency measures.