Introduction: This large case-control study explored the application of machine learning models to identify risk factors for primary invasive incident breast cancer (BC) in the Iranian population. By identifying modifiable and non-modifiable risk factors, the study serves as a bridge toward improved BC prevention, early detection, and management.

Methods: The dataset includes 1,009 cases and 1,009 controls, with comprehensive data on lifestyle, health-behavior, reproductive and sociodemographic factors. Different machine learning models, namely Random Forest (RF), Neural Networks (NN), Bootstrap Aggregating Classification and Regression Trees (Bagged CART), and Extreme Gradient Boosting Tree (XGBoost), were employed to analyze the data.

Results: The findings highlight the significance of chest X-ray history, deliberate weight loss, abortion history, and post-menopausal status as predictors. Second-hand smoking, lower education, age at menarche (>14), occupation (employed), age at first delivery (18-23), and breastfeeding duration (>42 months) were also identified as important predictors in multiple models. The RF model had the highest Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve, at 0.9, followed closely by the Bagged CART model at 0.89. The XGBoost model achieved a lower AUC of 0.78, and the NN model had the lowest AUC, 0.74. In terms of classification performance, the RF model achieved an accuracy of 83.9% and a Kappa coefficient of 0.678, while the XGBoost model achieved a slightly lower accuracy of 82.5% and a lower Kappa coefficient of 0.6.
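
To make the metrics reported above concrete, the following is a minimal sketch of how accuracy, Cohen's Kappa, and ROC AUC can be computed for a binary case/control classifier. It is not the authors' pipeline: the function names, the labels, the scores, and the 0.5 decision threshold are all hypothetical, chosen only for illustration.

```python
from typing import List


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true: List[int], y_pred: List[int]) -> float:
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    labels = set(y_true) | set(y_pred)
    # Chance agreement from the marginal label frequencies.
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (p_o - p_e) / (1 - p_e)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Probability that a random case scores above a random control
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 cases (1) and 4 controls (0), not from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))     # 0.75
print(cohen_kappa(y_true, y_pred))  # 0.5
print(roc_auc(y_true, scores))      # 0.875
```

When the evaluation set contains equal numbers of cases and controls, chance agreement is 0.5 and Kappa reduces to 2 × accuracy − 1. Whether the study's evaluation split was exactly balanced is not stated here, but the RF figures are consistent with that relation (2 × 0.839 − 1 = 0.678).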

Conclusion: By identifying the main risk factors for BC, this study could inform targeted preventive measures for high-risk women.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10903343
DOI: http://dx.doi.org/10.3389/fonc.2023.1276232

Similar Publications

Human-induced global warming, primarily attributed to the rise in atmospheric CO₂, poses a substantial risk to the survival of humanity. While most research focuses on predicting annual CO₂ emissions, which are crucial for setting long-term emission mitigation targets, the precise prediction of daily CO₂ emissions is equally vital for setting short-term targets. This study examines the performance of 14 models in predicting daily CO₂ emissions data from 1/1/2022 to 30/9/2023 across the top four polluting regions (China, India, the USA, and the EU27 & UK).

The present study analyzed the impact of age on the causes of death (CODs) in patients with nasopharyngeal carcinoma (NPC) undergoing chemoradiotherapy (CRT) using machine learning approaches. A total of 2841 patients (1037 classified as older, ≥ 60 years and 1804 as younger, < 60 years) were enrolled. Variations in the CODs between the two age groups were analyzed before and after applying inverse probability of treatment weighting (IPTW).

Immunotherapy is improving the survival of patients with metastatic non-small cell lung cancer (NSCLC), yet reliable biomarkers are needed to identify responders prospectively and optimize patient care. In this study, we explore the benefits of multimodal approaches to predict immunotherapy outcome using multiple machine learning algorithms and integration strategies. We analyze baseline multimodal data from a cohort of 317 metastatic NSCLC patients treated with first-line immunotherapy, including positron emission tomography images, digitized pathological slides, bulk transcriptomic profiles, and clinical information.

Recent evidence indicates that endocrine resistance in estrogen receptor-positive (ER+) breast cancer is closely correlated with phenotypic characteristics of epithelial-to-mesenchymal transition (EMT). Nonetheless, identifying tumor tissues with a mesenchymal phenotype remains challenging in clinical practice. In this study, we validated the correlation between EMT status and resistance to endocrine therapy in ER+ breast cancer from a transcriptomic perspective.

Aim: Pre-injury frailty has been investigated as a tool to predict outcomes of older trauma patients. Using artificial intelligence principles of machine learning, we aimed to identify a "signature" (combination of clinical variables) that could predict which older adults are at risk of fall-related hospital admission. We hypothesized that frailty, measured using the 5-item modified Frailty Index, could be utilized in combination with other factors as a predictor of admission for fall-related injuries.
