Development and validation of machine learning models for predicting venous thromboembolism in colorectal cancer patients: A cohort study in China.

Int J Med Inform

Chongqing Cancer Multiomics Big Data Application Engineering Research Center, Chongqing University Cancer Hospital, Chongqing 400030, China.

Published: December 2024

Background: With advancements in healthcare, traditional VTE risk assessment tools are increasingly insufficient to meet the demands of high-quality care, underscoring the need for innovative and specialized assessment methods.

Objective: Owing to the remarkable success of machine learning in supervised learning and disease prediction, our objective is to develop a reliable and efficient model for assessing VTE risk by leveraging the fundamental data and clinical characteristics of colorectal cancer patients within our medical facility.

Methods: Six commonly used machine learning algorithms were utilized in our study to predict the occurrence of VTE in patients with rectal cancer. In the modeling process, LASSO regression was employed to identify and exclude variables not associated with VTE. Additionally, hyperparameter tuning was conducted via 5-fold cross-validation to mitigate overfitting, and 200 bootstrap samples were used to adjust the apparent performance on the training set. The selection of the VTE assessment model was determined by a thorough evaluation of performance criteria, such as the AUC, ACC and F1 score.
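
The bootstrap adjustment mentioned in the Methods can be illustrated with a minimal sketch. It assumes the adjustment is a Harrell-style optimism correction, which the abstract does not confirm. The toy mean-difference scorer (`fit`, `predict`), the synthetic data from `make_data`, and all function names are hypothetical stand-ins, not the study's models or data; LASSO selection and 5-fold tuning are omitted.

```python
import random

def auc(scores, labels):
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit(X, y):
    """Toy stand-in model: one weight per feature = difference of class means."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    def mean(rows, j):
        return sum(r[j] for r in rows) / len(rows)
    return [mean(pos, j) - mean(neg, j) for j in range(len(X[0]))]

def predict(w, X):
    """Linear score for each row."""
    return [sum(wj * xj for wj, xj in zip(w, x)) for x in X]

def optimism_corrected_auc(X, y, n_boot=200, seed=0):
    """Return (apparent AUC, apparent AUC minus mean bootstrap optimism)."""
    rng = random.Random(seed)
    n = len(y)
    apparent = auc(predict(fit(X, y), X), y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:                         # need both classes to fit
            continue
        w = fit(Xb, yb)
        # Optimism: performance on the bootstrap sample minus on the original data
        optimism.append(auc(predict(w, Xb), yb) - auc(predict(w, X), y))
    return apparent, apparent - sum(optimism) / n_boot

def make_data(n=150, seed=1):
    """Synthetic imbalanced data: one informative feature, four noise features."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.15 else 0
        informative = rng.gauss(1.0 if label else 0.0, 1.0)
        X.append([informative] + [rng.gauss(0.0, 1.0) for _ in range(4)])
        y.append(label)
    return X, y

X, y = make_data()
apparent, adjusted = optimism_corrected_auc(X, y, n_boot=200)
print(f"apparent AUC = {apparent:.3f}, optimism-adjusted AUC = {adjusted:.3f}")
```

In a setup like this, the adjusted value is typically somewhat lower than the apparent one, because a model tends to score better on the data it was fitted to.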

Results: The RF model exhibited consistent and efficient performance. Specifically, in the internal validation dataset, where generalizability was adjusted, the RF model achieved the highest scores across multiple metrics: AD-AUC (0.895), AD-ACC (0.871), AD-F1 (0.311), AD-MCC (0.316), AD-Precision (0.241), and AD-Specificity (0.888). For external validation on unseen colon cancer data, the RF model also performed best in terms of ACC (0.728), F1 (0.292), MCC (0.225), Precision (0.192), and Specificity (0.740), with a suboptimal AUC of 0.745 and a Sensitivity (Recall) of 0.615. Additionally, the RF model demonstrated strong performance not only on the original dataset but also on datasets processed via alternative imbalance handling techniques.
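
As a reading aid for the metrics listed in the Results, the sketch below computes them from confusion-matrix counts using their standard definitions. The counts passed to the hypothetical helper `binary_metrics` are invented to illustrate a low-prevalence outcome and are not the study's data. Only the last step uses reported figures: it checks that the external-validation Precision (0.192) and Recall (0.615) imply an F1 close to the reported 0.292.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }

def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts (NOT from the paper): 13 events among 133 patients.
m = binary_metrics(tp=8, fp=32, fn=5, tn=88)
for name, value in m.items():
    print(f"{name}: {value:.3f}")

# Consistency check using the reported external-validation figures.
implied_f1 = f1_from(0.192, 0.615)
print(f"F1 implied by reported precision and recall: {implied_f1:.3f}")
```

With a rare outcome, even a modest false-positive rate produces many more false positives than true positives, which is why precision and F1 can stay low while accuracy and specificity look high.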

Conclusions: Our research successfully established and validated a risk assessment model for assessing the risk of VTE in colorectal cancer patients.

Source
http://dx.doi.org/10.1016/j.ijmedinf.2024.105770

Similar Publications

Background: Aneuploidy is crucial yet under-explored in cancer pathogenesis. Specifically, the involvement of brain expressed X-linked gene 4 (BEX4) in microtubule formation has been identified as a potential aneuploidy mechanism. Nevertheless, the comprehensive impact of BEX4 on aneuploidy incidence across different cancer types remains unexplored.

Machine-Learning-Aided Engineering Hemoglobin as Carbene Transferase for Catalyzing Enantioselective Olefin Cyclopropanation.

JACS Au

December 2024

Key Laboratory of Molecular Enzymology and Engineering of Ministry of Education, School of Life Sciences, Jilin University, Changchun 130023, P. R. China.

In this study, we developed a machine-learning-aided protein design strategy for engineering hemoglobin (VHb) as carbene transferase. A Natural Language Processing (NLP) model was used for the first time to construct an algorithm (EESP, enzyme enantioselectivity score predictor) and predict the enantioselectivity of VHb. We identified critical amino acid residue sites by molecular docking and established a simplified mutation library by site-saturated mutagenesis.

Objective: A comprehensive bioinformatics analysis was conducted to investigate potential new diagnostic biomarkers and immune infiltration characteristics associated with tubulointerstitial injury in lupus nephritis (LN), and to examine possible correlations between key genes and infiltrating immune cells.

Methods: The GSE32591, GSE113342, and GSE200306 datasets were downloaded from the Gene Expression Omnibus database and differentially expressed genes (DEGs) were identified in the pooled dataset. Support vector machine-recursive feature elimination analysis and the least absolute shrinkage and selection operator regression model were used to screen for possible markers, and the compositional patterns of the 22 types of immune cell fractions in LN were determined using CIBERSORT.

Single-Cell Sequencing and Machine Learning Integration to Identify Candidate Biomarkers in Psoriasis: .

J Inflamm Res

December 2024

Department of Dermatology, China-Japan Friendship Hospital, National Center for Integrative Medicine, Beijing, 100029, People's Republic of China.

Background: Psoriasis represents a persistent, immune-driven inflammatory condition affecting the skin, characterized by a lack of well-established biologic treatments without adverse events. Consequently, the identification of novel targets and therapeutic agents remains a pressing priority in the field of psoriasis research.

Methods: We collected single-cell RNA sequencing (scRNA-seq) datasets and inferred T cell differentiation trajectories through pseudotime analysis.

Background: Coronary artery bypass grafting (CABG) surgery is a widely accepted method for treating coronary artery disease. However, its postoperative complications can have a significant effect on long-term patient outcomes. A retrospective study was conducted to identify factors before and after surgery that contribute to postoperative stroke in patients undergoing CABG, and to develop predictive models and recommendations for single-factor thresholds.
