Objective: This study aimed to build a machine learning model that predicts the risk of prolonged postoperative hospital stay in colorectal cancer patients, and to analyze the preoperative and postoperative factors associated with extended hospitalization.

Methods: We prospectively collected clinical data from 83 colorectal cancer patients. The study included 40 variables (39 predictor variables and 1 target variable). Important variables were selected with Lasso regression, and predictive models were built with ten machine learning algorithms: Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, Light Gradient Boosting Machine, K-Nearest Neighbors (KNN), Extreme Gradient Boosting, Categorical Boosting, Artificial Neural Network, and Deep Forest. Model performance was evaluated with Bootstrap ROC curves and calibration curves; the best-performing model was then interpreted with the SHAP explainability algorithm.
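To illustrate how Lasso regression performs variable selection, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state the software, the penalty strength, or the solver used. The coordinate descent solver, the value `alpha=0.1`, the synthetic data, and every function name below are assumptions made for illustration only.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; anything inside the band becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardize(columns):
    """Center each feature column and scale it to unit variance."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
        scaled.append([(v - mean) / std for v in col])
    return scaled


def lasso_coordinate_descent(columns, y, alpha, n_sweeps=200):
    """Minimize (1/2n)*||y - Xb||^2 + alpha*||b||_1.

    Assumes standardized columns and a centered target, so each
    coordinate update reduces to a single soft-thresholding step.
    """
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)  # y - Xb, with b = 0 initially
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Covariance of feature j with the residual that excludes its own term.
            rho = sum(c * (r + coefs[j] * c) for c, r in zip(col, residual)) / n
            updated = soft_threshold(rho, alpha)
            delta = updated - coefs[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
                coefs[j] = updated
    return coefs


# Synthetic stand-in data: 83 hypothetical patients, 6 hypothetical predictors.
rng = random.Random(0)
n_patients = 83
features = [[rng.gauss(0, 1) for _ in range(n_patients)] for _ in range(6)]
# Assumption for illustration: only predictors 0 and 1 drive the binary outcome.
outcome = [1.0 if features[0][i] - features[1][i] > 0 else 0.0
           for i in range(n_patients)]
mean_outcome = sum(outcome) / n_patients
centered = [v - mean_outcome for v in outcome]

coefs = lasso_coordinate_descent(standardize(features), centered, alpha=0.1)
selected = [j for j, b in enumerate(coefs) if b != 0.0]
print("selected predictors:", selected)
```

Predictors whose covariance with the residual stays within the penalty band receive a coefficient of exactly zero and drop out; the remaining predictors are the ones a downstream classifier would be trained on. A larger `alpha` keeps fewer predictors.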

Results: Lasso regression identified ten important, significantly correlated variables. Results were validated with 1,000 Bootstrap resamplings and presented as Bootstrap ROC curves. The Logistic Regression (LR) model achieved the highest AUC (AUC=0.99, 95% CI=0.97-0.99). The explainable machine learning algorithm showed that the distance walked on the third day after surgery was the most important variable for the LR model.
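A Bootstrap confidence interval for the AUC, like the one reported above, can be sketched as follows. This is a hedged illustration in plain Python, not the authors' code: the abstract does not say how the interval was derived, so the percentile method, the synthetic labels and scores, and all function names here are assumptions.

```python
import random


def auc_score(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, level=0.95, seed=0):
    """Percentile Bootstrap interval: resample cases with replacement, recompute AUC."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # AUC is undefined without both classes; draw again
        aucs.append(auc_score(sample_labels, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((1 - level) / 2 * n_boot)]
    upper = aucs[int((1 + level) / 2 * n_boot) - 1]
    return auc_score(labels, scores), lower, upper


# Synthetic stand-in for 83 patients: label 1 = prolonged stay (hypothetical data).
rng = random.Random(1)
labels = [1 if rng.random() < 0.4 else 0 for _ in range(83)]
# Hypothetical model scores: positives score higher on average.
scores = [rng.gauss(1.0 if label else 0.0, 1.0) for label in labels]

auc, lower, upper = bootstrap_auc_ci(labels, scores, n_boot=1000)
print(f"AUC={auc:.2f}, 95% CI={lower:.2f}-{upper:.2f}")
```

With only 83 patients, each resample contains few cases per class, so the interval width depends strongly on the class balance. Resampling the same data a model was trained on does not by itself correct for optimism.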

Conclusion: This study constructed a model that predicts postoperative length of hospital stay from patients' clinical data. The model may give healthcare professionals a more precise prediction tool in clinical practice and a basis for personalized nursing interventions, thereby helping to improve patient prognosis and quality of life and to use medical resources more efficiently.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11211394
DOI: http://dx.doi.org/10.3389/fonc.2024.1384931

Publication Analysis

Top Keywords

machine learning (16)
colorectal cancer (12)
cancer patients (12)
predictive model (8)
learning algorithm (8)
clinical data (8)
variables identified (8)
lasso regression (8)
logistic regression (8)
gradient boosting (8)

Similar Publications

Background: Pressure injuries (PIs) place a substantial burden on healthcare systems worldwide. Stratifying patients by their risk of developing PIs allows preventive interventions to be focused on those at the highest risk. The considerable number of risk assessment scales and prediction models available underscores the need for a thorough evaluation of their development, validation, and clinical utility.

Comparative analysis of regression algorithms for drug response prediction using GDSC dataset.

BMC Res Notes

January 2025

Department of Computer Engineering, Chungbuk National University, Chungdae-ro 1, Cheongju, 28644, Republic of Korea.

Background: Drug response prediction can infer the relationship between an individual's genetic profile and a drug, which can be used to choose a treatment for an individual patient. Machine learning has recently been applied to drug response prediction. However, high-throughput sequencing data produce thousands of features per patient.

Supervised machine learning statistical models for visual outcome prediction in macular hole surgery: a single-surgeon, standardized surgery study.

Int J Retina Vitreous

January 2025

Department of Retina and Vitreous, Narayana Nethralaya, #121/C, 1st R Block, Chord Road, Rajaji Nagar, Bengaluru, 560010, India.

Purpose: To evaluate the predictive accuracy of various machine learning (ML) statistical models in forecasting postoperative visual acuity (VA) outcomes following macular hole (MH) surgery using preoperative optical coherence tomography (OCT) parameters.

Methods: This retrospective study included 158 eyes (151 patients) with full-thickness MHs treated between 2017 and 2023 by the same surgeon and using the same intraoperative surgical technique. Data from electronic medical records and OCT scans were extracted, with OCT-derived qualitative and quantitative MH characteristics recorded.

Background: This systematic review aims to explore the early predictive value of machine learning (ML) models for the progression of gestational diabetes mellitus (GDM) to type 2 diabetes mellitus (T2DM).

Methods: A comprehensive and systematic search was conducted in PubMed, Cochrane, Embase, and Web of Science up to July 2, 2024. The quality of the included studies was assessed.

Objectives: This data note presents a comprehensive geodatabase of cardiovascular disease (CVD) hospitalizations in Mashhad, Iran, alongside key environmental factors such as air pollutants, built environment indicators, green spaces, and urban density. Using a spatiotemporal dataset of over 52,000 hospitalized CVD patients collected over five years, the study supports approaches like advanced spatiotemporal modeling, artificial intelligence, and machine learning to predict high-risk CVD areas and guide public health interventions.

Data Description: This dataset includes detailed epidemiologic and geospatial information on CVD hospitalizations in Mashhad, Iran, from January 1, 2016, to December 31, 2020.
