A new risk assessment model of venous thromboembolism by considering fuzzy population.

BMC Med Inform Decis Mak

Department of Respiration, Peking Union Medical College Hospital, No.1, Shuaifuyuan, Dongcheng District, Beijing, 100730, China.

Published: December 2024

Background: Inpatients at high risk of venous thromboembolism (VTE) face serious threats to their health and finances. Many studies that use machine learning (ML) models to predict VTE risk overlook the class-imbalance problem caused by the low incidence of VTE. The resulting models perform poorly and unstably, which has prevented them from replacing the Padua model, a linear weighted model widely used in clinical practice. Our study aims to develop a new VTE risk assessment model suitable for Chinese medical inpatients.
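To make "linear weighted model" concrete, the sketch below scores a patient by summing fixed item weights and comparing the total with a cutoff. The item weights and the cutoff of 4 follow the commonly published version of the Padua Prediction Score; the abstract does not list them, so they are an assumption here, and the dictionary keys and function names are hypothetical.

```python
# Minimal sketch of a linear weighted risk score in the style of the Padua model.
# ASSUMPTION: weights and threshold follow the commonly published Padua
# Prediction Score; they are not stated in the abstract.
PADUA_WEIGHTS = {
    "active_cancer": 3,
    "previous_vte": 3,
    "reduced_mobility": 3,
    "known_thrombophilia": 3,
    "recent_trauma_or_surgery": 2,
    "age_70_or_older": 1,
    "heart_or_respiratory_failure": 1,
    "acute_mi_or_ischemic_stroke": 1,
    "acute_infection_or_rheumatologic_disorder": 1,
    "bmi_30_or_higher": 1,
    "ongoing_hormonal_treatment": 1,
}
HIGH_RISK_THRESHOLD = 4  # a total of 4 or more is treated as high risk


def padua_score(risk_factors):
    """Sum the weights of every risk factor flagged as present.

    risk_factors maps a factor name (hypothetical keys above) to True/False.
    """
    unknown = set(risk_factors) - set(PADUA_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown risk factors: {sorted(unknown)}")
    return sum(PADUA_WEIGHTS[name] for name, present in risk_factors.items() if present)


def is_high_risk(risk_factors):
    """Apply the fixed cutoff to the weighted sum."""
    return padua_score(risk_factors) >= HIGH_RISK_THRESHOLD
```

Because the score is a plain weighted sum with one threshold, it has no training step and is unaffected by how rare VTE is in a given cohort, which is part of why it serves as the baseline here.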

Methods: Data were collected on 3284 inpatients in the medical department of Peking Union Medical College Hospital (PUMCH) from January 2014 to June 2016. The training and test sets were divided by admission time, with inpatients from May 2016 to June 2016 forming the test set. We explained the class-imbalance problem from a clinical perspective and defined a new term, "fuzzy population", to describe and model this phenomenon. By considering the "fuzzy population", we built a new ML VTE risk assessment model through population splitting. The sensitivity and specificity of our method were compared with those of five ML models (support vector machine (SVM), random forest (RF), gradient boosting decision tree (GBDT), logistic regression (LR), and XGBoost) and the Padua model.
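The evaluation setup described above, a split by admission time followed by a comparison on sensitivity and specificity, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the record layout (a dict with an "admitted" date), the function names, and the exact cutoff date of 1 May 2016 are hypothetical readings of "May 2016 to June 2016".

```python
from datetime import date

# ASSUMPTION: the test set starts on 1 May 2016; the abstract only says
# "May 2016 to June 2016".
TEST_START = date(2016, 5, 1)


def split_by_admission(records, test_start=TEST_START):
    """Split records chronologically: earlier admissions train, later ones test."""
    train = [r for r in records if r["admitted"] < test_start]
    test = [r for r in records if r["admitted"] >= test_start]
    return train, test


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = VTE, 0 = no VTE.

    sensitivity = TP / (TP + FN): share of true VTE cases flagged high risk.
    specificity = TN / (TN + FP): share of non-VTE cases flagged low risk.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

Reporting the two rates separately matters under class imbalance: when VTE cases are rare, a model that labels everyone low risk can reach high accuracy while having zero sensitivity.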

Results: The "fuzzy population" phenomenon was explained and verified on the VTE dataset. On the test data, the proposed model achieved higher specificity than the Padua model (64.94% vs. 63.30%) at the same sensitivity (90.24% vs. 90.24%). The other five ML models could not simultaneously surpass the Padua model's sensitivity and specificity. Our model was also more robust than the five ML models, with smaller standard deviations of sensitivity and specificity. Adjusting the distribution of negative samples in the training set based on the "fuzzy population" exacerbated the performance instability of the five ML models, which limits the application of ML methods in clinical practice.
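Robustness is reported here as the standard deviation of sensitivity and specificity. A minimal sketch of that summary is given below, assuming each model has been evaluated several times and each run yields one (sensitivity, specificity) pair. The abstract does not say how the repeated runs were produced, so the input format and the function name are hypothetical.

```python
from statistics import mean, stdev


def summarize_runs(runs):
    """Summarize repeated evaluations of one model.

    runs is a list of (sensitivity, specificity) pairs, one per run.
    Returns the mean and sample standard deviation of each metric;
    a smaller standard deviation indicates more stable performance.
    """
    if len(runs) < 2:
        raise ValueError("at least two runs are needed for a standard deviation")
    sensitivities = [sens for sens, _ in runs]
    specificities = [spec for _, spec in runs]
    return {
        "sensitivity_mean": mean(sensitivities),
        "sensitivity_sd": stdev(sensitivities),
        "specificity_mean": mean(specificities),
        "specificity_sd": stdev(specificities),
    }
```

Comparing two models then amounts to calling `summarize_runs` on each model's runs and checking which has the smaller `sensitivity_sd` and `specificity_sd`.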

Conclusions: The proposed model achieved higher specificity than the Padua model at the same sensitivity, and better robustness than traditional ML models. This study built a population-split-based ML model for VTE by modeling the class-imbalance problem, and the approach could be applied more broadly to risk assessment for other diseases.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11686901
DOI: http://dx.doi.org/10.1186/s12911-024-02834-3

Publication Analysis

Top Keywords

risk assessment: 16
model: 13
assessment model: 12
vte risk: 12
padua model: 12
sensitivity specificity: 12
venous thromboembolism: 8
class-imbalance problem: 8
june 2016: 8
"fuzzy population": 8
