Background: Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), emerged in Wuhan, China, in late 2019 and became a global pandemic that overwhelmed healthcare systems. As of July 3, 2021, the World Health Organization had reported 182 million confirmed cases and 3.9 million deaths globally. Several patients who were initially diagnosed with mild or moderate COVID-19 later deteriorated and were reclassified as having severe disease.

Objective: This study aims to build models that predict, early in the disease course, the need for ventilatory support and mortality in patients with COVID-19, using only baseline data (at the time of diagnosis) that are routinely collected for each patient: chest X-ray (CXR), complete blood count (CBC), demographics, and patient history.

Methods: Four common machine learning algorithms, three data balancing techniques, and feature selection were used to build and validate predictive models for COVID-19 mechanical ventilation requirement and mortality. Baseline CXR, CBC, demographic, and clinical data were retrospectively collected from April 2, 2020, to June 18, 2020, for 5739 patients with PCR-confirmed COVID-19 at King Abdulaziz Medical City in Riyadh. Of those patients, only 1508 and 1513 met the inclusion criteria for the ventilatory support and mortality endpoints, respectively.
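Random undersampling, one of the balancing techniques that appears in the results, can be sketched as follows. This is a minimal pure-Python illustration, not the study's implementation: the function name `random_undersample`, the toy data, and the fixed seed are assumptions made for the example, and the abstract does not state which library or settings were used.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly drop rows until every class has as many rows as the rarest class.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)

    # Size of the rarest class sets the target count for every class.
    n_min = min(len(idxs) for idxs in by_class.values())

    # Sample n_min rows from each class without replacement.
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))
    kept.sort()  # preserve the original row order

    return [X[i] for i in kept], [y[i] for i in kept]


# Toy imbalanced data (assumed): 10 positive rows, 90 negative rows.
X = [[i] for i in range(100)]
y = [1 if i < 10 else 0 for i in range(100)]

X_bal, y_bal = random_undersample(X, y)
class_counts = Counter(y_bal)
print(class_counts)
```

In a typical workflow, resampling of this kind is applied to the training split only, so that the test set keeps the original class distribution.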

Results: In an independent test set, the best model for ventilation requirement, a support vector machine trained with random undersampling on the top 20 features selected by the ReliefF algorithm from baseline radiological, laboratory, and clinical data, attained an AUC of 0.87 and a balanced accuracy of 0.81. For the mortality endpoint, the top model, a balanced random forest using all features, yielded an AUC of 0.83 and a balanced accuracy of 0.80. These results indicate that routinely collected data alone are sufficient to predict both outcomes with good performance. The combined data consistently outperformed each data set individually for intubation and mortality. For ventilatory support, chest X-ray severity annotations alone performed better than comorbidity, complete blood count, age, or gender, with an AUC of 0.85 and a balanced accuracy of 0.79. For mortality, comorbidity alone achieved an AUC of 0.80 and a balanced accuracy of 0.72, higher than models that used only chest radiograph, laboratory, or demographic features.
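The two metrics reported here, balanced accuracy and AUC, can be computed as in the sketch below. It is a pure-Python illustration under stated assumptions: the labels, scores, 0.5 decision threshold, and function names are invented for the example and are not data or code from the study. Balanced accuracy is the mean of sensitivity and specificity, and AUC is computed as the fraction of positive/negative pairs in which the positive case receives the higher score (ties count as half).

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy imbalanced example (assumed values): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.1, 0.2, 0.3, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

plain_acc = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(plain_acc, bal_acc, auc)
```

On this toy data, plain accuracy is 0.8 while balanced accuracy is 0.6875, because half of the positive cases are missed. This gap is why balanced accuracy is the more informative figure when the outcome of interest is rare.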

Conclusion: The experimental results demonstrate the practicality of the proposed COVID-19 predictive tool for hospital resource planning and patient prioritization during the current COVID-19 pandemic.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8331117
DOI: http://dx.doi.org/10.2147/JMDH.S322431
