The world was ambushed in 2019 by the COVID-19 virus, which affected the health, economy, and lifestyle of individuals worldwide. One way of combating such a public health concern is to use appropriate, rapid, and unbiased diagnostic tools for quick detection of infected people. However, the current dearth of bioinformatics tools necessitates modeling studies to help diagnose COVID-19 cases. Molecular-based methods for detecting COVID-19, such as the real-time reverse transcription polymerase chain reaction (rRT-PCR), are time-consuming and prone to contamination. Modern bioinformatics tools have made it possible to create large databases of protein sequences for various diseases, apply data mining techniques, and accurately diagnose diseases. However, the current sequence alignment tools that use these databases are not able to detect novel COVID-19 viral sequences because of high sequence dissimilarity. The objective of this study, therefore, was to develop models that can rapidly and accurately classify COVID-19 viral sequences using protein vectors generated by a neural word embedding technique. Five machine learning models, namely K-nearest neighbor regression (KNN), support vector machine (SVM), random forest (RF), linear discriminant analysis (LDA), and logistic regression, were developed using datasets from the National Center for Biotechnology Information (NCBI). Our results suggest that the RF model performed better than all other models, with 99% accuracy on the training dataset and 99.5% accuracy on the testing dataset. The implication of this study is that rapid detection of the COVID-19 virus in suspected cases could potentially save lives, as less time will be needed to ascertain the status of a patient.
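
The abstract does not spell out how the protein vectors are built, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes each protein sequence is split into overlapping 3-mers ("words"), each word is mapped to a vector, and the word vectors are averaged into one protein vector that a classifier can consume. The pseudo-random vectors stand in for trained neural word embeddings, the nearest-neighbor classifier stands in for the five models named above, and all sequences, labels, and function names are hypothetical placeholders.

```python
import math
import random
from collections import Counter


def kmers(sequence, k=3):
    """Split a protein sequence into overlapping k-mers ("words")."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_toy_embeddings(sequences, k=3, dim=8, seed=0):
    """Give every k-mer a fixed pseudo-random vector.

    ASSUMPTION: this is a stand-in for trained word embeddings;
    nothing is learned from the data here.
    """
    rng = random.Random(seed)
    vocab = sorted({word for seq in sequences for word in kmers(seq, k)})
    return {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for word in vocab}


def protein_vector(sequence, embeddings, k=3):
    """Average the vectors of all in-vocabulary k-mers of a sequence."""
    dim = len(next(iter(embeddings.values())))
    known = [embeddings[w] for w in kmers(sequence, k) if w in embeddings]
    if not known:
        return [0.0] * dim  # no known k-mers: fall back to the zero vector
    return [sum(column) / len(known) for column in zip(*known)]


def knn_predict(query, train_vectors, train_labels, n_neighbors=1):
    """Majority vote among the n_neighbors closest training vectors."""
    ranked = sorted(
        (math.dist(query, vec), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical toy sequences and labels, made up for illustration only.
train_sequences = [
    "MKVLAAGIVGLLLA",
    "MKVLSAGIVGLMLA",
    "GDSTPQRWYHNCEF",
    "GDSTPQKWYHNCEY",
]
train_labels = ["target", "target", "other", "other"]

embeddings = build_toy_embeddings(train_sequences)
train_vectors = [protein_vector(s, embeddings) for s in train_sequences]

query_sequence = "MKVLAAGIVGLMLA"  # hypothetical unseen sequence
prediction = knn_predict(
    protein_vector(query_sequence, embeddings), train_vectors, train_labels
)
print(prediction)
```

In a realistic setting the toy embeddings would be replaced by vectors learned from a large corpus of protein sequences, and the resulting protein vectors could be passed to any of the classifiers the abstract lists, with accuracy measured on a held-out test set.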

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9119569
DOI: http://dx.doi.org/10.1007/s41870-022-00949-2
