Prediction of virus-host protein-protein interactions (PPIs) is a broad research area in which various machine-learning-based classifiers have been developed. Transforming biological data into machine-usable features is a preliminary step in constructing these virus-host PPI prediction tools. In this study, we adopted a virus-host PPI dataset and a reduced amino acid alphabet to create tripeptide features, and we introduced a correlation coefficient-based feature selection approach. We applied feature selection using several correlation coefficient metrics and statistically tested their relevance in a structural context. We compared the performance of the feature-selection models against that of baseline virus-host PPI prediction models built with different classification algorithms without feature selection. We also compared these baseline models against previously available tools to ensure that their predictive power is acceptable. The Pearson coefficient gave the best performance relative to the baseline model as measured by the area under the precision-recall curve (AUPR): for random forest, AUPR dropped by 0.003 while the number of tripeptide features fell by 73.3% (from 686 to 183). The results suggest that our correlation coefficient-based feature selection approach reduces computation time and space complexity while having only a limited impact on the prediction performance of virus-host PPI prediction tools.
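The pipeline described in the abstract (reduced alphabet, tripeptide features, correlation-based filtering) can be illustrated with a minimal sketch. Several details here are assumptions rather than the authors' method: the seven-group alphabet is a conjoint-triad-style grouping chosen because 2 × 7³ = 686 matches the reported feature count, the filter drops a feature when it is strongly Pearson-correlated with an already kept feature, and the 0.9 threshold and all function names are hypothetical. The abstract does not state the grouping, the selection rule, or the cutoff.

```python
from itertools import product
from math import sqrt

# ASSUMPTION: a seven-group reduced amino acid alphabet (conjoint-triad style).
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_GROUP = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}
TRIPEPTIDES = ["".join(t) for t in product("0123456", repeat=3)]  # 7**3 = 343


def tripeptide_features(seq):
    """Return normalized tripeptide frequencies over the reduced alphabet."""
    reduced = [AA_TO_GROUP.get(aa) for aa in seq.upper()]
    counts = dict.fromkeys(TRIPEPTIDES, 0)
    for i in range(len(reduced) - 2):
        window = reduced[i:i + 3]
        if None not in window:  # skip windows containing unknown residues
            counts["".join(window)] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[t] / total for t in TRIPEPTIDES]


def pair_features(virus_seq, host_seq):
    """Concatenate virus and host vectors: 2 * 343 = 686 features."""
    return tripeptide_features(virus_seq) + tripeptide_features(host_seq)


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_features(X, threshold=0.9):
    """Greedy filter (ASSUMED rule): keep a column only if its absolute
    correlation with every previously kept column is below the threshold."""
    columns = list(zip(*X))
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept
```

Given a list of (virus, host) sequence pairs, one would build `X = [pair_features(v, h) for v, h in pairs]`, call `select_features(X)`, and train the classifier on the kept columns only. Other correlation metrics could be swapped in for `pearson` without changing the rest of the sketch.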


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10153705
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0285168


Similar Publications

Background: Sepsis is a severe complication in leukemia patients, contributing to high mortality rates. Identifying early predictors of sepsis is crucial for timely intervention. This study aimed to develop and validate a predictive model for sepsis risk in leukemia patients using machine learning techniques.


Background: Multidrug-resistant Klebsiella pneumoniae (MDR-KP) infections pose a significant global healthcare challenge, particularly due to the high mortality risk associated with septic shock. This study aimed to develop and validate a machine learning-based model to predict the risk of MDR-KP-associated septic shock, enabling early risk stratification and targeted interventions.

Methods: A retrospective analysis was conducted on 1,385 patients with MDR-KP infections admitted between January 2019 and June 2024.


A novel machine learning based framework for developing composite digital biomarkers of disease progression.

Front Digit Health

January 2025

Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ, United States.

Background: Current methods of measuring disease progression of neurodegenerative disorders, including Parkinson's disease (PD), largely rely on composite clinical rating scales, which are prone to subjective biases and lack the sensitivity to detect progression signals in a timely manner. Digital health technology (DHT)-derived measures offer potential solutions to provide objective, precise, and sensitive measures that address these limitations. However, the complexity of DHT datasets and the potential to derive numerous digital features that were not previously possible to measure pose challenges, including in selection of the most important digital features and construction of composite digital biomarkers.


Hippocampal Functional Radiomic Features for Identification of the Cognitively Impaired Patients from Low-Back-Related Pain: A Prospective Machine Learning Study.

J Pain Res

January 2025

Jiangxi Provincial Key Laboratory for Precision Pathology and Intelligent Diagnosis, Department of Radiology, the First Affiliated Hospital, Jiangxi Medical College, Nanchang University, Nanchang, 330006, People's Republic of China.

Purpose: To investigate whether functional radiomic features of the bilateral hippocampi can identify cognitively impaired patients among those with low-back-related leg pain (LBLP).

Patients And Methods: For this retrospective study, a total of 95 clinically definite LBLP patients (40 cognitively impaired patients and 45 cognitively preserved patients) were included, and all patients underwent functional MRI and clinical assessments. After calculating the amplitude of low-frequency fluctuations (ALFF), regional homogeneity (ReHo), voxel-mirrored homotopic connectivity (VMHC) and degree centrality (DC) imaging, the radiomic features (n = 819) of bilateral hippocampi were extracted from these images, respectively.


Introduction: As a hallmark feature of amyotrophic lateral sclerosis (ALS), bulbar involvement significantly impacts psychosocial, emotional, and physical health. However, a validated objective marker to characterize and phenotype bulbar involvement is lacking, which poses a major barrier to early detection, progress monitoring, and tailored care. This study aimed to bridge this gap by constructing a multiplex functional mandibular muscle network to provide a novel objective measurement tool for bulbar involvement.

