Prediction of 2-hydroxyisobutyrylation sites by integrating multiple sequence features with ensemble support vector machine.

Comput Biol Chem

College of Science, Shenyang Aerospace University, 110136, People's Republic of China.

Published: May 2020

Lysine 2-hydroxyisobutyrylation (Khib) is a new type of histone mark that has been found to affect the association between histones and DNA. To better understand the molecular mechanism of Khib, it is important to accurately identify 2-hydroxyisobutyrylated substrates and their corresponding Khib sites. In this study, a novel bioinformatics tool named KhibPred is proposed to predict Khib sites in human HeLa cells. Three kinds of features, the composition of k-spaced amino acid pairs, binary encoding, and amino acid factors, are combined to encode Khib sites. Moreover, an ensemble support vector machine is employed to address the class imbalance problem in the prediction. In 10-fold cross-validation, KhibPred achieves satisfactory performance, with an area under the receiver operating characteristic curve of 0.7937. Therefore, KhibPred can be a useful tool for predicting protein Khib sites. Feature analysis shows that the polarity factor features play significant roles in the prediction of Khib sites. The conclusions derived from this study might provide useful insights for in-depth investigation into the molecular mechanisms of Khib.
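
To make two of the encodings named in the abstract concrete, the following is a minimal Python sketch of the composition of k-spaced amino acid pairs and of binary (one-hot) encoding for a peptide window centered on a lysine. It is not the KhibPred implementation: the function names, the maximum gap `k_max`, and the handling of the padding symbol `X` are assumptions made for illustration, and the amino acid factor features are omitted because they require a published lookup table. The helper `balanced_subsets` shows one common way to prepare data for an ensemble of classifiers on imbalanced data (each ensemble member sees all positives and one chunk of negatives); whether KhibPred builds its ensemble this way is not stated in the abstract.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap(peptide, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns 400 * (k_max + 1) frequencies. Pairs containing a
    non-standard symbol (e.g. an assumed padding character 'X')
    are not counted.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_windows = len(peptide) - k - 1  # number of (i, i + k + 1) positions
        for i in range(n_windows):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        denom = max(n_windows, 1)  # avoid division by zero on short input
        features.extend(counts[p] / denom for p in PAIRS)
    return features


def binary_encoding(peptide):
    """One-hot encode each residue as 20 values (all zeros if unknown)."""
    features = []
    for residue in peptide:
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features


def balanced_subsets(positives, negatives):
    """Pair all positives with successive chunks of negatives.

    Each returned (positives, negative_chunk) tuple could train one
    member of an ensemble; this scheme is an assumption, not a detail
    taken from the abstract.
    """
    size = max(len(positives), 1)
    return [
        (positives, negatives[i:i + size])
        for i in range(0, len(negatives), size)
    ]
```

For a hypothetical 21-residue window `w`, `cksaap(w) + binary_encoding(w)` yields a single feature vector that could be passed to any support vector machine implementation; the per-subset predictions would then be combined, for example by averaging decision scores.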

Source
http://dx.doi.org/10.1016/j.compbiolchem.2020.107280
