Machine learning-based approach for prediction of ion channels and their subclasses.

J Cell Biochem

Department of Computer Science and Engineering, Kamla Nehru Institute of Technology, Sultanpur, Uttar Pradesh, India.

Published: January 2023

Ion channels are ion-permeable protein pores found in the lipid membranes of all cells. Distinct ion channels play multiple roles in biological processes. Proteomic data are accumulating rapidly owing to the fast growth of mass spectrometry, which makes it possible to explore ion channel classes and their subclasses comprehensively. This paper proposes an eXtreme Gradient Boosting (XGBoost)-based method to predict ion channel classes and their subclasses. Twelve feature descriptors are used to characterize protein sequences: amino acid composition, pseudo-amino acid composition, normalized Moreau-Broto autocorrelation, amphiphilic pseudo-amino acid composition, dipeptide composition, Geary autocorrelation, tripeptide composition, sequence-order-coupling number, composition/transition/distribution, conjoint triad, Moran autocorrelation, and quasi-sequence-order descriptors. In total, 9920 features are extracted from each protein sequence. Principal component analysis is applied to determine the optimal number of features. In 10-fold cross-validation, the proposed XGBoost-based approach with the optimal 50 features achieved accuracies of 100%, 98.70%, 98.77%, 97.26%, 87.40%, 97.39%, 98.03%, and 96.42%, and F1-scores of 100%, 99%, 99%, 97%, 87%, 97%, 98%, and 97%, for the prediction of ion channels and non-ion channels, voltage-gated and ligand-gated ion channels, subclasses of voltage-gated ion channels (VGICs), subclasses of ligand-gated ion channels (LGICs), subclasses of voltage-gated calcium channels (VGCCs), subclasses of voltage-gated potassium channels (VGKCs), subclasses of voltage-gated sodium channels (VGSCs), and subclasses of voltage-gated chloride channels, respectively.
The proposed approach is also compared with other classifiers, namely support vector machine (SVM), k-nearest neighbor, Gaussian Naïve Bayes, and random forest, and with two existing methods: an SVM with maximum relevance maximum distance, which reports accuracies of 86.6%, 83.7%, and 85.1% for ion channels, non-ion channels, and overall, respectively, and an SVM with a radial basis function kernel, which reports accuracies of 100%, 97%, and 99.9% for ion channels, non-ion channels, and overall, respectively.
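
The two simplest descriptors named in the abstract, amino acid composition and dipeptide composition, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and skipping non-standard residues is an assumption, since the abstract does not say how such residues are handled.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs: "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    # Assumption: non-standard symbols (e.g. X, B, Z) are ignored.
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    total = len(residues)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair in the sequence (400 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    valid_pairs = 0
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        # Assumption: pairs containing a non-standard symbol are skipped.
        if pair in counts:
            counts[pair] += 1
            valid_pairs += 1
    if valid_pairs == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[d] / valid_pairs for d in DIPEPTIDES]


def sequence_features(seq):
    """Concatenate both descriptors into one 420-dimensional vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

Calling `sequence_features` on a protein sequence yields 420 values (20 + 400). The remaining ten descriptors listed in the abstract would be concatenated in the same way to reach the full feature vector before dimensionality reduction.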

Source
http://dx.doi.org/10.1002/jcb.30343
