Protein-protein interactions (PPIs) play a crucial role in the biological processes of living organisms. Accurate PPI prediction is valuable for probing protein function, which can in turn aid the development of new and powerful therapies for disease prevention. Many experimental studies have investigated PPIs; however, in-vitro techniques are resource-intensive and time-consuming. Although various in-silico approaches for predicting PPIs have been developed in recent years, they can fall short in accuracy and false-positive rate. To overcome these shortcomings, we propose a novel approach, AE-LGBM, to predict PPIs more accurately. It employs a LightGBM classifier and uses an autoencoder, a type of artificial neural network, to efficiently produce lower-dimensional, discriminative, and noise-free features. We incorporate conjoint triad (CT) and Composition-Transition-Distribution (CTD) features into the AE-LGBM framework. Under ten-fold cross-validation, AE-LGBM obtains prediction accuracies of 98.7% and 95.4% on the Human and Yeast datasets, respectively. AE-LGBM was further evaluated on independent datasets and achieved accuracies of 100%, 100%, 99.9%, 99.3%, and 99.2% on E. coli, M. musculus, C. elegans, H. pylori, and H. sapiens, respectively. AE-LGBM also obtained the best accuracy when tested on three important PPI networks, namely the single-core network (CD9), the multiple-core network (the Ras/Raf/MEK/ERK pathway), and the cross-connection network (the Wnt network). The outstanding generalization ability of AE-LGBM makes it a versatile, efficient, and robust PPI prediction model.
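
To make the conjoint triad (CT) feature concrete, the following is a minimal sketch of how a CT descriptor is commonly computed. The abstract does not state the exact procedure, so everything here is an assumption: the seven-class amino-acid grouping is the one usually cited for conjoint triad encoding, the min-max scaling is a simplification, and the names `conjoint_triad` and `pair_features` are hypothetical. Each sequence becomes a 343-dimensional vector (7 x 7 x 7 possible class triads), and a protein pair is represented by concatenating the two vectors.

```python
from itertools import product

# Seven amino-acid classes commonly used for conjoint triad encoding.
# This grouping is an assumption; the abstract does not list it.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}

# Map every ordered triple of class indices to a position in the 343-dim vector.
TRIAD_INDEX = {triad: i for i, triad in enumerate(product(range(7), repeat=3))}


def conjoint_triad(sequence):
    """Return a 343-dim, min-max scaled triad frequency vector for one sequence."""
    counts = [0] * len(TRIAD_INDEX)
    classes = [AA_TO_CLASS.get(aa) for aa in sequence.upper()]
    # Slide a window of three residues along the sequence.
    for i in range(len(classes) - 2):
        window = classes[i:i + 3]
        if None in window:  # skip windows containing non-standard residues
            continue
        counts[TRIAD_INDEX[tuple(window)]] += 1
    lo, hi = min(counts), max(counts)
    if hi == lo:  # no valid triads, or all counts equal
        return [0.0] * len(counts)
    return [(c - lo) / (hi - lo) for c in counts]


def pair_features(seq_a, seq_b):
    """Concatenate the CT vectors of two proteins into one 686-dim pair vector."""
    return conjoint_triad(seq_a) + conjoint_triad(seq_b)
```

In a pipeline like the one described, vectors of this kind (together with CTD descriptors) would be compressed by the autoencoder before being passed to the LightGBM classifier; those stages are not sketched here because the abstract gives no architectural details.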
DOI: http://dx.doi.org/10.1016/j.compbiomed.2020.103964