Protein subcellular location prediction is an important research field in bioinformatics: it helps reveal protein functions and provides useful information for disease diagnosis and drug development. The task remains challenging because representative features and robust classifiers are hard to find. Feature fusion methods have been widely applied to address these issues, but they still lose accuracy because of feature redundancy. Predicting multiple subcellular locations is more complicated still, since it is fundamentally a multi-label classification problem for which traditional binary classifiers, and even multi-class classifiers, cannot achieve satisfactory results. This paper proposes a novel method, based on deep convolutional neural networks, for predicting protein subcellular locations at both single and multiple sites. Specifically, we first build integrated features by simultaneously considering pseudo amino acid, amino acid index distribution, and physicochemical property features. We then use deep convolutional neural networks to extract high-dimensional features from the fused feature, which removes redundant preliminary features and yields better representations of the raw sequences. We also use a self-attention mechanism and a customized loss function to bias the model toward positive data. In addition, we use random k-label sets to reduce the number of prediction labels, and we employ a hybrid strategy of over-sampling and under-sampling to tackle the data imbalance problem. We compare our model with three representative classification alternatives. The experimental results show that our model achieves the highest accuracy among the compared methods, demonstrating the efficacy of the proposed approach.
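The random k-label sets step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the paper draws or uses its label sets, so the code below only shows the general idea: sample small subsets of k labels and re-encode each protein's labels relative to each subset, so that each sub-problem involves only k labels. The function names, the location list, and the toy proteins are all hypothetical and not taken from the paper.

```python
import random
from itertools import combinations


def draw_labelsets(labels, k, m, seed=0):
    """Draw up to m distinct random k-subsets of the label list."""
    rng = random.Random(seed)
    all_subsets = list(combinations(sorted(labels), k))
    return rng.sample(all_subsets, min(m, len(all_subsets)))


def project(sample_labels, labelset):
    """Encode which labels of `labelset` a sample carries, as a tuple of 0/1."""
    return tuple(int(label in sample_labels) for label in labelset)


# Hypothetical toy data: each protein is annotated with one or more locations.
LOCATIONS = ["cytoplasm", "membrane", "mitochondrion", "nucleus"]
proteins = [
    {"nucleus"},
    {"cytoplasm", "nucleus"},
    {"membrane"},
    {"cytoplasm", "mitochondrion"},
]

# Three random 2-label sets; each defines a smaller classification target.
labelsets = draw_labelsets(LOCATIONS, k=2, m=3)
targets = {ls: [project(p, ls) for p in proteins] for ls in labelsets}

for ls, encoded in targets.items():
    print(ls, encoded)
```

In a full pipeline, one classifier would typically be trained per label set and the per-label predictions combined, for example by voting; that part is omitted here because the abstract gives no details about it.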


Source
http://dx.doi.org/10.1007/s12539-021-00496-7

