Classification and annotation of enzyme proteins are fundamental to research on biological metabolism. Enzyme Commission (EC) numbers provide a standard for hierarchical enzyme class prediction, for which several computational methods have been proposed. However, most of these methods depend on prior distribution information, and none explicitly quantifies amino-acid-level relations or the possible contribution of sub-sequences. In this study, we propose DAttProt, a double-scale attention model for enzyme class prediction with high reusability and interpretability. DAttProt encodes sequences with self-supervised Transformer encoders in pre-training and gathers local features with multi-scale convolutions in fine-tuning. Specifically, a probabilistic double-scale attention weight matrix is designed to aggregate multi-scale features and positional prediction scores. Finally, a fully connected linear classifier performs the final inference from the aggregated features and prediction scores. On the DEEPre and ECPred datasets, DAttProt is competitive with the compared methods on level 0 and outperforms them on deeper task levels, reaching 0.788 accuracy on level 2 of DEEPre and 0.967 macro-F1 on level 1 of ECPred. Moreover, through a case study, we demonstrate that the double-scale attention matrix learns to discover and focus on the positions and scales of bio-functional sub-sequences in the protein. DAttProt thus provides an effective and interpretable method for enzyme class prediction: it predicts enzyme protein classes accurately and also discovers enzymatic functional sub-sequences, such as protein motifs, at both positional and spatial scales.
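The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the DAttProt implementation: it assumes the attention weights are normalized jointly over every (scale, position) pair so that they form a probability distribution, which the abstract does not specify, and the function names `double_scale_attention` and `aggregate` are hypothetical.

```python
import math


def double_scale_attention(logits):
    """Turn a scales x positions grid of scores into weights that sum to 1.

    Assumption: one softmax over all (scale, position) pairs jointly.
    """
    peak = max(max(row) for row in logits)  # subtract the max for numerical stability
    exps = [[math.exp(value - peak) for value in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def aggregate(weights, features):
    """Weighted sum of feature vectors indexed as features[scale][position]."""
    dim = len(features[0][0])
    pooled = [0.0] * dim
    for weight_row, feature_row in zip(weights, features):
        for weight, vector in zip(weight_row, feature_row):
            for d in range(dim):
                pooled[d] += weight * vector[d]
    return pooled


# Toy input: 2 convolution scales, 3 sequence positions, 2 feature dimensions.
logits = [[0.0, 2.0, 0.0],
          [0.0, 0.0, 0.0]]
features = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
]

weights = double_scale_attention(logits)
pooled = aggregate(weights, features)
uniform = double_scale_attention([[0.0, 0.0], [0.0, 0.0]])
```

In this toy input the largest score sits at scale 0, position 1, so that entry receives the largest weight. Reading a trained weight grid in this way is how an attention matrix could point to the position and scale of a candidate functional sub-sequence.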

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9012241
DOI: http://dx.doi.org/10.3389/fgene.2022.885627

Publication Analysis

Top Keywords

double-scale attention: 16
class prediction: 16
enzyme class: 12
enzyme: 8
enzyme protein: 8
transformer encoders: 8
multi-scale convolutions: 8
prediction scores: 8
sub-sequences protein: 8
prediction: 6

Similar Publications

[Lung parenchyma segmentation based on double scale parallel attention network].

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi

August 2022

School of Electronic Information Engineering, Hebei University, Baoding, Hebei 071002, P. R. China.

Automatic and accurate segmentation of lung parenchyma is essential for assisted diagnosis of lung cancer. In recent years, researchers in the field of deep learning have proposed a number of improved lung parenchyma segmentation methods based on U-Net. However, the existing segmentation methods ignore the complementary fusion of semantic information in the feature map between different layers and fail to distinguish the importance of different spaces and channels in the feature map.
