Identification of efflux proteins based on contextual representations with deep bidirectional transformer encoders.

Anal Biochem

Department of Computer Science & Engineering, Yuan Ze University, Chungli, 32003, Taiwan.

Published: November 2021

Efflux proteins are transport proteins expressed in the plasma membrane that are involved in moving unwanted toxic substances through specific efflux pumps. Several computational approaches have been proposed to predict transport proteins and thereby to understand the mechanism by which ions move across cell membranes. However, few methods have been developed to identify efflux proteins. This paper presents an approach that combines contextualized word embeddings from Bidirectional Encoder Representations from Transformers (BERT) with a Support Vector Machine (SVM) classifier. BERT is a highly effective pre-trained language model that performs exceptionally well on several Natural Language Processing (NLP) tasks. Its contextualized representations were therefore used to capture the multiple interpretations that identical amino acids can take within a sequence. A dataset of annotated efflux proteins was first established. Feature vectors were then extracted by passing the protein data through the hidden layers of the pre-trained model. The proposed method was trained on the complete training datasets to identify efflux proteins and achieved accuracies of 94.15% and 87.13% in independent tests on the membrane and transport datasets, respectively. This study opens a research avenue for applying contextualized word embeddings in Bioinformatics and Computational Biology.
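The pipeline described in the abstract (treat residues as words, derive context-dependent vectors, pool them into one feature vector per protein, then classify with an SVM) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_contextual_vectors` is a hypothetical stand-in that only mimics context dependence and is not BERT, the linear SVM is trained with a Pegasos-style subgradient method and omits the bias term, and the example sequences and labels are invented. The abstract does not specify the pooling strategy, the SVM kernel, or the hyperparameters.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def tokenize(sequence):
    """Treat each residue as one 'word' of the protein 'sentence'."""
    return list(sequence.upper())

def toy_contextual_vectors(tokens, window=1):
    """Hypothetical stand-in for BERT hidden states (NOT BERT).

    Each residue is described by the composition of its local window,
    so identical amino acids get different vectors in different contexts.
    """
    vectors = []
    for i in range(len(tokens)):
        context = tokens[max(0, i - window): i + window + 1]
        vectors.append([context.count(a) / len(context) for a in AMINO_ACIDS])
    return vectors

def mean_pool(vectors):
    """Average per-residue vectors into one fixed-length feature vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def featurize(sequence):
    return mean_pool(toy_contextual_vectors(tokenize(sequence)))

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent on the hinge loss.

    Labels must be +1 or -1. No bias term is learned (a simplification).
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            scale = 1.0 - 1.0 / t          # equals 1 - eta * lam
            w = [scale * wj for wj in w]   # shrink (regularization step)
            if margin < 1:                 # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

def accuracy(w, X, y):
    return sum(predict(w, x) == label for x, label in zip(X, y)) / len(y)

# Invented sequences and labels, for illustration only (+1 = efflux, -1 = other).
train_seqs = ["LLVLAIGLLV", "VLLAGLLIVA", "KKDEERKDEK", "DEKRKEDDKE"]
train_labels = [1, 1, -1, -1]
X_train = [featurize(s) for s in train_seqs]
weights = train_linear_svm(X_train, train_labels)
print("training accuracy:", accuracy(weights, X_train, train_labels))
```

In a real setting the stand-in would be replaced by hidden-layer outputs from a pre-trained BERT model, and the classifier by a tuned SVM implementation; accuracy would be reported on held-out independent test sets, as in the abstract, rather than on the training data.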

Source
http://dx.doi.org/10.1016/j.ab.2021.114416
