This protocol describes a stepwise process for identifying proteins of interest in a query proteome derived from NGS data. We applied the protocol to a transcriptome to identify proteins involved in secondary metabolite biosynthesis, vitamin biosynthesis, and ion transport. The protocol is knowledge-driven: it combines a sensitive sequence search with evolutionary relationships. It relies on functionally important residues (FIRs) specific to the query protein family, identified from homologous sequences and the literature. Protein hits are screened by whether they cluster with true homologues in a reconstructed phylogenetic tree, complemented by FIR mapping. The protein hits were validated through qRT-PCR and transcriptome quantification. The protocol showed higher specificity than other methods, particularly in distinguishing cross-family hits. It was effective in transcriptome data analysis, as described in Pasha et al.

• Knowledge-driven protocol to identify secondary metabolite synthesizing proteins in a highly specific manner.
• Use of functionally important residues to screen for true hits.
• Beneficial for metabolite pathway reconstruction from NGS data of any kind (single species or metagenomic).
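The FIR-mapping screen described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`fir_conserved`, `screen_hits`), the toy sequences, and the FIR columns are all hypothetical. It assumes the hits have already been aligned to known homologues, that FIR positions are given as 0-based alignment columns with a set of allowed residues, and that the phylogenetic clustering step is handled separately.

```python
from typing import Dict, List, Set

# Hypothetical FIR definition: 0-based alignment column -> allowed residues.
# In practice these would come from homologous sequences and the literature.
FIR_COLUMNS: Dict[int, Set[str]] = {2: {"D"}, 5: {"H"}, 8: {"S", "T"}}

# Toy aligned hits (assumed to share one alignment coordinate system).
ALIGNED_HITS: Dict[str, str] = {
    "hit_A": "MKDLGHAASQ",  # all FIRs conserved
    "hit_B": "MKDLGHAATQ",  # conservative S/T substitution, still allowed
    "hit_C": "MKNLGHAASQ",  # D -> N at column 2, rejected
    "hit_D": "MKD-GH-A-Q",  # gap at column 8, rejected
}


def fir_conserved(aligned_seq: str, fir_columns: Dict[int, Set[str]]) -> bool:
    """Return True if every FIR column holds one of its allowed residues."""
    for column, allowed in fir_columns.items():
        # A sequence too short to cover the column, or carrying a gap or
        # a disallowed residue there, fails the screen.
        if column >= len(aligned_seq) or aligned_seq[column] not in allowed:
            return False
    return True


def screen_hits(
    aligned_hits: Dict[str, str], fir_columns: Dict[int, Set[str]]
) -> List[str]:
    """Keep the names of hits whose FIR columns are all conserved."""
    return [
        name
        for name, seq in aligned_hits.items()
        if fir_conserved(seq, fir_columns)
    ]


if __name__ == "__main__":
    print(screen_hits(ALIGNED_HITS, FIR_COLUMNS))
```

In this sketch a hit is retained only when all FIR columns match, which is a deliberately strict choice; a real screen might tolerate some substitutions or weight residues differently.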


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7528181
DOI: http://dx.doi.org/10.1016/j.mex.2020.101053
