GenDiS database update with improved approach and features to recognize homologous sequences of protein domain superfamilies.

Database (Oxford)

National Centre for Biological Sciences, Tata Institute of Fundamental Research (TIFR), Gandhi Krishi Vignana Kendra Campus, Bellary Road, Bangalore, Karnataka, India.

Published: January 2019

Since proteins evolve by divergent evolution, proteins with distant homology to each other may or may not bear similar functions. Improved computational approaches are required to recognize distant homologues that are functionally similar. One of the methods of assigning function to sequences is to use profiles derived from sequences of known structure. We describe an update of the Genomic Distribution of protein structural domain Superfamilies (GenDiS) database, namely GenDiS+, which provides a projection of SCOP superfamily members on the sequence space (NR database, NCBI). The sequences are validated using structure-based sequence alignment profiles and domain and full-length sequence alignments. GenDiS+ is a 'tour de force' for detecting homologues within around 160 000 taxonomic identifiers, starting from nearly 11 000 domains of known structure. The database provides features such as full-sequence alignment and phylogeny, domain sequence alignment and phylogeny, a list of associated structural and sequence domains with strength of interactions, links to databases like Pfam, UniProt and ModBase, and a list of sequences with a PDB structure.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446967
DOI: http://dx.doi.org/10.1093/database/baz042

Publication Analysis

Top Keywords

gendis database: 8
domain superfamilies: 8
sequence alignment: 8
alignment phylogeny: 8
sequences: 5
sequence: 5
database update: 4
update improved: 4
improved approach: 4
approach features: 4

Similar Publications

Background: The purpose of this study was to investigate the patterns in allegations of workplace discrimination filed by individuals with cancer. Objective: The goal of the research was to describe discrimination, both actual and perceived, that has occurred against individuals with cancer in comparison to a group of individuals with other general disabilities (GENDIS) through analyses of the United States Equal Employment Opportunity Commission (EEOC) Integrated Mission System (IMS) database. Methods: An ex post facto, causal comparative quantitative design was used to examine Americans with Disabilities Act Amendments Act (ADAAA) Title I complaints received by the EEOC from people with cancer from 2009 through 2016 (n = 10,848) in comparison to the GENDIS group over the same time period (n = 13,338).

Background: Odorant binding proteins (OBPs) are believed to shuttle odorants from the environment to the underlying odorant receptors, for which they could potentially serve as odorant presenters. Although several sequence-based search methods have been exploited for protein family prediction, less effort has been devoted to the prediction of OBPs from sequence data, an area made more challenging by the poor sequence identity between these proteins.

Results: In this paper, we propose a new algorithm that uses a Regularized Least Squares Classifier (RLSC) in conjunction with multiple physicochemical properties of amino acids to predict odorant-binding proteins.

GenDiS: Genomic Distribution of protein structural domain Superfamilies.

Nucleic Acids Res

January 2005

National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS-GKVK campus, Bellary Road, Bangalore 560 065, Karnataka, India.

Several proteins that have substantially diverged during evolution retain similar three-dimensional structures and biological function in spite of poor sequence identity. The database on Genomic Distribution of protein structural domain Superfamilies (GenDiS) provides a record of the distribution of 4001 protein domains organized as 1194 structural superfamilies across 18,997 genomes at various levels of hierarchy in taxonomy. The GenDiS database provides a survey of protein domains listed in sequence databases, employing a 3-fold sequence search approach.
