Position-specific scoring matrices (PSSMs) derived from multiple sequence alignments of protein families are commonly used to detect remote homologues. A PSSM is generated with one of the sequences in the multiple sequence alignment as the reference. We have shown recently that using multiple PSSMs for an alignment, each with a different sequence of the family as reference, dramatically improves the sensitivity of remote homology detection. MulPSSM contains PSSMs for a large number of sequence and structural families of protein domains, with multiple PSSMs for every family. The approach uses a clustering algorithm to identify the most distinct sequences in a family, and a PSSM is generated with each of these distinct sequences as reference. The current release of MulPSSM contains approximately 33,000 and approximately 38,000 PSSMs corresponding to 7868 sequence families and 2625 structural families, respectively. An RPS_BLAST interface allows sequence searches against the PSSMs of sequence families, structural families, or both. An analysis interface allows display and convenient navigation of alignments and domain hits. MulPSSM can be accessed at http://crick.mbu.iisc.ernet.in/~mulpssm.
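
The abstract does not say which clustering algorithm is used to pick the most distinct sequences, so the following is only a minimal sketch of the general idea, not the MulPSSM procedure. It assumes a greedy selection on pairwise identity; the function names, the `max_identity` threshold, and the toy alignment are all hypothetical.

```python
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def pick_distinct_references(alignment: List[str], max_identity: float = 0.4) -> List[int]:
    """Greedily keep rows whose identity to every kept row is below max_identity.

    Assumption: a simple stand-in for the unspecified clustering step.
    """
    chosen: List[int] = []
    for i, row in enumerate(alignment):
        if all(percent_identity(row, alignment[j]) < max_identity for j in chosen):
            chosen.append(i)
    return chosen


# Invented toy alignment: rows 0/1 are near-identical, as are rows 2/3.
toy_alignment = [
    "MKV-LAGT",
    "MKVALAGT",
    "QRS-WPDE",
    "QRSTWPDE",
]
references = pick_distinct_references(toy_alignment, max_identity=0.5)
print(references)  # indices of rows that would each serve as a PSSM reference
```

Each selected row would then be used as the reference sequence when building one PSSM from the same family alignment, giving several PSSMs per family.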


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347406
DOI: http://dx.doi.org/10.1093/nar/gkj043


Similar Publications

Identifying drug-target interactions (DTIs) is central to related areas such as drug discovery and drug repositioning. Although high-throughput biotechnologies have made tremendous progress, the indispensable clinical trials remain expensive, laborious, and intricate. Convenient and reliable computer-aided methods for inferring DTIs have therefore become a focus of research.


Genome sequencing projects reveal the sequences of all proteins encoded in a genome. As a first step, homology detection is used to obtain clues to the structure and function of these proteins. However, high evolutionary divergence between homologous proteins limits our ability to detect distant relationships.


Derivative-free neural network for optimizing the scoring functions associated with dynamic programming of pairwise-profile alignment.

Algorithms Mol Biol

February 2018

Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8579 Japan.

Background: A profile-comparison method with position-specific scoring matrix (PSSM) is among the most accurate alignment methods. Currently, cosine similarity and correlation coefficients are used as scoring functions of dynamic programming to calculate similarity between PSSMs. However, it is unclear whether these functions are optimal for profile alignment methods.
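
As a rough illustration of the two scoring functions named above, the sketch below computes cosine similarity and the Pearson correlation coefficient between two PSSM columns. It is not taken from the paper: the function names are hypothetical, and the toy columns have four entries for brevity where a real PSSM column would have one score per amino acid.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two score vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson_correlation(u: Sequence[float], v: Sequence[float]) -> float:
    """Pearson correlation, computed as the cosine similarity of mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    centred_u = [a - mean_u for a in u]
    centred_v = [b - mean_v for b in v]
    return cosine_similarity(centred_u, centred_v)


# Invented toy columns: col_b is col_a scaled by 2, col_c is col_a negated.
col_a = [2.0, -1.0, 0.0, 1.0]
col_b = [4.0, -2.0, 0.0, 2.0]
col_c = [-2.0, 1.0, 0.0, -1.0]

print(cosine_similarity(col_a, col_b), pearson_correlation(col_a, col_b))
print(cosine_similarity(col_a, col_c), pearson_correlation(col_a, col_c))
```

In a profile-profile dynamic programming alignment, a function of this kind would supply the match score for pairing column i of one PSSM with column j of the other.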


Query-seeded iterative sequence similarity searching improves selectivity 5-20-fold.

Nucleic Acids Res

April 2017

European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Iterative similarity search programs such as psiblast, jackhmmer, and psisearch are much more sensitive than pairwise similarity search methods such as blast and ssearch because they build a position-specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic of a protein family. These models are, however, subject to contamination: once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination.


Cyclin-dependent kinase 5 (Cdk5) is a proline-directed serine/threonine kinase predominantly active in the nervous system, where it regulates several processes such as neuronal migration, cytoskeletal dynamics, axonal guidance, and neurotransmission. We constructed a position-specific scoring matrix (PSSM) based on a dataset of sites shown to be phosphorylated both in vivo and in vitro by Cdk5. This dataset was curated manually through an exhaustive search of published experimental data.
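
The snippet does not describe how the PSSM was computed, so the following is only a generic sketch of building a log-odds PSSM from equal-length aligned sites. The uniform background, the pseudocount of 1, the function names, and the peptides are all assumptions made for illustration; the peptides are invented and are not Cdk5 data.

```python
import math
from collections import Counter
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sites: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Per-position log2-odds scores against a uniform background (an assumption)."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(sites) + pseudocount * len(AMINO_ACIDS)
    pssm: List[Dict[str, float]] = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_site(pssm: List[Dict[str, float]], peptide: str) -> float:
    """Sum the column scores of the residues in a candidate peptide."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


# Invented peptides with a Ser/Thr followed by Pro at the centre.
toy_sites = ["KSPRK", "ATPKK", "RSPAK", "GTPLR"]
pssm = build_pssm(toy_sites)
print(score_site(pssm, "KSPKK") > score_site(pssm, "GGGGG"))
```

A candidate site would be scored by sliding a window of the same length along a protein sequence and calling `score_site` on each window.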

