Island method for estimating the statistical significance of profile-profile alignment scores.

BMC Bioinformatics

Department of Computer Science, University of Northern Iowa, Cedar Falls, IA 50614, USA.

Published: April 2009

Background: In the last decade, a significant improvement in detecting remote similarity between protein sequences has been made by utilizing alignment profiles in place of amino-acid strings. Unfortunately, no analytical theory is available for estimating the significance of a gapped alignment of two profiles. Many experiments suggest that the distribution of local profile-profile alignment scores is of the Gumbel form. However, estimating distribution parameters by random simulations turns out to be computationally very expensive.
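As an illustration of what a Gumbel-form score distribution implies for significance estimates, the following minimal sketch converts an alignment score into an E-value and a P-value. The abstract gives no formulas, so the parameterization is an assumption: it uses the standard extreme-value form known from sequence-sequence alignment statistics, with scale `lam`, prefactor `K`, and profile lengths `m` and `n`. The function name is hypothetical.

```python
import math


def gumbel_pvalue(score, lam, K, m, n):
    """Return (E-value, P-value) for a local alignment score.

    Assumes the standard Gumbel (extreme-value) form:
        E = K * m * n * exp(-lam * score)
        P(S >= score) = 1 - exp(-E)
    lam, K: distribution parameters (assumed already estimated)
    m, n:   lengths of the two profiles being compared
    """
    e_value = K * m * n * math.exp(-lam * score)
    # -expm1(-E) equals 1 - exp(-E) but stays accurate when E is tiny.
    p_value = -math.expm1(-e_value)
    return e_value, p_value
```

Because the P-value depends exponentially on `lam`, a small error in the parameter estimate can shift the reported significance by orders of magnitude.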

Results: We demonstrate that the background distribution of profile-profile alignment scores depends heavily on the composition of the profiles, so the distribution parameters must be estimated independently for each pair of profiles of interest. We also show that accurate estimates of the statistical parameters can be obtained using the "island statistics" for profile-profile alignments.

Conclusion: The island statistics can be generalized to profile-profile alignments to provide an efficient method for alignment score normalization. Since multiple island scores can be extracted from a single comparison of two profiles, the island method has a clear speed advantage over the direct shuffling method for comparable accuracy in parameter estimates.
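The speed advantage comes from each comparison contributing many island peak scores rather than a single maximum. A minimal sketch of how such scores might be turned into parameter estimates is shown below. It assumes integer-valued scores and uses the island estimators commonly described for sequence-sequence alignments; the abstract does not state which estimators the authors apply to profiles. The function name, the `cutoff` threshold, and the `area` argument (total search space, the sum of m × n over all comparisons) are illustrative assumptions.

```python
import math


def island_estimates(island_scores, cutoff, area):
    """Estimate Gumbel parameters (lam, K) from island peak scores.

    Assumed estimators for integer scores (not taken from the abstract):
        lam = ln(1 + 1 / (mean(S >= cutoff) - cutoff))
        K   = N_cutoff * exp(lam * cutoff) / area
    island_scores: peak scores of all islands collected
    cutoff:        only islands scoring >= cutoff are used
    area:          total search space, sum of m * n over comparisons
    """
    tail = [s for s in island_scores if s >= cutoff]
    if not tail:
        raise ValueError("no island reaches the cutoff")
    mean_excess = sum(tail) / len(tail) - cutoff
    if mean_excess <= 0:
        raise ValueError("all islands sit exactly at the cutoff")
    # Maximum-likelihood scale for a geometric tail above the cutoff.
    lam = math.log(1.0 + 1.0 / mean_excess)
    # The island count above the cutoff is modeled as K * area * exp(-lam * cutoff).
    K = len(tail) * math.exp(lam * cutoff) / area
    return lam, K
```

The cutoff trades bias against variance: a higher cutoff keeps only islands in the asymptotic tail but leaves fewer scores to average over.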


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2678096
DOI: http://dx.doi.org/10.1186/1471-2105-10-112

Similar Publications

Context: Leishmaniasis is a group of vector-borne infectious diseases caused by over 20 pathogenic Leishmania species that are endemic in many tropical and subtropical countries. The emergence of drug-resistant strains, the adverse side effects of anti-Leishmania drugs, and the absence of a preventative vaccination strategy threaten the sensitive population. Recently, many groups of researchers have exploited the field of reverse vaccinology to develop vaccines, focusing chiefly on inducing immunity against either visceral or cutaneous leishmaniasis.


WMSA: a novel method for multiple sequence alignment of DNA sequences.

Bioinformatics

November 2022

School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China.

Motivation: Multiple sequence alignment (MSA) is a fundamental problem in bioinformatics. The quality of alignment will affect downstream analysis. MAFFT has adopted the Fast Fourier Transform method for searching the homologous segments and using them as anchors to divide the sequences, then making alignment only on segments, which can save time and memory without overly reducing the sequence alignment quality.


Application of cryo-electron microscopy (cryo-EM) is crucially important for ascertaining the atomic structure of large biomolecules such as ribosomes and protein complexes in membranes. Advances in cryo-EM technology and software have made it possible to obtain data with near-atomic resolution, but the method is still often capable of producing only a density map with up to medium resolution, either partially or entirely. Therefore, bridging the gap separating the density map and the atomic model is necessary.


Homology Modeling Using GPCRM Web Service.

Methods Mol Biol

July 2021

Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland.

Homology modeling methods are commonly used for quick and precise construction of a desired protein or its mutant using protein templates, which were determined by crystallography, cryo-EM, or NMR. Due to the increasing number of such structures, the obtained models are precise even in the case of small similarity between sequences of template and modeled proteins. The reason for that is a high evolutionary conservation in the structure regions responsible for keeping the function of proteins.


FAM161A is a microtubule-associated protein conserved widely across eukaryotes, which is mutated in the inherited blinding disease Retinitis Pigmentosa-28. FAM161A is also a centrosomal protein, being a core component of a complex that forms an internal skeleton of centrioles. Despite these observations about the importance of FAM161A, current techniques used to examine its sequence reveal no homologies to other proteins.

