Query-seeded iterative sequence similarity searching improves selectivity 5-20-fold.

Nucleic Acids Res

European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Published: April 2017

Iterative similarity search programs, such as psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods such as blast and ssearch because they build a position-specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic of a protein family. But these models are subject to contamination: once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity.
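
The query-seeding step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the query-based MSA is held as equal-length strings with one column per query residue and `-` marking a gap in a subject row. The function names `seed_gaps_with_query` and `percent_identity` are hypothetical. The sketch also fills every `-`, including unaligned ends, which is an assumption that the abstract does not address.

```python
GAP = "-"  # assumed gap symbol in the query-anchored alignment rows


def seed_gaps_with_query(query: str, aligned_subjects: list[str]) -> list[str]:
    """Fill gapped positions in each subject row with the query residue.

    Each row must have exactly one column per query residue, i.e. the rows
    come from a query-based MSA. Hypothetical helper for illustration only.
    """
    seeded = []
    for row in aligned_subjects:
        if len(row) != len(query):
            raise ValueError("each row needs one column per query residue")
        # Keep the subject residue where one is aligned; otherwise copy
        # the query residue from the same column.
        seeded.append("".join(q if s == GAP else s for q, s in zip(query, row)))
    return seeded


def percent_identity(query: str, row: str) -> float:
    """Percentage of columns in which the row matches the query residue."""
    matches = sum(1 for q, s in zip(query, row) if q == s)
    return 100.0 * matches / len(query)


if __name__ == "__main__":
    query = "MKTAYIAK"
    rows = ["MKT--IAK", "-RTAYLAK"]  # made-up rows, not data from the paper
    for before, after in zip(rows, seed_gaps_with_query(query, rows)):
        print(before, "->", after,
              f"({percent_identity(query, before):.1f}% -> "
              f"{percent_identity(query, after):.1f}% identity to query)")
```

In this toy setting, the seeded rows would be used in place of the original rows when computing PSSM column frequencies. Because gapped columns now hold query residues, each row's identity to the query can only stay the same or rise, consistent with the abstract's remark that the step slightly increases target percent identity.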

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5605230
DOI: http://dx.doi.org/10.1093/nar/gkw1207
