Clustered sequence representation for fast homology search.

J Comput Biol

School of Computer Science and Information Technology, RMIT University, Melbourne, Australia.

Published: June 2007

We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a set of corresponding edits to that sequence. During search, the query is compared to only the union-sequences representing each cluster; cluster members are then only reconstructed and aligned if the union-sequence achieves a sufficiently high score. Using this approach with BLAST results in a 27% reduction in collection size and a corresponding 22% decrease in search time with no significant change in accuracy. We also describe our method for clustering that uses fingerprinting, an approach that has been successfully applied to collections of text and web documents in Information Retrieval. Our clustering approach is ten times faster on the GenBank nonredundant protein database than the fastest existing approach, CD-HIT. We have integrated our approach into FSA-BLAST, our new Open Source version of BLAST (available from http://www.fsa-blast.org/). As a result, FSA-BLAST is twice as fast as NCBI-BLAST with no significant change in accuracy.
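The union-sequence idea and the two-stage search described above can be illustrated with a small sketch. It is not the FSA-BLAST implementation. It assumes cluster members of equal length, uses a hypothetical wildcard character at positions where members differ, and replaces BLAST scoring with a toy ungapped match count. The names `Cluster`, `build_cluster`, `reconstruct`, `toy_score`, and `search` are all introduced here for illustration only.

```python
from dataclasses import dataclass

WILDCARD = "*"  # hypothetical marker for positions where cluster members differ


@dataclass
class Cluster:
    union: str                   # representative union-sequence
    edits: list[dict[int, str]]  # one dict per member: position -> actual residue


def build_cluster(members: list[str]) -> Cluster:
    """Collapse equal-length, near-identical sequences into a union-sequence plus edits."""
    length = len(members[0])
    if any(len(m) != length for m in members):
        raise ValueError("this sketch assumes equal-length members")
    # Keep a residue where all members agree; otherwise emit a wildcard.
    union = "".join(
        col[0] if len(set(col)) == 1 else WILDCARD for col in zip(*members)
    )
    # Each member records its own residue at every wildcard position.
    edits = [
        {i: m[i] for i, u in enumerate(union) if u == WILDCARD} for m in members
    ]
    return Cluster(union, edits)


def reconstruct(cluster: Cluster, member_index: int) -> str:
    """Rebuild one member by applying its edits to the union-sequence."""
    chars = list(cluster.union)
    for pos, residue in cluster.edits[member_index].items():
        chars[pos] = residue
    return "".join(chars)


def toy_score(query: str, subject: str) -> int:
    """Toy ungapped score (assumption, not BLAST): count matches; a wildcard matches anything."""
    return sum(1 for q, s in zip(query, subject) if s == WILDCARD or q == s)


def search(query: str, clusters: list[Cluster], threshold: int) -> list[tuple[str, int]]:
    """Score union-sequences first; reconstruct members only for promising clusters."""
    hits = []
    for cluster in clusters:
        if toy_score(query, cluster.union) < threshold:
            continue  # the whole cluster is skipped without reconstructing members
        for i in range(len(cluster.edits)):
            member = reconstruct(cluster, i)
            score = toy_score(query, member)
            if score >= threshold:
                hits.append((member, score))
    return hits
```

In this toy scoring, a wildcard matches any residue, so the union-sequence score is never lower than the score of any member, and skipping a low-scoring cluster cannot discard a member that would have passed the threshold. Whether the same guarantee holds under real BLAST scoring depends on how wildcards are scored, which this sketch does not model.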

Source
http://dx.doi.org/10.1089/cmb.2007.R005
