muBLASTP: database-indexed protein sequence search on multicore CPUs.

BMC Bioinformatics

Department of Computer Science, Virginia Tech, 225 Stanger Street, Blacksburg, 24060, VA, USA.

Published: November 2016

Background: The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for the sequences most similar to a query sequence. Currently, the BLAST algorithm uses a query-indexed approach. Although many tools (e.g., BLAT, SSAHA, and CAFE) suggest that sequence search with a database index can achieve much higher throughput, they either cannot deliver the same level of sensitivity as query-indexed BLAST (i.e., NCBI BLAST) or support only nucleotide sequence search (e.g., MegaBLAST). Because query indexing and database indexing pose different challenges and have different characteristics, the existing techniques for query-indexed search cannot be applied directly to database-indexed search.
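
To make the distinction concrete, the following is a minimal sketch of database indexing in general, not muBLASTP's actual index structure. It assumes a word length of k = 3 (an illustrative choice) and matches words exactly, whereas BLASTP also considers similar "neighboring" words. The function names `build_database_index` and `find_seeds` and the toy sequences are hypothetical.

```python
from collections import defaultdict


def build_database_index(subjects, k=3):
    """Map each k-letter word to the (subject_id, offset) positions where it occurs.

    The index is built once over the database and reused for every query.
    """
    index = defaultdict(list)
    for sid, seq in enumerate(subjects):
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((sid, pos))
    return index


def find_seeds(query, index, k=3):
    """Return (subject_id, query_offset, subject_offset) for each exact word match.

    Simplification: only identical words are matched; no scoring matrix or
    neighboring-word threshold is applied.
    """
    seeds = []
    for qpos in range(len(query) - k + 1):
        for sid, spos in index.get(query[qpos:qpos + k], ()):
            seeds.append((sid, qpos, spos))
    return seeds


# Toy database of two protein fragments (made-up sequences).
subjects = ["MKVLAAGIV", "GGKVLTTAG"]
index = build_database_index(subjects)
seeds = find_seeds("AKVLA", index)
print(seeds)
```

A query-indexed search inverts this arrangement: the lookup table is built from the query's words, and every database sequence is scanned against it for each search. With a database index, the cost of building the table is paid once, at the price of keeping the index in memory.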

Results: muBLASTP, a novel database-indexed BLAST for protein sequence search, returns hits identical to those of NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, single-threaded muBLASTP achieves up to a 4.41-fold speedup for the alignment stages and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, multithreaded muBLASTP achieves up to a 5.7-fold speedup for the alignment stages and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST.

Conclusions: With a newly designed index structure for protein databases and associated optimizations to the BLASTP algorithm, we refactored BLASTP for modern multicore processors, achieving much higher throughput with an acceptable memory footprint for the database index.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5096327
DOI: http://dx.doi.org/10.1186/s12859-016-1302-4

Similar Publications

Background: Ferroptosis and immune responses are critical pathological events in spinal cord injury (SCI), but the related molecular and cellular mechanisms remain unclear.

Methods: Micro-array datasets (GSE45006, GSE69334), RNA sequencing (RNA-seq) dataset (GSE151371), spatial transcriptome datasets (GSE214349, GSE184369), and single cell RNA sequencing (scRNA-seq) datasets (GSE162610, GSE226286) were available from the Gene Expression Omnibus (GEO) database. Through weighted gene co-expression network analysis and differential expression analysis in GSE45006, we identified differentially expressed time- and immune-related genes (DETIRGs) associated with chronic SCI and differentially expressed ferroptosis- and immune-related genes (DEFIRGs), which were validated in GSE151371.

Osteogenic differentiation is crucial in normal bone formation and pathological calcification, such as calcific aortic valve disease (CAVD). Understanding the proteomic and transcriptomic landscapes underlying this differentiation can unveil potential therapeutic targets for CAVD. In this study, we employed RNA sequencing transcriptomics and proteomics on a timsTOF Pro platform to explore the multiomics profiles of valve interstitial cells (VICs) and osteoblasts during osteogenic differentiation.

DisGeNet: a disease-centric interaction database among diseases and various associated genes.

Database (Oxford)

January 2025

School of Computer Science and Technology, Xidian University, 266 Xinglong Section of Xifeng Road, Xi'an, Shaanxi 710126, China.

The pathogenesis of complex diseases is intricately linked to various genes, and network medicine has enhanced our understanding of diseases. However, most network-based approaches ignore interactions mediated by noncoding RNAs (ncRNAs), and most databases focus only on the association between genes and diseases. To address these gaps, we have developed DisGeNet, a database that focuses not only on disease-associated genes but also on the interactions among genes.

Fungal periprosthetic joint infections (PJIs) are rare but increasingly recognized complications following total joint arthroplasty (TJA). While Candida albicans remains the most common pathogen, non-albicans species and other fungi, such as , have gained prominence. These infections often present with subtle clinical features and affect patients with significant comorbidities or immunosuppression.

Channel-Hopping Sequence and Searching Algorithm for Rendezvous of Spectrum Sensing.

Sensors (Basel)

December 2024

Department of Computer Engineering, Ulsan College, Ulsan 44022, Republic of Korea.

In this paper, we propose a method for applying the -ary m-sequence as a channel-searching pattern for rendezvous in the asymmetric channel model of cognitive radio. We mathematically analyzed and calculated the ETTR when the m-sequence is applied to the conventional scheme, and our simulation results demonstrated that the ETTR performance is significantly better than that of the JS algorithm. Furthermore, we introduced a new channel-searching scheme that maximizes the benefits of the m-sequence and proposed a method to adapt the generation of the m-sequence for use in the newly proposed scheme.
