BLAST-QC: automated analysis of BLAST results.

Environ Microbiome

Department of Environmental Health Sciences, University of South Carolina, 921 Assembly Street, Columbia, SC, 29208, USA.

Published: August 2020

Background: The Basic Local Alignment Search Tool (BLAST) from NCBI is the preferred utility for sequence alignment and identification in bioinformatics and genomics research. Among researchers using NCBI's BLAST software, it is well known that analyzing the results of a large BLAST search can be tedious and time-consuming. Furthermore, given the recent discussions over the effects of parameters such as '-max_target_seqs' on the BLAST heuristic search process, the use of these search options is questionable. This leaves a stand-alone parser as one of the few options for condensing these large datasets, and with few parsers available for download online, researchers are left to write a specialized piece of software any time they need to analyze BLAST results. The need for a streamlined and fast script that solves these issues and can be easily integrated into a variety of bioinformatics and genomics workflows was the initial motivation for developing this software.

Results: In this study, we demonstrate the effectiveness of BLAST-QC for analysis of BLAST results and its advantages over the other available options. Applying genetic sequence data from our bioinformatic workflows, we establish BLAST-QC's superior runtime when compared to existing parsers developed with commonly used BioPerl and BioPython modules, as well as C and Java implementations of the BLAST-QC program. We discuss the '-max_target_seqs' parameter, its usage and the controversy surrounding it, and offer a solution by demonstrating the ability of our software to provide the functionality this parameter was assumed to produce, as well as a variety of other parsing options. Executions of the script on example datasets are given, demonstrating the implemented functionality and providing test cases for the program. BLAST-QC is designed to be integrated into existing software, and we establish its effectiveness as a module of workflows or other processes.
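The kind of post-search filtering described above can be illustrated with a minimal Python sketch. This is not the BLAST-QC source code: the names `Hit`, `parse_tabular` and `top_hits` are hypothetical, the input is assumed to be BLAST tabular output with the twelve default columns (`-outfmt 6`), and ranking hits by bit score with E-value as a tie-breaker is an assumption about what "the N best hits per query" means.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple, Optional


class Hit(NamedTuple):
    """One row of default BLAST tabular output (hypothetical container)."""
    query: str
    subject: str
    pident: float
    evalue: float
    bitscore: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield hits from tab-separated lines with the 12 default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore. Blank lines and '#' comment lines are skipped."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        yield Hit(fields[0], fields[1], float(fields[2]),
                  float(fields[10]), float(fields[11]))


def top_hits(lines: Iterable[str], n: int = 1,
             max_evalue: Optional[float] = None) -> Dict[str, List[Hit]]:
    """Keep at most n hits per query, best first.

    Assumption: 'best' means highest bit score, then lowest E-value.
    Hits with an E-value above max_evalue are dropped before ranking.
    """
    by_query: Dict[str, List[Hit]] = defaultdict(list)
    for hit in parse_tabular(lines):
        if max_evalue is not None and hit.evalue > max_evalue:
            continue
        by_query[hit.query].append(hit)
    return {
        query: sorted(hits, key=lambda h: (-h.bitscore, h.evalue))[:n]
        for query, hits in by_query.items()
    }


# Made-up example rows, for illustration only.
EXAMPLE = "\n".join([
    "q1\tsA\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t360.0",
    "q1\tsB\t99.0\t200\t2\t0\t1\t200\t1\t200\t1e-105\t372.0",
    "q1\tsC\t80.0\t150\t30\t0\t1\t150\t1\t150\t1e-20\t95.0",
    "q2\tsD\t75.0\t100\t25\t0\t1\t100\t1\t100\t0.5\t30.0",
])

for query, hits in top_hits(EXAMPLE.splitlines(), n=2, max_evalue=1e-5).items():
    print(query, [h.subject for h in hits])
```

Because the selection happens after the search has finished, every hit BLAST reported is considered before the list is truncated, which is the behaviour users commonly expected from '-max_target_seqs'. In this sketch a query whose hits all fail the E-value threshold simply does not appear in the result.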

Conclusions: BLAST-QC provides the community with a simple, lightweight and portable Python script that allows for easy quality control of BLAST results while avoiding the drawbacks of other options. These drawbacks include the uncertain results of applying the '-max_target_seqs' parameter and the cumbersome dependencies of alternatives such as BioPerl and Java, which add complexity and run time when processing large sequence datasets. BLAST-QC is ideal for use in the high-throughput workflows and pipelines common in bioinformatic and genomic research, and the script has been designed for portability and easy integration into whatever processes the user may be running.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8066848
DOI: http://dx.doi.org/10.1186/s40793-020-00361-y
