Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data.

J Med Signals Sens

Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.

Published: January 2021

Background: Careful design in the primary steps of a next-generation sequencing study is critical for obtaining successful results in downstream analysis.

Methods: This study proposes a framework to evaluate and improve sequence mapping in targeted regions of the reference genome. Simulated short reads were produced from the coding regions of the human genome and mapped to a Customized Target-Based Reference (CTBR) using recently introduced alignment tools. The short reads produced by different sequencing technologies, with and without well-defined mutation types, were aligned to both the standard genome and the CTBR, and the numbers of unmapped and misaligned reads and the runtime were measured for comparison.
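The abstract does not say how the CTBR is assembled. As a minimal sketch, assuming it amounts to extracting each targeted interval (optionally with some flanking bases) from the standard reference into its own record, the idea could look like the following. The function name `build_target_reference`, the `flank` parameter, and the toy sequences are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# (chromosome, 0-based start, end-exclusive), as in BED-style intervals
Interval = Tuple[str, int, int]


def build_target_reference(
    genome: Dict[str, str],
    targets: List[Interval],
    flank: int = 0,
) -> Dict[str, str]:
    """Cut each target interval (plus optional flank) out of the genome.

    Assumption: one output record per target, named "chrom:start-end"
    using the padded coordinates, so hits can be lifted back later.
    """
    records: Dict[str, str] = {}
    for chrom, start, end in targets:
        seq = genome[chrom]
        lo = max(0, start - flank)          # clamp at the chromosome start
        hi = min(len(seq), end + flank)     # clamp at the chromosome end
        records[f"{chrom}:{lo}-{hi}"] = seq[lo:hi]
    return records


# Toy data standing in for a real reference and a real target list.
toy_genome = {"chrA": "ACGTACGTACGTACGTACGT", "chrB": "TTTTGGGGCCCCAAAA"}
toy_targets = [("chrA", 4, 8), ("chrB", 4, 12)]
ctbr = build_target_reference(toy_genome, toy_targets, flank=2)
```

Keeping the original coordinates in each record name is one possible design choice: it lets positions reported against the reduced reference be translated back to whole-genome coordinates before variant calling.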

Results: Mapping accuracy was significantly better for reads generated from Illumina HiSeq 2500 and aligned with Stampy against the CTBR than for the other evaluated pipelines. Using the CTBR for alignment significantly decreased the mapping error compared with more expanded or more limited references. When intentional mutations were introduced into the reads, Stampy showed the minimum error of 1.67% using the CTBR, whereas the lowest errors obtained by Stampy using the whole genome and a single chromosome as references were 3.78% and 20%, respectively. The maximum and minimum misalignment errors were observed on chromosomes Y and 20, respectively.
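Error percentages of this kind can be computed for simulated reads because the true origin of every read is known. The sketch below is one way to do that and is not the authors' code: it assumes the reported error combines unmapped and misaligned reads, and that a read counts as misaligned when it lands on the wrong sequence or more than `tolerance` bases from its true start. The function `mapping_error` and the toy read IDs are hypothetical.

```python
from typing import Dict, Optional, Tuple

Position = Tuple[str, int]  # (reference name, 0-based start)


def mapping_error(
    truth: Dict[str, Position],
    aligned: Dict[str, Optional[Position]],
    tolerance: int = 0,
) -> Dict[str, float]:
    """Return unmapped, misaligned, and combined error as percentages."""
    total = len(truth)
    if total == 0:
        raise ValueError("truth set is empty")
    unmapped = 0
    misaligned = 0
    for read_id, (true_ref, true_pos) in truth.items():
        hit = aligned.get(read_id)  # None means the aligner reported no hit
        if hit is None:
            unmapped += 1
            continue
        ref, pos = hit
        if ref != true_ref or abs(pos - true_pos) > tolerance:
            misaligned += 1
    return {
        "unmapped_pct": 100.0 * unmapped / total,
        "misaligned_pct": 100.0 * misaligned / total,
        "error_pct": 100.0 * (unmapped + misaligned) / total,
    }


# Toy example: four simulated reads, one unmapped and one misplaced.
toy_truth = {"r1": ("chr20", 100), "r2": ("chr20", 250),
             "r3": ("chrY", 40), "r4": ("chrY", 900)}
toy_aligned = {"r1": ("chr20", 100), "r2": ("chr20", 252),
               "r3": None, "r4": ("chrX", 900)}
stats = mapping_error(toy_truth, toy_aligned, tolerance=5)
```

Running the same function per chromosome and per reference (CTBR, whole genome, single chromosome) would give a comparison table of the kind the results summarize.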

Conclusion: Using the proposed framework in a clinical targeted sequencing study may therefore help predict the error and improve variant-calling performance for the genomic regions targeted in that study.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8043119
DOI: http://dx.doi.org/10.4103/jmss.JMSS_7_20
