SUGAR: graphical user interface-based data refiner for high-throughput DNA sequencing.

BMC Genomics

Department of Integrative Genomics, Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku Sendai, Miyagi, 980-8573, Japan.

Published: August 2014

AI Article Synopsis

  • Next-generation sequencers (NGSs) are crucial for modern biology, but low-quality data from technical errors can hinder useful insights.
  • A software tool called SUGAR (subtile-based GUI-assisted refiner) lets users interactively analyze and clean high-throughput sequencing data through a user-friendly interface, generating quality heatmaps to identify issues.
  • By improving sequence read mapping and enabling detailed evaluation of data, SUGAR enhances variant analysis, particularly in samples with technical errors such as those found in cancer studies.

Article Abstract

Background: Next-generation sequencers (NGSs) have become one of the main tools for current biology. To obtain useful insights from the NGS data, it is essential to control low-quality portions of the data affected by technical errors such as air bubbles in sequencing fluidics.

Results: We developed SUGAR (subtile-based GUI-assisted refiner), software that handles ultra-high-throughput data through a user-friendly graphical user interface (GUI) with interactive analysis capability. SUGAR generates high-resolution quality heatmaps of the flowcell, enabling users to find possible signals of technical errors that occurred during sequencing. Sequencing data generated from error-affected regions of a flowcell can be selectively removed by automated analysis or by GUI-assisted operations implemented in SUGAR. The automated data-cleaning function, which is based on sequence read quality (Phred) scores, was applied to public human whole-genome sequencing data, and we showed that the overall mapping quality improved.
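
The subtile idea described above can be illustrated with a minimal Python sketch. It is not SUGAR's implementation: it assumes Illumina read names in the Casava 1.8+ layout (`@instrument:run:flowcell:lane:tile:x:y ...`), Phred+33 quality encoding, and two hypothetical parameters, `bin_size` (subtile edge length in flowcell coordinate units) and `threshold` (minimum mean Phred score), neither of which is given in the abstract. The helper names are likewise illustrative. For reference, a Phred score Q corresponds to an error probability P through Q = -10 log10(P).

```python
from collections import defaultdict
from statistics import mean


def parse_header(header):
    """Return (lane, tile, x, y) from an Illumina read name.

    Assumes the Casava 1.8+ layout:
    @instrument:run:flowcell:lane:tile:x:y [read:filtered:control:index]
    """
    fields = header.lstrip("@").split()[0].split(":")
    lane, tile, x, y = (int(v) for v in fields[3:7])
    return lane, tile, x, y


def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in quality_line]


def subtile_key(header, bin_size):
    """Map a read to its subtile: (lane, tile, x bin, y bin)."""
    lane, tile, x, y = parse_header(header)
    return lane, tile, x // bin_size, y // bin_size


def subtile_mean_quality(records, bin_size=1000):
    """Average the per-read mean Phred score within each subtile.

    `records` is an iterable of (header, quality_line) pairs.
    The resulting dict is the data behind a quality heatmap.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, read count]
    for header, quality_line in records:
        acc = sums[subtile_key(header, bin_size)]
        acc[0] += mean(phred_scores(quality_line))
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def flag_low_quality(subtile_quality, threshold=20.0):
    """Return the subtiles whose mean quality falls below the threshold."""
    return {key for key, q in subtile_quality.items() if q < threshold}


def remove_flagged_reads(records, bad_subtiles, bin_size=1000):
    """Keep only reads that do not originate from a flagged subtile."""
    return [
        (header, quality_line)
        for header, quality_line in records
        if subtile_key(header, bin_size) not in bad_subtiles
    ]
```

In this sketch, the dictionary returned by `subtile_mean_quality` plays the role of the heatmap data, `flag_low_quality` stands in for the automated detection step, and `remove_flagged_reads` stands in for the selective removal of reads from error-affected regions. The same `bin_size` must be used in all three calls.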

Conclusion: The detailed data evaluation and cleaning enabled by SUGAR would reduce technical problems in sequence read mapping, improving subsequent variant analyses that require high-quality sequence data and mapping results. The software will therefore be especially useful for controlling the quality of variant calls for low-population cells (e.g., cancer cells) in samples affected by technical errors in the sequencing procedure.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4133631
DOI: http://dx.doi.org/10.1186/1471-2164-15-664
