NucBreak: location of structural errors in a genome assembly by using paired-end Illumina reads.

BMC Bioinformatics

Biomedical Informatics Research Group, Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316, Oslo, Norway.

Published: February 2020

Background: Advances in whole genome sequencing strategies have provided the opportunity for genomic and comparative genomic analysis of a vast variety of organisms. The analysis results are highly dependent on the quality of the genome assemblies used. Assessment of the assembly accuracy may significantly increase the reliability of the analysis results and is therefore of great importance.

Results: Here, we present a new tool called NucBreak aimed at localising structural errors in assemblies, including insertions, deletions, duplications, inversions, and different inter- and intra-chromosomal rearrangements. Existing alternative tools analyse reads that do not map properly to the assembly, for instance discordantly mapped reads, soft-clipped reads and singletons. NucBreak takes a different approach to localising the errors: it analyses the alignments of reads that are properly mapped to the assembly and exploits information about alternative read alignments. It does not annotate detected errors. We have compared NucBreak with other existing assembly accuracy assessment tools, namely Pilon, REAPR, and FRCbam, as well as with several structural variant (SV) detection tools, including BreakDancer, Lumpy, and Wham, using both simulated and real datasets.
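The read categories named above can be told apart from the FLAG and CIGAR fields of a SAM record. The following is a minimal sketch of that distinction only; it is not NucBreak's algorithm, and the function name `classify_read`, the category labels, and the order of the checks are assumptions made for illustration. The FLAG bit values are those defined in the SAM format specification.

```python
# SAM FLAG bits as defined in the SAM format specification.
PAIRED = 0x1         # read is part of a pair
PROPER_PAIR = 0x2    # aligner considers the pair properly aligned
UNMAPPED = 0x4       # the read itself is unmapped
MATE_UNMAPPED = 0x8  # the mate is unmapped


def classify_read(flag: int, cigar: str) -> str:
    """Assign one alignment record to an illustrative category.

    The precedence of the checks is an assumption of this sketch.
    """
    if flag & UNMAPPED:
        return "unmapped"
    if flag & PAIRED and flag & MATE_UNMAPPED:
        # Read is mapped but its mate is not.
        return "singleton"
    if "S" in cigar:
        # CIGAR operation S marks soft-clipped bases.
        return "soft_clipped"
    if flag & PAIRED and not flag & PROPER_PAIR:
        # Both mates mapped, but not flagged as a proper pair.
        return "discordant"
    return "properly_mapped"


if __name__ == "__main__":
    for flag, cigar in [(99, "100M"), (97, "100M"), (73, "100M"), (99, "20S80M")]:
        print(flag, cigar, classify_read(flag, cigar))
```

In these terms, the existing tools described above draw their signal from the `discordant`, `soft_clipped` and `singleton` categories, whereas NucBreak works from reads in the `properly_mapped` category.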

Conclusions: The benchmarking results show that NucBreak generally predicts assembly errors of different types and sizes with relatively high sensitivity and a lower false discovery rate than the other tools. This balance between sensitivity and false discovery rate makes NucBreak a good alternative to the existing assembly accuracy assessment tools and SV detection tools. NucBreak is freely available at https://github.com/uio-bmi/NucBreak under the MPL license.
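For reference, the two benchmark metrics named above have standard definitions: sensitivity is TP / (TP + FN) and false discovery rate is FP / (TP + FP). The sketch below computes both from counts; the function names and the example counts are hypothetical, and the abstract does not state how a prediction is matched to a true error, so that step is left out.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real errors that were detected: TP / (TP + FN)."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Fraction of predictions that were wrong: FP / (TP + FP)."""
    predicted = true_positives + false_positives
    return false_positives / predicted if predicted else 0.0


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    tp, fp, fn = 80, 20, 20
    print("sensitivity:", sensitivity(tp, fn))
    print("FDR:", false_discovery_rate(tp, fp))
```

A tool can raise sensitivity by reporting more candidate sites, but that usually raises the false discovery rate as well, which is why the two are reported together.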

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7035700
DOI: http://dx.doi.org/10.1186/s12859-020-3414-0