With advances in next-generation sequencing, adapters attached to reads and low-quality bases directly and implicitly hinder downstream analysis. For example, they can produce false-positive single nucleotide polymorphisms (SNPs) and generate fragmented assemblies. There is a need for a fast trimming algorithm to remove adapters precisely, especially in read tails with relatively low quality. Here, we present Atria, a trimming program that matches the adapters in paired reads and finds possible overlapped regions using a fast and carefully designed byte-based matching algorithm (O(n) time with O(1) space). Atria also implements multi-threading in both sequence processing and file compression and supports single-end reads. Compared with other trimmers, Atria performs favorably in various trimming and runtime benchmarks of both simulated and real data. We also provide a fast and lightweight byte-based matching algorithm, which can be used in various short-sequence matching applications, such as primer search and seed scanning before alignment.
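To make the idea of byte-based adapter matching concrete, the following is a minimal sketch, not Atria's actual implementation. It assumes a one-hot 4-bit code per base, a 16-base probe packed into one 64-bit word, and a simple mismatch threshold; the names `CODE`, `K`, `pack`, `find_adapter`, `max_mismatches`, and `min_overlap` are all hypothetical. Each read position costs a constant number of word operations, so the scan is linear in read length and uses constant extra space.

```python
# Hypothetical sketch of bit-packed adapter matching (not Atria's code).
# One-hot 4-bit codes: the AND of two codes has one set bit only if the bases are equal.
CODE = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000}
K = 16  # bases per 64-bit word (assumed probe length)


def pack(seq):
    """Pack the first K bases of seq into an integer, first base in the lowest nibble.
    Unknown bases (e.g. N) are encoded as 0 and therefore never match."""
    word = 0
    for i, base in enumerate(seq[:K]):
        word |= CODE.get(base, 0) << (4 * i)
    return word


def find_adapter(read, adapter, max_mismatches=2, min_overlap=6):
    """Return the leftmost read position where the adapter's first K bases match
    with at most max_mismatches mismatches, or None if there is no such position.
    Partial matches at the read tail are allowed down to min_overlap bases."""
    probe_len = min(len(adapter), K)
    probe = pack(adapter)
    window = pack(read)  # holds read[pos:pos + K] as pos advances
    for pos in range(len(read)):
        overlap = min(probe_len, len(read) - pos)
        if overlap < min_overlap:
            break  # remaining tail is too short to call a match
        # Popcount of the AND equals the number of identical bases in the overlap.
        matches = bin(window & probe).count("1")
        if overlap - matches <= max_mismatches:
            return pos
        # Slide by one base: drop the lowest nibble, append the next base on top.
        window >>= 4
        nxt = pos + K
        if nxt < len(read):
            window |= CODE.get(read[nxt], 0) << (4 * (K - 1))
    return None


# Illustrative adapter sequence, used here only as example input.
adapter = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"
read = "T" * 20 + adapter[:20]
trim_at = find_adapter(read, adapter)
trimmed = read if trim_at is None else read[:trim_at]
```

In this sketch, the trimmed read keeps only the bases before the detected adapter start. A real trimmer would also weigh base qualities and, for paired reads, the overlap between mates, which this example leaves out.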
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10038132
DOI: http://dx.doi.org/10.46471/gigabyte.31
GigaByte
October 2021
Canadian Food Inspection Agency, Charlottetown, PE C1A5T1, Canada.