Background: Pacific Biosciences HiFi read technology is currently the industry standard for high-accuracy long-read sequencing and has been widely adopted by large sequencing and assembly initiatives to generate de novo assemblies of non-model organisms. Although adapter contamination filtering is routine in traditional short-read analysis pipelines, it has not been widely adopted for HiFi workflows.

Results: Analysis of 55 publicly available HiFi datasets revealed that a read-sanitation step is necessary to remove sequence artifacts derived from PacBio library preparation from read pools, because adapter sequences can otherwise be erroneously integrated into assemblies.

Conclusions: Here we describe the nature of adapter-contaminated reads and their consequences for assembly, and present HiFiAdapterFilt, a simple and memory-efficient solution for removing adapter-contaminated reads prior to assembly.
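
The read-sanitation idea described above can be illustrated with a minimal Python sketch. It is not the HiFiAdapterFilt implementation, whose detection method is not described in this abstract. It assumes exact substring matching against a single adapter on both strands, which is swapped in here for simplicity; a real workflow would likely need error-tolerant, alignment-based matching. The adapter string `PLACEHOLDER_ADAPTER` and the function names are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, Tuple

# Placeholder sequence for illustration only (assumption):
# this is NOT a real PacBio adapter sequence.
PLACEHOLDER_ADAPTER = "ACGTTGCAAGGCTTACCGAT"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_contaminated(read: str, adapter: str = PLACEHOLDER_ADAPTER) -> bool:
    """True if the adapter occurs exactly in the read on either strand."""
    read = read.upper()
    adapter = adapter.upper()
    return adapter in read or reverse_complement(adapter) in read


def filter_reads(
    reads: Iterable[Tuple[str, str]],
    adapter: str = PLACEHOLDER_ADAPTER,
) -> Iterator[Tuple[str, str]]:
    """Yield only the (name, sequence) pairs without an adapter match.

    Reads are streamed one at a time, so memory use does not grow
    with the size of the read pool.
    """
    for name, seq in reads:
        if not is_contaminated(seq, adapter):
            yield name, seq


if __name__ == "__main__":
    demo_reads = [
        ("clean", "TTTTGGGGCCCCAAAA" * 3),
        ("fwd", "TTTT" + PLACEHOLDER_ADAPTER + "GGGG"),
        ("rev", "CCCC" + reverse_complement(PLACEHOLDER_ADAPTER) + "AAAA"),
    ]
    for name, _ in filter_reads(demo_reads):
        print(name)
```

In this sketch, contaminated reads are discarded whole rather than trimmed, matching the abstract's description of removing adapter-contaminated reads before assembly.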

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8864876
DOI: http://dx.doi.org/10.1186/s12864-022-08375-1