Full-length analysis of genes with highly repetitive sequences is challenging in two respects: the assembly algorithm and sequencing accuracy. The de Bruijn graph commonly used in short-read assembly cannot distinguish adjacent repeat units, while long reads are not yet accurate enough to resolve every individual repeat unit. In this chapter, I present an example strategy that addresses both problems and recovers the full length of long repeats: repeat units are extracted and assembled with an overlap-layout-consensus approach, and the assembled units are then scaffolded using long reads.
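To illustrate why a de Bruijn graph struggles with tandem repeats, the following is a minimal toy sketch, not the chapter's method. The sequences `LEFT`, `UNIT`, and `RIGHT` and the helper `de_bruijn_edges` are hypothetical names chosen for this example. It assumes a simplified graph that records only which k-mers are present; real assemblers also track k-mer coverage, which this sketch ignores.

```python
def de_bruijn_edges(seq, k):
    """Return the set of de Bruijn edges for seq.

    Each distinct k-mer contributes one edge from its (k-1)-mer prefix
    to its (k-1)-mer suffix. Multiplicity is deliberately discarded.
    """
    return {(seq[i:i + k - 1], seq[i + 1:i + k])
            for i in range(len(seq) - k + 1)}


# Hypothetical toy data: an 8-bp repeat unit between two unique flanks.
LEFT, UNIT, RIGHT = "GGGTTT", "ACGTTGCA", "CCCAAA"
three_copies = LEFT + UNIT * 3 + RIGHT
five_copies = LEFT + UNIT * 5 + RIGHT

# k shorter than the repeat unit: every k-mer lies within a flank/unit
# or unit/unit junction, so the copy number leaves no trace in the graph.
same_short_k = de_bruijn_edges(three_copies, 5) == de_bruijn_edges(five_copies, 5)

# k longer than the whole three-copy array: some k-mers now span the array
# from flank to flank, so the two graphs differ.
same_long_k = de_bruijn_edges(three_copies, 30) == de_bruijn_edges(five_copies, 30)

print("identical graphs at k=5: ", same_short_k)
print("identical graphs at k=30:", same_long_k)
```

In this toy case the three-copy and five-copy sequences give the same edge set at k=5, so the graph alone cannot tell how many repeat units are present. Only a k-mer long enough to span the whole array separates them, and short reads generally cannot supply one for long repeats.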
Source: http://dx.doi.org/10.1007/978-1-0716-2996-3_10 (DOI)