Most sequencing data analyses start by aligning sequencing reads to a linear reference genome, but failing to account for genetic variation introduces reference bias and confounds downstream results. Other approaches replace the linear reference with structures such as graphs that can include genetic variation, but these incur major computational overhead. We propose reference flow, an alignment method that uses multiple population reference genomes to improve alignment accuracy and reduce reference bias. Compared to the graph aligner vg, reference flow achieves a similar level of accuracy and bias avoidance but with 14% of the memory footprint and 5.5 times the speed.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7780692
DOI: http://dx.doi.org/10.1186/s13059-020-02229-3
