The release of 150,119 UK Biobank sequences represents an unprecedented opportunity: used as a reference panel, they allow low-coverage whole-genome sequencing data to be imputed with high accuracy. However, current methods cannot cope with the size of the data. Here we introduce GLIMPSE2, a low-coverage whole-genome sequencing imputation method that scales sublinearly in both the number of samples and markers, achieving efficient whole-genome imputation from the UK Biobank reference panel while retaining high accuracy for ancient and modern genomes, particularly at rare variants and for very low-coverage samples.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10335927
DOI: http://dx.doi.org/10.1038/s41588-023-01438-3
