Despite their importance in disease and evolution, highly identical segmental duplications (SDs) are among the last regions of the human reference genome (GRCh38) to be fully sequenced. Using a complete telomere-to-telomere human genome (T2T-CHM13), we present a comprehensive view of human SD organization. SDs account for nearly one-third of the additional sequence, increasing the genome-wide estimate from 5.4 to 7.0% [218 million base pairs (Mbp)]. An analysis of 268 human genomes shows that 91% of the previously unresolved T2T-CHM13 SD sequence (68.3 Mbp) better represents human copy number variation. Comparing long-read assemblies from human (n = 12) and nonhuman primate (n = 5) genomes, we systematically reconstruct the evolution and structural haplotype diversity of biomedically relevant and duplicated genes. This analysis reveals patterns of structural heterozygosity and evolutionary differences in SD organization between humans and other primates.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8979283
DOI: http://dx.doi.org/10.1126/science.abj6965
