Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), and 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
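The 22% and 74% figures can be checked against the counts quoted in the abstract. The sketch below is only a sanity check, not the authors' analysis: the variable names are illustrative, the 12.3 Mb total is the rounded value from the abstract, and it assumes that each SNP contributes exactly one variant base. Segmental duplications and copy number variation regions are left out because the abstract gives no counts for them.

```python
# Variant counts as quoted in the abstract (names are illustrative).
variant_counts = {
    "snp": 3_213_401,
    "block_substitution": 53_823,
    "heterozygous_indel": 292_102,
    "homozygous_indel": 559_473,
    "inversion": 90,
}

# "12.3 Mb" of variant sequence, a rounded figure from the abstract.
TOTAL_VARIANT_BASES = 12_300_000

total_events = sum(variant_counts.values())
non_snp_events = total_events - variant_counts["snp"]
non_snp_event_share = non_snp_events / total_events

# Assumption: every SNP affects exactly one base, so the remaining
# variant bases belong to non-SNP events.
non_snp_base_share = 1 - variant_counts["snp"] / TOTAL_VARIANT_BASES

print(f"Total events:          {total_events:,}")
print(f"Non-SNP events:        {non_snp_events:,}")
print(f"Non-SNP share, events: {non_snp_event_share:.1%}")
print(f"Non-SNP share, bases:  {non_snp_base_share:.1%}")
```

Under these assumptions the listed categories sum to 4,118,889 events, consistent with "more than 4.1 million", and both shares round to the percentages reported above.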

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1964779
DOI: http://dx.doi.org/10.1371/journal.pbio.0050254

Similar Publications

Chromosome-scale genome assembly of three-spotted seahorse (Hippocampus trimaculatus) with a unique karyotype.

Sci Data

January 2025

Laboratory of Aquatic Genomics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, 518057, China.

The three-spotted seahorse (Hippocampus trimaculatus) is a unique fish with important economic and medicinal value, and its total chromosome number is potentially quite different from that of other seahorse species. Herein, we constructed a chromosome-level genome assembly for this special seahorse by integrating MGI short-read, PacBio HiFi long-read and Hi-C sequencing techniques. A 416.

Background and Aims: The cosmopolitan Botrychium lunaria group belongs to the most species-rich genus of the family Ophioglossaceae and was considered to consist of two species until molecular studies in North America and northern Europe led to the recognition of multiple new taxa. Recently, additional genetic lineages were found scattered across Europe, emphasizing our poor understanding of the global diversity of the B. lunaria group, while the processes involved in the diversification of the group remain unexplored.

Stinging nettles () have a long history of association with human civilization, having been used as a source of textile fibers, food and medicine. Here, we present a chromosome-level, phased genome assembly for a diploid female clone of from Romania. Using a combination of PacBio HiFi, Oxford Nanopore, and Illumina sequencing, as well as Hi-C long-range interaction data (using a novel Hi-C protocol presented here), we assembled two haplotypes of 574.

Evolutionary dynamics of mitochondrial genomes and intracellular transfers among diploid and allopolyploid cotton species.

BMC Biol

January 2025

Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China.

Background: Plant mitochondrial genomes (mitogenomes) exhibit extensive structural variation yet extremely low nucleotide mutation rates, phenomena that remain only partially understood. The genus Gossypium, a globally important source of cotton, offers a wealth of long-read sequencing resources to explore mitogenome and plastome variation and dynamics accompanying the evolutionary divergence of its approximately 50 diploid and allopolyploid species.

Results: Here, we assembled 19 mitogenomes from Gossypium species, representing all genome groups (diploids A through G, K, and the allopolyploids AD) based on a uniformly applied strategy.

Background: Drought stress is a significant global challenge that negatively impacts cotton fiber yield and quality. Although many drought-stress responsive genes have been identified in cotton species (Gossypium spp.), the diversity of drought response mechanisms across cotton species remains largely unexplored.
