Robertsonian chromosomes are a type of variant chromosome found commonly in nature. Present in one in 800 humans, these chromosomes can underlie infertility, trisomies, and increased cancer incidence. Recognized cytogenetically for more than a century, their origins have remained mysterious.
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding.
After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist. Here we present a human genome assembly that surpasses the continuity of GRCh38, along with a gapless, telomere-to-telomere assembly of a human chromosome.
Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common health care-associated infections nearly impossible to treat. To determine the diversity of carbapenemase-encoding plasmids and assess their mobility among bacterial species, we performed comprehensive surveillance and genomic sequencing of carbapenem-resistant Enterobacteriaceae in the National Institutes of Health (NIH) Clinical Center patient population and hospital environment. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species.
Although a considerable proportion of serum lipids loci identified in European ancestry individuals (EA) replicate in African Americans (AA), interethnic differences in the distribution of serum lipids suggest that some genetic determinants differ by ethnicity. We conducted a comprehensive evaluation of five lipid candidate genes to identify variants with ethnicity-specific effects. We sequenced ABCA1, LCAT, LPL, PON1, and SERPINE1 in 48 AA individuals with extreme serum lipid concentrations (high HDLC/low TG or low HDLC/high TG).
Background: While Staphylococcus epidermidis is commonly isolated from healthy human skin, it is also the most frequent cause of nosocomial infections on indwelling medical devices. Despite its importance, few genome sequences existed and the most frequent hospital-associated lineage, ST2, had not been fully sequenced.
Results: We cultivated 71 commensal S.
Ciliary dysfunction leads to a broad range of overlapping phenotypes, collectively termed ciliopathies. This grouping is underscored by genetic overlap, where causal genes can also contribute modifier alleles to clinically distinct disorders. Here we show that mutations in TTC21B, which encodes the retrograde intraflagellar transport protein IFT139, cause both isolated nephronophthisis and syndromic Jeune asphyxiating thoracic dystrophy.
Background: The approaches for shotgun-based sequencing of vertebrate genomes are now well-established, and have resulted in the generation of numerous draft whole-genome sequence assemblies. In contrast, the process of refining those assemblies to improve contiguity and increase accuracy (known as 'sequence finishing') remains tedious, labor-intensive, and expensive. As a result, the vast majority of vertebrate genome sequences generated to date remain at a draft stage.
Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis.
ClinSeq is a pilot project to investigate the use of whole-genome sequencing as a tool for clinical research. By piloting the acquisition of large amounts of DNA sequence data from individual human subjects, we are fostering the development of hypothesis-generating approaches for performing research in genomic medicine, including the exploration of issues related to the genetic architecture of disease, implementation of genomic technology, informed consent, disclosure of genetic information, and archiving, analyzing, and displaying sequence data. In the initial phase of ClinSeq, we are enrolling roughly 1000 participants; the evaluation of each includes obtaining a detailed family and medical history, as well as a clinical evaluation.
Human skin is a large, heterogeneous organ that protects the body from pathogens while sustaining microorganisms that influence human health and disease. Our analysis of 16S ribosomal RNA gene sequences obtained from 20 distinct skin sites of healthy humans revealed that physiologically comparable sites harbor similar bacterial communities. The complexity and stability of the microbial community are dependent on the specific characteristics of the skin site.
The many layers and structures of the skin serve as elaborate hosts to microbes, including a diversity of commensal and pathogenic bacteria that contribute to both human health and disease. To determine the complexity and identity of the microbes inhabiting the skin, we sequenced bacterial 16S small-subunit ribosomal RNA genes isolated from the inner elbow of five healthy human subjects. This analysis revealed 113 operational taxonomic units (OTUs; "phylotypes") at the level of 97% similarity that belong to six bacterial divisions.
It is understood that DNA and amino acid substitution rates are highly sequence context-dependent: for example, C → T substitutions in vertebrates may occur much more frequently at CpG sites, and cysteine substitution rates may depend on whether the context supports participation in a disulfide bond.
A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy).
Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X.
Proc Natl Acad Sci U S A
January 2006
Identification of the specific cytogenetic abnormality is one of the critical steps in the classification of acute myeloblastic leukemia (AML), which influences the selection of appropriate therapy and provides information about disease prognosis. However, at present, the genetic complexity of AML is only partially understood. To obtain a comprehensive, unbiased, quantitative measure, we performed serial analysis of gene expression (SAGE) on CD15(+) myeloid progenitor cells from 22 AML patients who had four of the most common translocations, namely t(8;21), t(15;17), t(9;11), and inv(16).
The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project.
Although the cost of generating draft-quality genomic sequence continues to decline, refining that sequence by the process of "sequence finishing" remains expensive. Near-perfect finished sequence is an appropriate goal for the human genome and a small set of reference genomes; however, such a high-quality product cannot be cost-justified for large numbers of additional genomes, at least for the foreseeable future. Here we describe the generation and quality of an intermediate grade of finished genomic sequence (termed comparative-grade finished sequence), which is tailored for use in multispecies sequence comparisons.
The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome.
Analysis of the human genome sequence has identified approximately 25000-30000 protein-coding genes, but little is known about how most of these are regulated. Mapping DNase I hypersensitive (HS) sites has traditionally represented the gold-standard experimental method for identifying regulatory elements, but the labor-intensive nature of this technique has limited its application to only a small number of human genes. We have developed a protocol to generate a genome-wide library of gene regulatory sequences by cloning DNase HS sites.
Nucleic Acids Res
January 2004
Hembase (http://hembase.niddk.nih.