Here, we characterize the DNA methylation phenotypes of bone marrow cells from mice with hematopoietic deficiency of Dnmt3a or Dnmt3b (or both enzymes) or expressing the dominant-negative Dnmt3a R878H mutation [R882H in humans; the most common mutation found in acute myeloid leukemia (AML)]. Using these cells as substrates, we defined DNA remethylation after overexpressing wild-type (WT) DNMT3A1, DNMT3B1, DNMT3B3 (an inactive splice isoform of DNMT3B), or DNMT3L (a catalytically inactive "chaperone" for DNMT3A and DNMT3B in early embryogenesis). Overexpression of for 2 weeks reverses the hypomethylation phenotype of Dnmt3a-deficient cells or cells expressing the R878H mutation.
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels.
Purpose: Persistent molecular disease (PMD) after induction chemotherapy predicts relapse in AML. In this study, we used whole-exome sequencing (WES) and targeted error-corrected sequencing to assess the frequency and mutational patterns of PMD in 30 patients with AML.
Materials and Methods: The study cohort included 30 adult patients with AML, all younger than 65 years, who were uniformly treated with standard induction chemotherapy.
The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina.
Mutations in the gene encoding DNA methyltransferase 3A (DNMT3A) are the most common cause of clonal hematopoiesis and are among the most common initiating events of acute myeloid leukemia (AML). Studies in germline and somatic Dnmt3a knockout mice have identified focal, canonical hypomethylation phenotypes in hematopoietic cells; however, the kinetics of methylation loss following acquired DNMT3A inactivation in hematopoietic cells are essentially unknown. Therefore, we evaluated a somatic, inducible model of hematopoietic Dnmt3a loss, and show that inactivation of Dnmt3a in murine hematopoietic cells results in a relatively slow loss of methylation at canonical sites throughout the genome; in contrast, remethylation of Dnmt3a-deficient genomes in hematopoietic cells occurs much more quickly.
Germline pathogenic variants in DNMT3A were recently described in patients with overgrowth, obesity, behavioral, and learning difficulties (DNMT3A Overgrowth Syndrome/DOS). Somatic mutations in the DNMT3A gene are also the most common cause of clonal hematopoiesis, and can initiate acute myeloid leukemia (AML). Using whole genome bisulfite sequencing, we studied DNA methylation in peripheral blood cells of 11 DOS patients and found a focal, canonical hypomethylation phenotype, which is most severe with the dominant negative DNMT3A mutation.
The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals.
Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci.
A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes.
An Amendment to this paper has been published and can be accessed via a link at the top of the paper.
Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits.
Summary: Large-scale human genetics studies are now employing whole genome sequencing with the goal of conducting comprehensive trait mapping analyses of all forms of genome variation. However, methods for structural variation (SV) analysis have lagged far behind those for smaller scale variants, and there is an urgent need to develop more efficient tools that scale to the size of human populations. Here, we present a fast and highly scalable software toolkit (svtools) and cloud-based pipeline for assembling high quality SV maps (including deletions, duplications, mobile element insertions, inversions and other rearrangements) in many thousands of human genomes.
Recurrent genomic mutations in uterine and non-uterine leiomyosarcomas have not been well established. Using a next generation sequencing (NGS) panel of common cancer-associated genes, 25 leiomyosarcomas arising from multiple sites were examined to explore genetic alterations, including single nucleotide variants (SNV), small insertions/deletions (indels), and copy number alterations (CNA). Sequencing showed 86 non-synonymous, coding region somatic variants within 151 gene targets in 21 cases, with a mean of 4.
Summary: Here we present SVScore, a tool for in silico structural variation (SV) impact prediction. SVScore aggregates per-base single nucleotide polymorphism (SNP) pathogenicity scores across relevant genomic intervals for each SV in a manner that considers variant type, gene features and positional uncertainty. We show that the allele frequency spectrum of high-scoring SVs is strongly skewed toward lower frequencies, suggesting that they are under purifying selection, and that SVScore identifies deleterious variants more effectively than alternative methods.
Quality assurance for clinical next-generation sequencing (NGS)-based assays is difficult given the complex methods and the range of sequence variants such assays can detect. As the number and range of mutations detected by clinical NGS assays have increased, it has become difficult to apply standard analyte-specific proficiency testing (PT). Most current proficiency testing challenges for NGS are methods-based PT surveys that use DNA from reference samples engineered to harbor specific mutations that test both sequence generation and bioinformatics analysis.
Context: Most current proficiency testing challenges for next-generation sequencing assays are methods-based proficiency testing surveys that use DNA from characterized reference samples to test both the wet-bench and bioinformatics/dry-bench aspects of the tests. Methods-based proficiency testing surveys are limited by the number and types of mutations that either are naturally present or can be introduced into a single DNA sample.
Objective: To address these limitations by exploring a model of in silico proficiency testing in which sequence data from a single well-characterized specimen are manipulated electronically.
Background: The Long Life Family Study (LLFS) is an international study to identify the genetic components of various healthy aging phenotypes. We hypothesized that pedigree-specific rare variants at longevity-associated genes could have a similar functional impact on healthy phenotypes.
Methods: We performed multiplexed, custom hybridization capture sequencing to identify the functional variants in 464 candidate genes for longevity or the major diseases of aging in 615 pedigrees (4,953 individuals) from the LLFS.
Objectives: To evaluate the extent of human-to-human specimen contamination in clinical next-generation sequencing (NGS) data.
Methods: Using haplotype analysis to detect specimen admixture, with orthogonal validation by short tandem repeat analysis, we determined the rate of clinically significant (>5%) DNA contamination in clinical NGS data from 296 consecutive cases. Haplotype analysis was performed using read haplotypes at common, closely spaced single-nucleotide polymorphisms in low linkage disequilibrium in the population, which were present in regions targeted by the clinical assay.
Background: T-cell receptor (TCR) clonality assessment is a principal diagnostic test in the management of mycosis fungoides (MF). However, current polymerase chain reaction-based methods may produce ambiguous results, often because of low abundance of clonal T lymphocytes, resulting in weak clonal peaks that cannot be size-resolved by contemporary capillary electrophoresis (CE).
Objective: We sought to determine if next-generation sequencing (NGS)-based detection has increased sensitivity for T-cell clonality over CE-based detection in MF.
Purpose: Malignant peripheral nerve sheath tumors (MPNST) occur at increased frequency in individuals with neurofibromatosis type 1 (NF1), where they likely arise from benign plexiform neurofibroma precursors. While previous studies have used a variety of discovery approaches to identify genes associated with MPNST pathogenesis, it is currently unclear what molecular events are associated with the evolution of MPNST from plexiform neurofibroma.
Experimental Design: Whole-exome sequencing was performed on biopsy materials representing plexiform neurofibroma (n = 3), MPNST, and metastasis from a single individual with NF1 over a 14-year period.
With the advent of large-scale genomic analysis, the genetic landscape of glioblastoma (GBM) has become clearer, including characteristic genetic alterations in EGFR. In routine clinical practice, genetic alterations in GBMs are identified using several disparate techniques that consume already limited amounts of tissue and add to overall testing costs. In this study, we sought to determine if the full spectrum of EGFR mutations in GBMs could be detected using a single next-generation sequencing (NGS)-based oncology assay in 34 consecutive cases.
African Americans are admixed with genetic contributions from European and African ancestral populations. Admixture mapping leverages this information to map genes influencing differential disease risk across populations. We performed admixture and association mapping in 3,300 African American current or former smokers from the COPDGene Study.
Targeted next-generation sequencing (NGS) cancer panels have become a popular method for the identification of clinically predictive mutations in cancer. Such methods typically detect single nucleotide variants (SNVs) and small insertions/deletions (indels) in known cancer genes and can provide further information regarding diagnosis in challenging surgical pathology cases, as well as identify therapeutic targets and prognostically significant mutations. However, in addition to SNVs and indels, other mutation classes, including copy number variants (CNVs) and translocations, can be simultaneously detected from targeted NGS data.