Somatic mosaicism is an important cause of disease, but mosaic and somatic variants are often challenging to detect because they exist in only a fraction of cells. To address the need for benchmarking subclonal variants in normal cell populations, we developed a benchmark containing mosaic variants in the Genome in a Bottle Consortium (GIAB) HG002 reference material DNA from a large batch of a normal lymphoblastoid cell line. First, we used a somatic variant caller with high coverage (300x) Illumina whole genome sequencing data from the Ashkenazi Jewish trio to detect variants in HG002 not detected in at least 5% of cells from the combined parental data.
Whole genome sequencing has been an effective tool in the discovery of variants that cause rare diseases. In this study, we determined the suitability of a novel avidity sequencing approach for rare disease applications. We built a sample to results workflow, combining this sequencing technology with standard library preparation kits, analysis workflows, and interpretation tools.
Maintenance of astronaut health during spaceflight will require monitoring and potentially modulating their microbiomes. However, documenting microbial shifts during spaceflight has been difficult due to mission constraints that lead to limited sampling and profiling. Here we executed a six-month longitudinal study to quantify the high-resolution human microbiome response to three days in orbit for four individuals.
Spaceflight induces molecular, cellular and physiological shifts in astronauts and poses myriad biomedical challenges to the human body, which are becoming increasingly relevant as more humans venture into space. Yet current frameworks for aerospace medicine are nascent and lag far behind advancements in precision medicine on Earth, underscoring the need for rapid development of space medicine databases, tools and protocols. Here we present the Space Omics and Medical Atlas (SOMA), an integrated data and sample repository for clinical, cellular and multi-omic research profiles from a diverse range of missions, including the NASA Twins Study, JAXA CFE study, SpaceX Inspiration4 crew, Axiom and Polaris.
Motivation: The acquisition of somatic mutations in hematopoietic stem and progenitor stem cells with resultant clonal expansion, termed clonal hematopoiesis (CH), is associated with increased risk of hematologic malignancies and other adverse outcomes. CH is generally present at low allelic fractions, but clonal expansion and acquisition of additional mutations lead to hematologic cancers in a small proportion of individuals. With high depth and high sensitivity sequencing, CH can be detected in most adults and its clonal trajectory mapped over time.
The genome is organized in functional compartments and structural domains at the sub-megabase scale. How interactions between numerous cis-acting enhancers and promoters within these domains regulate transcription remains an open question. Here, we determined chromatin folding and composition over several hundred kb around estrogen-responsive genes in human breast cancer cell lines after hormone stimulation.
Maintenance of astronaut health during spaceflight will require monitoring and potentially modulating their microbiomes, which play a role in some space-derived health disorders. However, documenting the response of microbiota to spaceflight has been difficult thus far due to mission constraints that lead to limited sampling. Here, we executed a six-month longitudinal study centered on a three-day flight to quantify the high-resolution microbiome response to spaceflight.
We present avidity sequencing, a sequencing chemistry that separately optimizes the processes of stepping along a DNA template and that of identifying each nucleotide within the template. Nucleotide identification uses multivalent nucleotide ligands on dye-labeled cores to form polymerase-polymer-nucleotide complexes bound to clonal copies of DNA targets. These polymer-nucleotide substrates, termed avidites, decrease the required concentration of reporting nucleotides from micromolar to nanomolar and yield negligible dissociation rates.
While many genetic diseases have effective treatments, they frequently progress rapidly to severe morbidity or mortality if those treatments are not implemented immediately. Since front-line physicians frequently lack familiarity with these diseases, timely molecular diagnosis may not improve outcomes. Herein we describe Genome-to-Treatment, an automated, virtual system for genetic disease diagnosis and acute management guidance.
The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 variant call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications.
Repeat expansions are responsible for over 40 monogenic disorders, and undoubtedly more pathogenic repeat expansions remain to be discovered. Existing methods for detecting repeat expansions in short-read sequencing data require predefined repeat catalogs. Recent discoveries emphasize the need for methods that do not require pre-specified candidate repeats.
Ann Clin Transl Neurol
January 2020
Genetic white matter disorders have heterogeneous etiologies and overlapping clinical presentations. We performed a study of the diagnostic efficacy of genome sequencing in 41 unsolved cases with prior exome sequencing, resolving an additional 14 from an historical cohort (n = 191). Reanalysis in the context of novel disease-associated genes and improved variant curation and annotation resolved 64% of cases.
Trans-homolog interactions have been studied extensively in Drosophila, where homologs are paired in somatic cells and transvection is prevalent. Nevertheless, the detailed structure of pairing and its functional impact have not been thoroughly investigated. Accordingly, we generated a diploid cell line from divergent parents and applied haplotype-resolved Hi-C, showing that homologs pair with varying precision genome-wide, in addition to establishing trans-homolog domains and compartments.
Genome organization involves cis and trans chromosomal interactions, both implicated in gene regulation, development, and disease. Here, we focus on trans interactions in Drosophila, where homologous chromosomes are paired in somatic cells from embryogenesis through adulthood. We first address long-standing questions regarding the structure of embryonic homolog pairing and, to this end, develop a haplotype-resolved Hi-C approach to minimize homolog misassignment and thus robustly distinguish trans-homolog from cis contacts.
An Amendment to this paper has been published and can be accessed via a link at the top of the paper.
The nucleus of mammalian cells displays a distinct spatial segregation of active euchromatic and inactive heterochromatic regions of the genome. In conventional nuclei, microscopy shows that euchromatin is localized in the nuclear interior and heterochromatin at the nuclear periphery. Genome-wide chromosome conformation capture (Hi-C) analyses show this segregation as a plaid pattern of contact enrichment within euchromatin and heterochromatin compartments, and depletion between them.
Summary: We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci.
Colony stimulating factor 1 receptor (CSF1R) plays key roles in regulating development and function of the monocyte/macrophage lineage, including microglia and osteoclasts. Mono-allelic mutations of CSF1R are known to cause hereditary diffuse leukoencephalopathy with spheroids (HDLS), an adult-onset progressive neurodegenerative disorder. Here, we report seven affected individuals from three unrelated families who had bi-allelic CSF1R mutations.
Background: Hi-C is currently the most widely used assay to investigate the 3D organization of the genome and to study its role in gene regulation, DNA replication, and disease. However, Hi-C experiments are costly to perform and involve multiple complex experimental steps; thus, accurate methods for measuring the quality and reproducibility of Hi-C data are essential to determine whether the output should be used further in a study.
Results: Using real and simulated data, we profile the performance of several recently proposed methods for assessing reproducibility of population Hi-C data, including HiCRep, GenomeDISCO, HiC-Spector, and QuASAR-Rep.
Compared to noncoding RNAs (ncRNAs), such as rRNAs and ribozymes, for which high-resolution structures abound, little is known about the tertiary structures of mRNAs. In eukaryotic cells, newly made mRNAs are packaged with proteins in highly compacted mRNA particles (mRNPs), but the manner of this mRNA compaction is unknown. Here, we developed and implemented RIPPLiT (RNA immunoprecipitation and proximity ligation in tandem), a transcriptome-wide method for probing the 3D conformations of RNAs stably associated with defined proteins, in this case, exon junction complex (EJC) core factors.
Purpose: Current diagnostic testing for genetic disorders involves serial use of specialized assays spanning multiple technologies. In principle, genome sequencing (GS) can detect all genomic pathogenic variant types on a single platform. Here we evaluate copy-number variant (CNV) calling as part of a clinically accredited GS test.
Structural variants (SVs) can contribute to oncogenesis through a variety of mechanisms. Despite their importance, the identification of SVs in cancer genomes remains challenging. Here, we present a framework that integrates optical mapping, high-throughput chromosome conformation capture (Hi-C), and whole-genome sequencing to systematically detect SVs in a variety of normal or cancer samples and cell lines.