We deployed the Blended Genome Exome (BGE), a DNA library blending approach that generates low pass whole genome (1-4× mean depth) and deep whole exome (30-40× mean depth) data in a single sequencing run. This technology is cost-effective, empowers most genomic discoveries possible with deep whole genome sequencing, and provides an unbiased method to capture the diversity of common SNP variation across the globe. To evaluate this new technology at scale, we applied BGE to sequence >53,000 samples from the Populations Underrepresented in Mental Illness Associations Studies (PUMAS) Project, which included participants across African, African American, and Latin American populations.
Genomic scientists have long been promised cheaper DNA sequencing, but deep whole genomes are still costly, especially when considered for large cohorts in population-level studies. More affordable options include microarrays + imputation, whole exome sequencing (WES), or low-pass whole genome sequencing (WGS) + imputation. WES + array + imputation has recently been shown to yield 99% of association signals detected by WGS.
An outbreak of over 1,000 COVID-19 cases in Provincetown, Massachusetts (MA), in July 2021, the first large outbreak mostly in vaccinated individuals in the US, prompted a comprehensive public health response, motivating changes to national masking recommendations and raising questions about infection and transmission among vaccinated individuals. To address these questions, we combined viral genomic and epidemiological data from 467 individuals, including 40% of outbreak-associated cases. The Delta variant accounted for 99% of cases in this dataset; it was introduced from at least 40 sources, but 83% of cases derived from a single source, likely through transmission across multiple settings over a short time rather than a single event.
Multiple summer events, including large indoor gatherings, in Provincetown, Massachusetts (MA), in July 2021 contributed to an outbreak of over one thousand COVID-19 cases among residents and visitors. Most cases occurred in fully vaccinated individuals, many of whom were also symptomatic, prompting a comprehensive public health response, motivating changes to national masking recommendations, and raising questions about infection and transmission among vaccinated individuals. To characterize the outbreak and the viral population underlying it, we combined genomic and epidemiological data from 467 individuals, including 40% of known outbreak-associated cases.
Purpose: Existing cell-free DNA (cfDNA) methods lack the sensitivity needed for detecting minimal residual disease (MRD) following therapy. We developed a test for tracking hundreds of patient-specific mutations to detect MRD with a 1,000-fold lower error rate than conventional sequencing.
Experimental Design: We compared the sensitivity of our approach to digital droplet PCR (ddPCR) in a dilution series, then retrospectively identified two cohorts of patients who had undergone prospective plasma sampling and clinical data collection: 16 patients with ER+/HER2- metastatic breast cancer (MBC) sampled within 6 months following metastatic diagnosis and 142 patients with stage 0 to III breast cancer who received curative-intent treatment with most sampled at surgery and 1 year postoperative.
Although genetic lesions responsible for some mendelian disorders can be rapidly discovered through massively parallel sequencing of whole genomes or exomes, not all diseases readily yield to such efforts. We describe the illustrative case of the simple mendelian disorder medullary cystic kidney disease type 1 (MCKD1), mapped more than a decade ago to a 2-Mb region on chromosome 1. Ultimately, only by cloning, capillary sequencing and de novo assembly did we find that each of six families with MCKD1 harbors an equivalent but apparently independently arising mutation in sequence markedly under-represented in massively parallel sequencing data: the insertion of a single cytosine in one copy (but a different copy in each family) of the repeat unit comprising the extremely long (∼1.
Congenital diarrheal disorders (CDDs) are a collection of rare, heterogeneous enteropathies with early onset and often severe outcomes. Here, we report a family of Ashkenazi Jewish descent, with 2 out of 3 children affected by CDD. Both affected children presented 3 days after birth with severe, intractable diarrhea.
Knowledge of "actionable" somatic genomic alterations present in each tumor (e.g., point mutations, small insertions/deletions, and copy-number alterations that direct therapeutic options) should facilitate individualized approaches to cancer treatment.
This unit describes a protocol for the targeted enrichment of exons from randomly sheared genomic DNA libraries using an in-solution hybrid selection approach for sequencing on an Illumina Genome Analyzer II. The steps for designing and ordering a hybrid selection oligo pool are reviewed, as are critical steps for performing the preparation and hybrid selection of an Illumina paired-end library. Critical parameters, performance metrics, and analysis workflow are discussed.
To identify susceptibility loci for bipolar disorder, we tested 1.8 million variants in 4,387 cases and 6,209 controls and identified a region of strong association (rs10994336, P = 9.1 × 10⁻⁹) in ANK3 (ankyrin G).
We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.
New strategies for prevention and treatment of type 2 diabetes (T2D) require improved insight into disease etiology. We analyzed 386,731 common single-nucleotide polymorphisms (SNPs) in 1464 patients with T2D and 1467 matched controls, each characterized for measures of glucose metabolism, lipids, obesity, and blood pressure. With collaborators (FUSION and WTCCC/UKT2D), we identified and confirmed three loci associated with T2D (in a noncoding region near CDKN2A and CDKN2B, in an intron of IGF2BP2, and in an intron of CDKAL1) and replicated associations near HHEX and in SLC30A8 found by a recent whole-genome association study.
Haplotype-based methods offer a powerful approach to disease gene mapping, based on the association between causal mutations and the ancestral haplotypes on which they arose. As part of The SNP Consortium Allele Frequency Projects, we characterized haplotype patterns across 51 autosomal regions (spanning 13 megabases of the human genome) in samples from Africa, Europe, and Asia. We show that the human genome can be parsed objectively into haplotype blocks: sizable regions over which there is little evidence for historical recombination and within which only a few common haplotypes are observed.