Publications by Mitchell R Vollger | LitMetric

Publications by authors named "Mitchell R Vollger"

Resolving the chromatin impact of mosaic variants with targeted Fiber-seq.

Stephanie C Bohaczuk, Zachary J Amador, Chang Li, Benjamin J Mallory, Elliott G Swanson, Mitchell R Vollger

Genome Res

December 2024

Accurately quantifying the functional consequences of noncoding mosaic variants requires the pairing of DNA sequences with both accessible and closed chromatin architectures along individual DNA molecules, a pairing that cannot be achieved using traditional fragmentation-based chromatin assays. We demonstrate that targeted single-molecule chromatin fiber sequencing (Fiber-seq) achieves this, permitting single-molecule, long-read genomic, and epigenomic profiling across targeted >100 kb loci with ∼10-fold enrichment over untargeted sequencing. Targeted Fiber-seq reveals that pathogenic expansions of the CTG repeat that underlie Myotonic Dystrophy 1 are characterized by somatic instability and disruption of multiple nearby regulatory elements, both of which are repeat length-dependent.

Single chromatin fiber profiling and nucleosome position mapping in the human brain.

Cyril J Peter, Aman Agarwal, Risa Watanabe, Bibi S Kassim, Xuedi Wang, Mitchell R Vollger

Cell Rep Methods

December 2024

We apply a single-molecule chromatin fiber sequencing (Fiber-seq) protocol designed for amplification-free cell-type-specific mapping of the regulatory architecture at nucleosome resolution along extended ∼10-kb chromatin fibers to neuronal and non-neuronal nuclei sorted from human brain tissue. Specifically, application of this method enables the resolution of cell-selective promoter and enhancer architectures on single fibers, including transcription factor footprinting and position mapping, with sequence-specific fixation of nucleosome arrays flanking transcription start sites and regulatory motifs. We uncover haplotype-specific chromatin patterns, multiple regulatory elements cis-aligned on individual fibers, and accessible chromatin at 20,000 unique sites encompassing retrotransposons and other repeat sequences hitherto "unmappable" by short-read epigenomic sequencing.

Resolving the chromatin impact of mosaic variants with targeted Fiber-seq.

Stephanie C Bohaczuk, Zachary J Amador, Chang Li, Benjamin J Mallory, Elliott G Swanson, Mitchell R Vollger

bioRxiv

July 2024

Accurately quantifying the functional consequences of non-coding mosaic variants requires the pairing of DNA sequence with both accessible and closed chromatin architectures along individual DNA molecules, a pairing that cannot be achieved using traditional fragmentation-based chromatin assays. We demonstrate that targeted single-molecule chromatin fiber sequencing (Fiber-seq) achieves this, permitting single-molecule, long-read genomic and epigenomic profiling across targeted >100 kilobase loci with ~10-fold enrichment over untargeted sequencing. Targeted Fiber-seq reveals that pathogenic expansions of the CTG repeat that underlie Myotonic Dystrophy 1 are characterized by somatic instability and disruption of multiple nearby regulatory elements, both of which are repeat length-dependent.

The regulatory potential of transposable elements in maize.

Kerry L Bubb, Morgan O Hamm, Joseph K Min, Bryan Ramirez-Corona, Nicholas A Mueth, Mitchell R Vollger

bioRxiv

July 2024

Since their initial discovery in maize, transposable elements (TEs) have emerged as being integral to the evolution of maize, accounting for 80% of its genome. However, the repetitive nature of TEs has hindered our understanding of their regulatory potential. Here, we demonstrate that long-read chromatin fiber sequencing (Fiber-seq) permits the comprehensive annotation of the regulatory potential of maize TEs.

DNA-m6A calling and integrated long-read epigenetic and genetic analysis with fibertools.

Anupama Jha, Stephanie C Bohaczuk, Yizi Mao, Jane Ranchalis, Benjamin J Mallory, Mitchell R Vollger

Genome Res

November 2024

Long-read DNA sequencing has recently emerged as a powerful tool for studying both genetic and epigenetic architectures at single-molecule and single-nucleotide resolution. Long-read epigenetic studies encompass both the direct identification of native cytosine methylation and the identification of exogenously placed DNA N6-methyladenine (DNA-m6A). However, detecting DNA-m6A modifications using single-molecule sequencing, as well as coprocessing single-molecule genetic and epigenetic architectures, is limited by computational demands and a lack of supporting tools.

STR mutations on chromosome 15q cause thyrotropin resistance by activating a primate-specific enhancer of MIR7-2/MIR1179.

Helmut Grasberger, Alexandra M Dumitrescu, Xiao-Hui Liao, Elliott G Swanson, Roy E Weiss, Mitchell R Vollger

Nat Genet

May 2024

Thyrotropin (TSH) is the master regulator of thyroid gland growth and function. Resistance to TSH (RTSH) describes conditions with reduced sensitivity to TSH. Dominantly inherited RTSH has been linked to a locus on chromosome 15q, but its genetic basis has remained elusive.

Synchronized long-read genome, methylome, epigenome, and transcriptome for resolving a Mendelian condition.

Mitchell R Vollger, Jonas Korlach, Kiara C Eldred, Elliott Swanson, Jason G Underwood

bioRxiv

September 2023

Resolving the molecular basis of a Mendelian condition (MC) remains challenging owing to the diverse mechanisms by which genetic variants cause disease. To address this, we developed a synchronized long-read genome, methylome, epigenome, and transcriptome sequencing approach, which enables accurate single-nucleotide, insertion-deletion, and structural variant calling and diploid genome assembly, and permits the simultaneous elucidation of haplotype-resolved CpG methylation, chromatin accessibility, and full-length transcript information in a single long-read sequencing run. Application of this approach to an Undiagnosed Diseases Network (UDN) participant with a chromosome X;13 balanced translocation of uncertain significance revealed that this translocation disrupted the functioning of four separate genes previously associated with single-gene MCs.

The complete sequence of a human Y chromosome.

Arang Rhie, Sergey Nurk, Monika Cechova, Savannah J Hoyt, Dylan J Taylor, Mitchell R Vollger

Nature

September 2023

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region.

Characterization of large-scale genomic differences in the first complete human genome.

Xiangyu Yang, Xuankai Wang, Yawen Zou, Shilong Zhang, Manying Xia, Mitchell R Vollger

Genome Biol

July 2023

Background: The first telomere-to-telomere (T2T) human genome assembly (T2T-CHM13) release is a milestone in human genomics. The T2T-CHM13 genome assembly extends our understanding of telomeres, centromeres, segmental duplication, and other complex regions. The current human genome reference (GRCh38) has been widely used in various human genomic studies.

A draft human pangenome reference.

Wen-Wei Liao, Mobin Asri, Jana Ebler, Daniel Doerr, Marina Haukness, Mitchell R Vollger

Nature

May 2023

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels.

Increased mutation and gene conversion within human segmental duplications.

Mitchell R Vollger, Philip C Dishuck, William T Harvey, William S DeWitt, Xavi Guitart

Nature

May 2023

Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions. We find that human SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.

Gaps and complex structurally variant loci in phased genome assemblies.

David Porubsky, Mitchell R Vollger, William T Harvey, Allison N Rozanski, Peter Ebert

Genome Res

April 2023

Article Synopsis

Recent advancements in phased genome assembly, especially using long-read data and parental information, still leave significant gaps, averaging over 140 per assembly from trio-hifiasm methods.
A comprehensive analysis of 182 haploid assemblies shows that chromosome-wide accuracy is similar when using Strand-seq instead of parental data, with many gaps clustering near large repeat regions.
The research highlights that a considerable amount of human DNA is misoriented and includes notable variations like deletions and insertions, suggesting key areas for future algorithm improvements and better pangenome models.

DNA-m6A calling and integrated long-read epigenetic and genetic analysis with fibertools.

Anupama Jha, Stephanie C Bohaczuk, Yizi Mao, Jane Ranchalis, Benjamin J Mallory, Mitchell R Vollger

bioRxiv

December 2023

Long-read DNA sequencing has recently emerged as a powerful tool for studying both genetic and epigenetic architectures at single-molecule and single-nucleotide resolution. Long-read epigenetic studies encompass both the direct identification of native cytosine methylation as well as the identification of exogenously placed DNA N6-methyladenine (DNA-m6A). However, detecting DNA-m6A modifications using single-molecule sequencing, as well as co-processing single-molecule genetic and epigenetic architectures, is limited by computational demands and a lack of supporting tools.

Semi-automated assembly of high-quality diploid human reference genomes.

Erich D Jarvis, Giulio Formenti, Arang Rhie, Andrea Guarracino, Chentao Yang, Mitchell R Vollger

Nature

November 2022

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome.

A complete reference genome improves analysis of human genetic variation.

Sergey Aganezov, Stephanie M Yan, Daniela C Soto, Melanie Kirsche, Samantha Zarate, Mitchell R Vollger

Science

April 2022

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery.

From telomere to telomere: The transcriptional and epigenetic state of human repeat elements.

Savannah J Hoyt, Jessica M Storer, Gabrielle A Hartley, Patrick G S Grady, Ariel Gershman, Mitchell R Vollger

Science

April 2022

Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome.

The complete sequence of a human genome.

Sergey Nurk, Sergey Koren, Arang Rhie, Mikko Rautiainen, Andrey V Bzikadze, Mitchell R Vollger

Science

April 2022

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding.

Segmental duplications and their variation in a complete human genome.

Mitchell R Vollger, Xavi Guitart, Philip C Dishuck, Ludovica Mercuri, William T Harvey

Science

April 2022

Despite their importance in disease and evolution, highly identical segmental duplications (SDs) are among the last regions of the human reference genome (GRCh38) to be fully sequenced. Using a complete telomere-to-telomere human genome (T2T-CHM13), we present a comprehensive view of human SD organization. SDs account for nearly one-third of the additional sequence, increasing the genome-wide estimate from 5.

Epigenetic patterns in a complete human genome.

Ariel Gershman, Michael E G Sauria, Xavi Guitart, Mitchell R Vollger, Paul W Hook

Science

April 2022

The completion of a telomere-to-telomere human reference genome, T2T-CHM13, has resolved complex regions of the genome, including repetitive and homologous regions. Here, we present a high-resolution epigenetic study of previously unresolved sequences, representing entire acrocentric chromosome short arms, gene family expansions, and a diverse collection of repeat classes. This resource precisely maps CpG methylation (32.

Complete genomic and epigenetic maps of human centromeres.

Nicolas Altemose, Glennis A Logsdon, Andrey V Bzikadze, Pragya Sidhwani, Sasha A Langley, Mitchell R Vollger

Science

April 2022

Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.

StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps.

Mitchell R Vollger, Peter Kerpedjiev, Adam M Phillippy, Evan E Eichler

Bioinformatics

March 2022

Summary: The visualization and analysis of genomic repeats is typically accomplished using dot plots; however, the emergence of telomere-to-telomere assemblies with multi-megabase repeats requires new visualization strategies. Here, we introduce StainedGlass, which can generate publication-quality figures and interactive visualizations that depict the identity and orientation of multi-megabase tandem repeat structures at a genome-wide scale. The tool can rapidly reveal higher-order structures and improve the inference of evolutionary history for some of the most complex regions of genomes.

Evidence for opposing selective forces operating on human-specific duplicated TCAF genes in Neanderthals and humans.

PingHsun Hsieh, Vy Dang, Mitchell R Vollger, Yafei Mao, Tzu-Hsueh Huang

Nat Commun

August 2021

TRP channel-associated factor 1/2 (TCAF1/TCAF2) proteins antagonistically regulate the cold-sensor protein TRPM8 in multiple human tissues. Understanding their significance has been complicated given the locus spans a gap-ridden region with complex segmental duplications in GRCh38. Using long-read sequencing, we sequence-resolve the locus, annotate full-length TCAF models in primate genomes, and show substantial human-specific TCAF copy number variation.

The structure, function and evolution of a complete human chromosome 8.

Glennis A Logsdon, Mitchell R Vollger, PingHsun Hsieh, Yafei Mao, Mikhail A Liskovykh

Nature

May 2021

The complete assembly of each human chromosome is essential for understanding human biology and evolution. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.

Sequence diversity analyses of an improved rhesus macaque genome enhance its biomedical utility.

Wesley C Warren, R Alan Harris, Marina Haukness, Ian T Fiddes, Shwetha C Murali, Mitchell R Vollger

Science

December 2020

The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization.

Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads.

David Porubsky, Peter Ebert, Peter A Audano, Mitchell R Vollger, William T Harvey

Nat Biotechnol

March 2021

Article Synopsis

Human genomes are usually represented as consensus sequences, which don't include parental haplotype details.
The research presents a new method for creating a complete and phased diploid genome assembly using advanced sequencing techniques, specifically for an individual of Puerto Rican descent.
The resulting assemblies show high accuracy and continuity, yielding precise genetic variations while identifying common regions where genome breaks occur across different sequencing platforms.
