Publications by Mark J Chaisson

Genotyping sequence-resolved copy-number variations using pangenomes reveals paralog-specific global diversity and expression divergence of duplicated genes.

bioRxiv

October 2024

Copy-number variable (CNV) genes are important in evolution and disease, yet sequence variation in CNV genes is a blind spot for large-scale studies. We present a method, ctyper, that leverages pangenomes to produce copy-number maps with allele-specific sequences containing locally phased variants of CNV genes from NGS reads. We extensively characterized accuracy and efficiency on a database of 3,351 CNV genes as well as 212 non-CNV medically relevant challenging genes.

Structural variation across 138,134 samples in the TOPMed consortium.

Goo Jun Adam C English Ginger A Metcalf Jianzhi Yang Mark J P Chaisson

Res Sq

February 2023

Article Synopsis

Researchers compiled a comprehensive catalog of 355,667 structural variants (SVs) from DNA data, with over half being novel, to better understand the relationship between SVs and diseases.
The study involved rigorous methods to ensure high-quality variant identification, showing over 90% accuracy compared to previous genetic assemblies.
This catalog reveals significant connections between SVs and various health traits, identifying 690 specific regions that may influence medically relevant genes, providing a crucial resource for disease research.

Structural variation across 138,134 samples in the TOPMed consortium.

Goo Jun Adam C English Ginger A Metcalf Jianzhi Yang Mark J P Chaisson

bioRxiv

January 2023

Ever larger Structural Variant (SV) catalogs highlighting the diversity within and between populations help researchers better understand the links between SVs and disease. The identification of SVs from DNA sequence data is non-trivial and requires a balance between comprehensiveness and precision. Here we present a catalog of 355,667 SVs (59.

Deleterious mitochondrial DNA point mutations are overrepresented in Drosophila expressing a proofreading-defective DNA polymerase γ.

Colby L Samstag Jake G Hoekstra Chiu-Hui Huang Mark J Chaisson Richard J Youle

PLoS Genet

November 2018

Mitochondrial DNA (mtDNA) mutations cause severe maternally inherited syndromes and the accumulation of somatic mtDNA mutations is implicated in aging and common diseases. However, the mechanisms that influence the frequency and pathogenicity of mtDNA mutations are poorly understood. To address this matter, we created a Drosophila mtDNA mutator strain expressing a proofreading-deficient form of the mitochondrial DNA polymerase.

Resolving multicopy duplications using polyploid phasing.

Mark J Chaisson Sudipto Mukherjee Sreeram Kannan Evan E Eichler

Res Comput Mol Biol

May 2017

While the rise of single-molecule sequencing systems has enabled an unprecedented rise in the ability to assemble complex regions of the genome, long segmental duplications in the genome still remain a challenging frontier in assembly. Segmental duplications are at the same time both gene rich and prone to large structural rearrangements, making the resolution of their sequences important in medical and evolutionary studies. Duplicated sequences that are collapsed in mammalian assemblies are rarely identical; after a sequence is duplicated, it begins to acquire .

Long-read sequence assembly of the gorilla genome.

David Gordon John Huddleston Mark J P Chaisson Christopher M Hill Zev N Kronenberg

Science

April 2016

Accurate sequence and assembly of genomes is a critical first step for studies of genetic variation. We generated a high-quality assembly of the gorilla genome using single-molecule, real-time sequence technology and a string graph de novo assembly algorithm. The new assembly improves contiguity by two to three orders of magnitude with respect to previously released assemblies, recovering 87% of missing reference exons and incomplete gene models.

Genetic variation and the de novo assembly of human genomes.

Mark J P Chaisson Richard K Wilson Evan E Eichler

Nat Rev Genet

November 2015

The discovery of genetic variation and the assembly of genome sequences are both inextricably linked to advances in DNA-sequencing technology. Short-read massively parallel sequencing has revolutionized our ability to discover genetic variation but is insufficient to generate high-quality genome assemblies or resolve most structural variation. Full resolution of variation is only guaranteed by complete de novo assembly of a genome.

Resolving the complexity of the human genome using single-molecule sequencing.

Mark J P Chaisson John Huddleston Megan Y Dennis Peter H Sudmant Maika Malig

Nature

January 2015

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome--78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions.

Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory.

Mark J Chaisson Glenn Tesler

BMC Bioinformatics

September 2012

Background: Recent methods have been developed to perform high-throughput sequencing of DNA by Single Molecule Sequencing (SMS). While Next-Generation sequencing methods may produce reads up to several hundred bases long, SMS sequencing produces reads up to tens of kilobases long. Existing alignment methods are either too inefficient for high-throughput datasets, or not sensitive enough to align SMS reads, which have a higher error rate than Next-Generation sequencing.

De novo fragment assembly with short mate-paired reads: Does the read length matter?

Mark J Chaisson Dumitru Brinza Pavel A Pevzner

Genome Res

February 2009

Increasing read length is currently viewed as the crucial condition for fragment assembly with next-generation sequencing technologies. However, introducing mate-paired reads (separated by a gap of length, GapLength) opens a possibility to transform short mate-pairs into long mate-reads of length approximately GapLength, and thus raises the question as to whether the read length (as opposed to GapLength) even matters. We describe a new tool, EULER-USR, for assembling mate-paired short reads and use it to analyze the question of whether the read length matters.

Short read fragment assembly of bacterial genomes.

Mark J Chaisson Pavel A Pevzner

Genome Res

February 2008

In the last year, high-throughput sequencing technologies have progressed from proof-of-concept to production quality. While these methods produce high-quality reads, they have yet to produce reads comparable in length to Sanger-based sequencing. Current fragment assembly algorithms have been implemented and optimized for mate-paired Sanger-based reads, and thus do not perform well on short reads produced by short read technologies.
