Publications by Melanie Kirsche

The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models.

Joel Rozowsky Jiahao Gao Beatrice Borsari Yucheng T Yang Timur Galeev Melanie Kirsche

Cell

March 2023

Article Synopsis

* A deep-learning model can predict allele-specific activity using only local nucleotide sequences, emphasizing key transcription-factor-binding motifs affected by genetic variants.
* Combining EN-TEx with previous genome annotations shows significant connections between allele-specific loci and GWAS loci, and aids in transferring known eQTLs to challenging tissue types, improving personal functional genomics research.

Jasmine and Iris: population-scale structural variant comparison and analysis.

Melanie Kirsche Gautam Prabhu Rachel Sherman Bohan Ni Alexis Battle

Nat Methods

March 2023

The availability of long reads is revolutionizing studies of structural variants (SVs). However, because SVs vary across individuals and are discovered through imprecise read technologies and methods, they can be difficult to compare. Addressing this, we present Jasmine and Iris ( https://github.

Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing.

Michael Alonge Ludivine Lebeigle Melanie Kirsche Katie Jenike Shujun Ou

Genome Biol

December 2022

Advancing crop genomics requires efficient genetic systems enabled by high-quality personalized genome assemblies. Here, we introduce RagTag, a toolset for automating assembly scaffolding and patching, and we establish chromosome-scale reference genomes for the widely used tomato genotype M82 along with Sweet-100, a new rapid-cycling genotype that we developed to accelerate functional genomics and genome editing in tomato. This work outlines strategies to rapidly expand genetic systems and genomic resources in other plant species.

Benchmarking challenging small variants with linked and long reads.

Justin Wagner Nathan D Olson Lindsay Harris Ziad Khan Jesse Farek Melanie Kirsche

Cell Genom

May 2022

Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as .

Semi-automated assembly of high-quality diploid human reference genomes.

Erich D Jarvis Giulio Formenti Arang Rhie Andrea Guarracino Chentao Yang Melanie Kirsche

Nature

November 2022

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome.

A complete reference genome improves analysis of human genetic variation.

Sergey Aganezov Stephanie M Yan Daniela C Soto Melanie Kirsche Samantha Zarate

Science

April 2022

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery.

The complete sequence of a human genome.

Sergey Nurk Sergey Koren Arang Rhie Mikko Rautiainen Andrey V Bzikadze Melanie Kirsche

Science

April 2022

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding.

Democratizing long-read genome assembly.

Melanie Kirsche Michael C Schatz

Cell Syst

October 2021

De novo assembled genomes serve as the backbone for modern genomics. In an article in this issue of Cell Systems, Ekim et al. present the mdBG assembler that can assemble genomes 100-fold faster than previous methods, including a human genome in under 10 min, which unlocks pan-genomics for many species.

Genomic diversity of SARS-CoV-2 during early introduction into the Baltimore-Washington metropolitan area.

Peter M Thielen Shirlee Wohl Thomas Mehoke Srividya Ramakrishnan Melanie Kirsche

JCI Insight

March 2021

The early COVID-19 pandemic was characterized by rapid global spread. In Maryland and Washington, DC, United States, more than 2500 cases were reported within 3 weeks of the first COVID-19 detection in March 2020. We aimed to use genomic sequencing to understand the initial spread of SARS-CoV-2 - the virus that causes COVID-19 - in the region.

Sapling: accelerating suffix array queries with learned data models.

Melanie Kirsche Arun Das Michael C Schatz

Bioinformatics

May 2021

Motivation: As genomic data becomes more abundant, efficient algorithms and data structures for sequence alignment become increasingly important. The suffix array is a widely used data structure to accelerate alignment, but the binary search algorithm used to query it requires widespread memory accesses, causing a large number of cache misses on large datasets.

Results: Here, we present Sapling, an algorithm for sequence alignment, which uses a learned data model to augment the suffix array and enable faster queries.
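
The idea of augmenting a suffix array with a learned model can be illustrated with a minimal sketch. This is not Sapling's implementation: the abstract does not describe the model, so the code below assumes a single linear interpolation from a k-mer's integer encoding to its expected suffix-array rank, plus a recorded worst-case prediction error. The class name `LearnedSuffixIndex`, the helper `kmer_code`, and the restriction to the alphabet A/C/G/T are all hypothetical choices made for illustration.

```python
import bisect

# Assumed 2-bit encoding of DNA bases (illustrative, not taken from the paper).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_code(kmer):
    """Encode a k-mer as a base-4 integer; ordering matches lexicographic order."""
    value = 0
    for ch in kmer:
        value = value * 4 + CODE[ch]
    return value


class LearnedSuffixIndex:
    """Hypothetical sketch: suffix array plus a linear model of k-mer rank."""

    def __init__(self, text, k):
        self.text = text
        self.k = k
        # Suffix array restricted to positions that start a full k-mer.
        self.sa = sorted(range(len(text) - k + 1), key=lambda i: text[i:])
        # k-mer code at each rank; non-decreasing because suffixes are sorted.
        self.codes = [kmer_code(text[i:i + k]) for i in self.sa]
        self.lo, self.hi = self.codes[0], self.codes[-1]
        # Worst-case distance between the predicted and true rank.
        self.max_err = max(
            abs(rank - self._predict(code))
            for rank, code in enumerate(self.codes)
        )

    def _predict(self, code):
        """Linear interpolation from k-mer code to approximate rank."""
        if self.hi == self.lo:
            return 0
        frac = (code - self.lo) / (self.hi - self.lo)
        return round(frac * (len(self.sa) - 1))

    def find(self, kmer):
        """Return a text position where kmer occurs, or -1 if it is absent."""
        if len(kmer) != self.k:
            raise ValueError("query length must equal k")
        code = kmer_code(kmer)
        guess = self._predict(code)
        n = len(self.sa)
        # Binary search only inside the window bounded by the model error.
        left = min(max(0, guess - self.max_err), n)
        right = max(left, min(n, guess + self.max_err + 1))
        idx = bisect.bisect_left(self.codes, code, left, right)
        if idx < right and self.codes[idx] == code:
            return self.sa[idx]
        return -1


index = LearnedSuffixIndex("GATTACAGATTACA", k=3)
print(index.find("TTA"), index.find("CCC"))
```

Because the error bound is measured over every indexed k-mer, any k-mer present in the text is guaranteed to fall inside the search window, so the narrower search returns the same answer as a full binary search. The benefit in practice depends on how well the model fits the data; a single line is the simplest possible choice and is used here only to keep the sketch short.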

A diploid assembly-based benchmark for variants in the major histocompatibility complex.

Chen-Shan Chin Justin Wagner Qiandong Zeng Erik Garrison Shilpa Garg Melanie Kirsche

Nat Commun

September 2020

Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful - the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002.

Comprehensive analysis of structural variants in breast cancer genomes using single-molecule sequencing.

Sergey Aganezov Sara Goodwin Rachel M Sherman Fritz J Sedlazeck Gayatri Arun Melanie Kirsche

Genome Res

September 2020

Improved identification of structural variants (SVs) in cancer can lead to more targeted and effective treatment options as well as advance our basic understanding of the disease and its progression. We performed whole-genome sequencing of the SKBR3 breast cancer cell line and patient-derived tumor and normal organoids from two breast cancer patients using Illumina/10x Genomics, Pacific Biosciences (PacBio), and Oxford Nanopore Technologies (ONT) sequencing. We then inferred SVs and large-scale allele-specific copy number variants (CNVs) using an ensemble of methods.

Genomic Diversity of SARS-CoV-2 During Early Introduction into the United States National Capital Region.

Peter M Thielen Shirlee Wohl Thomas Mehoke Srividya Ramakrishnan Melanie Kirsche

medRxiv

August 2020

Background: The early COVID-19 pandemic has been characterized by rapid global spread. In the United States National Capital Region, over 2,000 cases were reported within three weeks of its first detection in March 2020. We aimed to use genomic sequencing to understand the initial spread of SARS-CoV-2, the virus that causes COVID-19, in the region.

Major Impacts of Widespread Structural Variation on Gene Expression and Crop Improvement in Tomato.

Michael Alonge Xingang Wang Matthias Benoit Sebastian Soyk Lara Pereira Melanie Kirsche

Cell

July 2020

Structural variants (SVs) underlie important crop improvement and domestication traits. However, resolving the extent, diversity, and quantitative impact of SVs has been challenging. We used long-read nanopore sequencing to capture 238,490 SVs in 100 diverse tomato lines.

Paragraph: a graph-based structural variant genotyper for short-read sequence data.

Sai Chen Peter Krusche Egor Dolzhenko Rachel M Sherman Roman Petrovski Melanie Kirsche

Genome Biol

December 2019

Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry.
