Publications by authors named "Kiran V Garimella"

Article Synopsis
  • Researchers tackled the challenge of studying structural variants (SVs) in repetitive genomic regions using advanced technologies like long-read sequencing and the gapless T2T assembly.
  • They successfully analyzed 13 complex cases, resolving 10 by identifying specific genomic breakpoints and structures that were previously difficult to sequence, including Robertsonian translocations and ring chromosomes.
  • The study highlighted new mechanisms of SV formation and provided insights into how these genomic variations affect gene expression, with potential implications for disease diagnosis and genome biology.

encodes a human long noncoding RNA (lncRNA) adjacent to , a coding gene in which de novo loss-of-function variants cause developmental and epileptic encephalopathy. Here, we report our findings in three unrelated children with a syndromic, early-onset neurodevelopmental disorder, each of whom had a de novo deletion in the locus. The children had severe encephalopathy, shared facial dysmorphisms, cortical atrophy, and cerebral hypomyelination, a phenotype that is distinct from the phenotypes of patients with haploinsufficiency.

Article Synopsis
  • Long non-coding RNAs (lncRNAs) make up a significant part of the human genome. This study concerns a specific lncRNA located next to a coding gene in which harmful mutations cause severe developmental disorders and epilepsy.
  • Researchers identified three individuals with a rare deletion affecting this lncRNA. All showed similar features, including developmental delay and distinctive facial features, that differ from those typically seen with haploinsufficiency of the neighboring coding gene.
  • The study revealed that this deletion leads to altered mRNA and protein levels in patients, demonstrating that structural variants can cause neurodevelopmental disorders and emphasizing the importance of further evaluating lncRNAs in relation to genetic diseases.

Single-cell transcriptomics has become the definitive method for classifying cell types and states, and can be augmented with genotype information to improve cell lineage identification. Due to constraints of short-read sequencing, current methods to detect natural genetic barcodes often require cumbersome primer panels and early commitment to targets. Here we devise a flexible long-read sequencing workflow and analysis pipeline, termed nanoranger, that starts from intermediate single-cell cDNA libraries to detect cell lineage-defining features, including single-nucleotide variants, fusion genes, isoforms, and sequences of chimeric antigen receptors and T cell receptors (TCRs).

Article Synopsis
  • Long-read RNA-sequencing methods can capture full transcript isoforms but have traditionally had low throughput.
  • The new technique, multiplexed arrays isoform sequencing (MAS-ISO-seq), addresses this by combining cDNAs for more efficient long-read sequencing, boosting throughput more than 15-fold.
  • In experiments with tumor-infiltrating T cells, MAS-ISO-seq led to a 12- to 32-fold increase in the identification of differentially spliced genes.

The brown bear (Ursus arctos) is the second largest and most widespread extant terrestrial carnivore on Earth and has recently emerged as a medical model for human metabolic diseases. Here, we report a fully phased chromosome-level assembly of a male North American brown bear built by combining Pacific Biosciences (PacBio) HiFi data and publicly available Hi-C data. The final genome size is 2.


The characterization of de novo mutations in regions of high sequence and structural diversity from whole-genome sequencing data remains highly challenging. Complex structural variants tend to arise in regions of high repetitiveness and low complexity, challenging both de novo assembly, in which short reads do not capture the long-range context required for resolution, and mapping approaches, in which improper alignment of reads to a reference genome that is highly diverged from that of the sample can lead to false or partial calls. Long-read technologies can potentially solve such problems but are currently unfeasible to use at scale.


Motivation: The de Bruijn graph is a simple and efficient data structure that is used in many areas of sequence analysis including genome assembly, read error correction and variant calling. The data structure has a single parameter k, is straightforward to implement and is tractable for large genomes with high sequencing depth. It also enables representation of multiple samples simultaneously to facilitate comparison.
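As a rough illustration of the data structure described above, the sketch below builds a small multi-sample ("colored") de Bruijn graph with a single parameter k. It is not the implementation from the paper: the function name `build_colored_de_bruijn` and the `samples` input layout are hypothetical, k-mers are used as nodes, and reverse complements and error cleaning are ignored for brevity.

```python
from collections import defaultdict


def build_colored_de_bruijn(samples, k):
    """Build a toy colored de Bruijn graph (hypothetical helper).

    samples: dict mapping a sample name to a list of read strings.
    Returns (colors, edges):
      colors maps each k-mer to the set of samples in which it occurs;
      edges maps each k-mer to the k-mers that directly follow it in a read.
    Assumption: forward strand only, no reverse complements.
    """
    colors = defaultdict(set)
    edges = defaultdict(set)
    for sample, reads in samples.items():
        for read in reads:
            # Slide a window of width k along the read.
            kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
            for kmer in kmers:
                colors[kmer].add(sample)
            # Consecutive k-mers overlap by k-1 bases and form an edge.
            for left, right in zip(kmers, kmers[1:]):
                edges[left].add(right)
    return colors, edges


# Two samples that share a prefix and then diverge.
samples = {"child": ["ACGTT"], "parent": ["ACGTA"]}
colors, edges = build_colored_de_bruijn(samples, k=3)
```

In this toy input the k-mer `CGT` has two successors, one per sample, which is the kind of branch that sample-to-sample comparison on the graph relies on.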


Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate.
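The joint calculation described above can be sketched for a single biallelic site. This is a simplified model under stated assumptions, not the PhaseByTransmission implementation: genotypes are coded as alternate-allele counts (0, 1, 2), parental genotypes get a flat prior, and any trio configuration that violates Mendelian inheritance is given the weight `mutation_prior`. The function names and the default value 1e-8 are illustrative choices.

```python
from itertools import product


def gamete_prob(genotype, allele):
    """Probability that a parent with `genotype` alt alleles transmits `allele` (0 or 1)."""
    p_alt = genotype / 2
    return p_alt if allele == 1 else 1 - p_alt


def mendelian_prob(mother, father, child):
    """P(child genotype | parental genotypes) assuming no new mutation."""
    total = 0.0
    for a, b in product((0, 1), repeat=2):
        if a + b == child:
            total += gamete_prob(mother, a) * gamete_prob(father, b)
    return total


def trio_posterior(lik_mother, lik_father, lik_child, mutation_prior=1e-8):
    """Posterior over (mother, father, child) genotype configurations.

    Each lik_* is a list of three genotype likelihoods P(data | genotype).
    Assumption: flat prior on parental genotypes.
    """
    weights = {}
    for m, f, c in product(range(3), repeat=3):
        transmission = mendelian_prob(m, f, c)
        if transmission == 0.0:
            # Configuration is only possible with a de novo event.
            transmission = mutation_prior
        weights[(m, f, c)] = lik_mother[m] * lik_father[f] * lik_child[c] * transmission
    norm = sum(weights.values())
    return {trio: w / norm for trio, w in weights.items()}


# Both parents confidently homozygous reference, child confidently heterozygous.
confident = trio_posterior([1.0, 1e-12, 1e-12], [1.0, 1e-12, 1e-12], [1e-12, 1.0, 1e-12])
best_confident = max(confident, key=confident.get)

# Same child, but the mother's data leave some room for a heterozygous genotype.
uncertain = trio_posterior([1.0, 1e-3, 1e-12], [1.0, 1e-12, 1e-12], [1e-12, 1.0, 1e-12])
best_uncertain = max(uncertain, key=uncertain.get)
```

With these inputs the first trio is best explained by a de novo configuration, while in the second the small prior on mutation makes an inherited variant from a heterozygous mother the more probable explanation. This reflects the point made above that de novo events are rare relative to sequencing error.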


This unit describes how to use BWA and the Genome Analysis Toolkit (GATK) to map genome sequencing data to a reference and produce high-quality variant calls that can be used in downstream analyses. The complete workflow includes the core NGS data processing steps that are necessary to make the raw data suitable for analysis by the GATK, as well as the key methods involved in variant discovery using the GATK.


The translation of "next-generation" sequencing directly to the clinic is still being assessed, but for genetic diseases it has the potential to reduce costs, improve accuracy, and point to unsuspected yet treatable conditions. To study its capability in the clinic, we performed whole-exome sequencing in 118 probands with a diagnosis of a pediatric-onset neurodevelopmental disease in which most known causes had been excluded. Twenty-two genes not previously identified as disease-causing were identified in this study (19% of cohort), further establishing exome sequencing as a useful tool for gene discovery.


Recent advances in sequencing technology make it possible to comprehensively catalog genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious, and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs.


We sequenced all protein-coding regions of the genome (the "exome") in two family members with combined hypolipidemia, marked by extremely low plasma levels of low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol, and triglycerides. These two participants were compound heterozygotes for two distinct nonsense mutations in ANGPTL3 (encoding the angiopoietin-like 3 protein). ANGPTL3 has been reported to inhibit lipoprotein lipase and endothelial lipase, thereby increasing plasma triglyceride and HDL cholesterol levels in rodents.
