Motivation: The efficient and reproducible analysis of high-throughput sequencing datasets necessitates the development of methodical and robust computational pipelines that integrate established and bespoke bioinformatics analysis tools, often written in high-level programming languages such as Python. Despite the increasing availability of programming libraries for genomics, there is a noticeable lack of tools specifically focused on transcriptomics. Key tasks in this area include the association of gene features (e.
Nucleotide conversion RNA sequencing techniques interrogate chemical RNA modifications in cellular transcripts, resulting in mismatch-containing reads. Biases in mapping the resulting reads to reference genomes remain poorly understood. We present splice_sim, a splice-aware RNA-seq simulation and evaluation pipeline that introduces user-defined nucleotide conversions at set frequencies, creates mixture models of converted and unconverted reads, and calculates mapping accuracies per genomic annotation.
Many microRNAs (miRNAs) are expressed with high spatiotemporal specificity during organismal development, with some being limited to rare cell types, often embedded in complex tissues. Yet, most miRNA profiling efforts remain at the tissue and organ levels. To overcome challenges in accessing the microRNomes from tissue-embedded cells, we previously developed mime-seq (miRNome by methylation-dependent sequencing), a technique in which cell-specific miRNA methylation in C.
For cells to perform their biological functions, they need to adopt specific shapes and form functionally distinct subcellular compartments. This is achieved in part via an asymmetric distribution of mRNAs within cells. Currently, the main model of mRNA localization involves specific sequences called "zipcodes" that direct mRNAs to their proper locations.
Post-transcriptional gene regulation is a fundamental mechanism that helps regulate the development and healthy aging of the nervous system. Mutations that disrupt the function of RNA-binding proteins (RBPs), which mediate post-transcriptional gene regulation, have increasingly been implicated in neurological disorders including amyotrophic lateral sclerosis, Fragile X Syndrome, and spinal muscular atrophy. Interestingly, although the majority of RBPs are expressed widely within diverse tissue types, the nervous system is often particularly sensitive to their dysfunction.
Most endogenous retroviruses (ERVs) in mammals are incapable of retrotransposition; therefore, why ERV derepression is associated with lethality during early development has been a mystery. Here, we report that rapid and selective degradation of the heterochromatin adapter protein TRIM28 triggers dissociation of transcriptional condensates from loci encoding super-enhancer (SE)-driven pluripotency genes and their association with transcribed ERV loci in murine embryonic stem cells. Knockdown of ERV RNAs or forced expression of SE-enriched transcription factors rescued condensate localization at SEs in TRIM28-degraded cells.
Non-coding variants have long been recognized as important contributors to common disease risks, but with the expansion of clinical whole genome sequencing, examples of rare, high-impact non-coding variants are also accumulating. Despite recent advances in the study of regulatory elements and the availability of specialized data collections, the systematic annotation of non-coding variants from genome sequencing remains challenging. Here, we propose a new framework for the prioritization of non-coding regulatory variants that integrates information about regulatory regions with prediction scores and HPO-based prioritization.
Gene expression is controlled at multiple levels, including RNA transcription and turnover. But determining the relative contributions of RNA biogenesis and decay to the steady-state abundance of cellular transcripts remains challenging because conventional transcriptomics approaches do not provide the temporal resolution to derive the kinetic parameters underlying steady-state gene expression. Here, we describe a protocol that combines metabolic RNA labeling by 4-thiouridine with chemical nucleoside conversion and whole-transcriptome sequencing followed by bioinformatics analysis to determine RNA stability in cultured cells at a genomic scale.
Bioinformatics
December 2021
Motivation: Clinical decision making is increasingly guided by accurate and recurrent determination of the presence and frequency of (somatic) variants and their haplotype through panel sequencing of disease-relevant genomic regions. Haplotype calling (phasing), however, is difficult and error prone unless variants are located on the same read, which limits the ability of short-read sequencing to detect, e.g.
The transformation of chronic lymphocytic leukemia (CLL) to high-grade B-cell lymphoma is known as Richter syndrome (RS), a rare event with dismal prognosis. In this study, we conducted whole-genome sequencing (WGS) of paired circulating CLL (PB-CLL) and RS biopsies (tissue-RS) from 17 patients recruited into a clinical trial (CHOP-O). We found that tissue-RS was enriched for mutations in poor-risk CLL drivers and genes in the DNA damage response (DDR) pathway.
Ann Clin Transl Neurol
November 2020
Objective: To assess class I-histone deacetylase (HDAC) inhibition on formation of lipid-accumulating, disease-promoting phagocytes upon myelin load in vitro, relevant for neuroinflammatory disorders like multiple sclerosis (MS) and cerebral X-linked adrenoleukodystrophy (X-ALD).
Methods: Immunohistochemistry on postmortem brain tissue of acute MS (n = 6) and cerebral ALD (n = 4) cases to analyze activation and foam cell state of phagocytes. RNA-Seq of in vitro differentiated healthy macrophages (n = 8) after sustained myelin-loading to assess the metabolic shift associated with foam cell formation.
Liquid biopsies as a minimally invasive approach have the potential to revolutionize molecular diagnostics. Yet, although protocols for sample handling and the isolation of circulating tumor DNA (ctDNA) are numerous, comprehensive guidelines for diagnostics and research considering all aspects of real-life multicenter clinical studies are currently not available. These include limitations in sample volume, transport, and blood collection tubes.
In animals, the most common type of RNA editing is the deamination of adenosines (A) into inosines (I). Because inosines basepair with cytosines (C), they are interpreted as guanosines (G) by the cellular machinery and genomically encoded G alleles at edited sites mimic the function of edited RNAs. The contribution of this hardwiring effect on genome evolution remains obscure.
Proc Natl Acad Sci U S A
January 2020
The complement system is a crucial part of innate immune defenses against invading pathogens. The blood-meal of the tick lasts for days, and the tick must therefore rely on inhibitors to counter complement activation. We have identified a class of inhibitors from tick saliva, the CirpT family, and generated detailed structural data revealing their mechanism of action.
Purpose: The translation of genome sequencing into routine health care has been slow, partly because of concerns about affordability. The aspirational cost of sequencing a genome is $1000, but there is little evidence to support this estimate. We estimate the cost of using genome sequencing in routine clinical care in patients with cancer or rare diseases.
Whole-genome sequencing (WGS) is becoming widely used in clinical medicine in diagnostic contexts and to inform treatment choice. Here we evaluate the potential of the Oxford Nanopore Technologies (ONT) MinION long-read sequencer for routine WGS by sequencing the reference sample NA12878 and the genome of an individual with ataxia-pancytopenia syndrome and severe immune dysregulation. We develop and apply a novel reference panel-free analytical method to infer and then exploit phase information, which improves single-nucleotide variant (SNV) calling performance from otherwise modest levels.
RNA binding proteins (RBPs) mediate posttranscriptional gene regulatory events throughout development. During neurogenesis, many RBPs are required for proper dendrite morphogenesis within Drosophila sensory neurons. Despite their fundamental role in neuronal morphogenesis, little is known about the molecular mechanisms in which most RBPs participate during neurogenesis.
Summary: Genomic interval sets are fundamental elements of genome annotation and are the output of countless bioinformatics applications. Nevertheless, tool support for the manual curation of these data is currently limited. We developed VARAN-GIE, an extension of the popular Integrative Genomics Viewer (IGV) that adds functionality to edit, annotate and merge genomic interval sets.
X-linked adrenoleukodystrophy is caused by ATP-binding cassette transporter D1 (ABCD1) mutations and manifests by default as slowly progressive spinal cord axonopathy with associated demyelination (adrenomyeloneuropathy). In 60% of male cases, however, X-linked adrenoleukodystrophy converts to devastating cerebral inflammation and demyelination (cerebral adrenoleukodystrophy) with infiltrating blood-derived monocytes and macrophages and cytotoxic T cells that can only be stopped by allogeneic haematopoietic stem cell transplantation or gene therapy at an early stage of the disease. Recently, we identified monocytes/macrophages but not T cells to be severely affected metabolically by ABCD1 deficiency.
Cold Spring Harb Mol Case Stud
April 2018
Next-generation sequencing (NGS) efforts have established catalogs of mutations relevant to cancer development. However, the clinical utility of this information remains largely unexplored. Here, we present the results of the first eight patients recruited into a clinical whole-genome sequencing (WGS) program in the United Kingdom.
Purpose: Fresh-frozen (FF) tissue is the optimal source of DNA for whole-genome sequencing (WGS) of cancer patients. However, it is not always available, limiting the widespread application of WGS in clinical practice. We explored the viability of using formalin-fixed, paraffin-embedded (FFPE) tissues, available routinely for cancer patients, as a source of DNA for clinical WGS.
In search of RNA signals that modulate transcription via direct interaction with RNA polymerase (RNAP), we deep sequenced an E. coli genomic library enriched for RNAP-binding RNAs. Many natural RNAP-binding aptamers, termed RAPs, were mapped to the genome.