Background: The gene C9orf72 harbors a non-coding hexanucleotide repeat expansion known to cause amyotrophic lateral sclerosis and frontotemporal dementia. While previous studies have estimated the length of this repeat expansion in multiple tissues, technological limitations have prevented researchers from exploring additional features, such as methylation levels.
Methods: We aimed to characterize C9orf72 repeat expansions using a targeted, amplification-free long-read sequencing method.
Motivation: Alternative splicing generates multiple RNA isoforms from a single gene, enriching genetic diversity and impacting gene function. Effective visualization of these isoforms and their expression patterns is crucial but challenging due to limitations in existing tools. Traditional genome browsers lack programmability, while other tools offer limited customization, produce static plots, or cannot simultaneously display structures and expression levels.
Determining whether the RNA isoforms from medically relevant genes have distinct functions could facilitate direct targeting of RNA isoforms for disease treatment. Here, as a step toward this goal for neurological diseases, we sequenced 12 postmortem, aged human frontal cortices (6 Alzheimer disease cases and 6 controls; 50% female) using one Oxford Nanopore PromethION flow cell per sample. We identified 1,917 medically relevant genes expressing multiple isoforms in the frontal cortex where 1,018 had multiple isoforms with different protein-coding sequences.
Even though alternative RNA splicing was discovered nearly 50 years ago (1977), we still understand very little about most isoforms arising from a single gene, including in which tissues they are expressed and if their functions differ. Human gene annotations suggest remarkable transcriptional complexity, with approximately 252,798 distinct RNA isoform annotations from 62,710 gene bodies (Ensembl v109; 2023), emphasizing the need to understand their biological effects. For example, 256 gene bodies have ≥50 annotated isoforms and 30 have ≥100, where one protein-coding gene () even has 192 distinct RNA isoform annotations.
Due to alternative splicing, human protein-coding genes average over eight RNA isoforms, resulting in nearly four distinct protein coding sequences per gene. Long-read RNAseq (IsoSeq) enables more accurate quantification of isoforms, shedding light on their specific roles. To assess the medical relevance of measuring RNA isoform expression, we sequenced 12 aged human frontal cortices (6 Alzheimer's disease cases and 6 controls; 50% female) using one Oxford Nanopore PromethION flow cell per sample.
The process of identifying suitable genome-wide association (GWA) studies and formatting the data to calculate multiple polygenic risk scores on a single genome can be laborious. Here, we present a centralized polygenic risk score calculator currently containing over 250,000 genetic variant associations from the NHGRI-EBI GWAS Catalog for users to easily calculate sample-specific polygenic risk scores with comparable results to other available tools. Polygenic risk scores are calculated either online through the Polygenic Risk Score Knowledge Base (PRSKB; https://prs.
Protein-protein functional interactions arise from either transitory or permanent biomolecular associations and often lead to the coevolution of the interacting residues. Although mutual information has traditionally been used to identify coevolving residues within the same protein, its application between coevolving proteins remains largely uncharacterized. Therefore, we developed the Protein Interactions Calculator (PIC) to efficiently identify coevolving residues between two protein sequences using mutual information.
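As a rough illustration of the mutual-information idea mentioned above, the following minimal sketch scores how strongly two alignment columns covary (for example, one residue position from each of two proteins, sampled across the same set of species). It is not the PIC implementation: the function name `mutual_information`, its inputs, and the use of plain frequency counts without any correction are all assumptions made for this example.

```python
from collections import Counter
from math import log2
from typing import Sequence


def mutual_information(column_a: Sequence[str], column_b: Sequence[str]) -> float:
    """Mutual information (in bits) between two equally long residue columns.

    Hypothetical helper for illustration only: position i of each column is
    assumed to come from the same species or sample.
    """
    if len(column_a) != len(column_b) or len(column_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")

    n = len(column_a)
    count_a = Counter(column_a)                  # marginal counts, column A
    count_b = Counter(column_b)                  # marginal counts, column B
    count_ab = Counter(zip(column_a, column_b))  # joint counts of residue pairs

    mi = 0.0
    for (a, b), joint in count_ab.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (joint / n) * log2(joint * n / (count_a[a] * count_b[b]))
    return mi


# Two columns that always change together carry 1 bit of shared information;
# two columns that vary independently carry none.
covarying = mutual_information("AAGG", "LLVV")
independent = mutual_information("AAGG", "LVLV")
```

Real coevolution analyses typically add corrections for phylogenetic relatedness and small sample sizes, which this sketch omits.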
Ancestral RNA polymerase III (Pol III) is a multi-subunit polymerase responsible for transcription of short non-coding RNA, such as double-stranded short interspersed nuclear elements (SINEs). Although SINE ncRNAs are generally transcriptionally repressed, they can be induced in response to viral infections and can stimulate immune signaling pathways. Indeed, mutations in RNA Pol III have been associated with poor antiviral interferon response following infection with varicella zoster virus (VZV).
Ramp sequences occur when the average translational efficiency of codons near the 5' end of highly expressed genes is significantly lower than the rest of the gene sequence, which counterintuitively increases translational efficiency by decreasing downstream ribosomal collisions. Here, we show that the relative codon adaptiveness within different tissues changes the existence of a ramp sequence without altering the underlying genetic code. We present the first comprehensive analysis of tissue and cell type-specific ramp sequences and report 3108 genes with ramp sequences that change between tissues and cell types, which corresponds with increased gene expression within those tissues and cells.
Background: Microglia, the resident immune cells of the brain, play a critical role in numerous diseases, but are a minority cell type and difficult to genetically manipulate in vivo with viral vectors and other approaches. Primary cultures allow a more controlled setting to investigate these cells, but morphological and transcriptional changes upon removal from their normal brain environment raise many caveats from in vitro studies.
Methods: To investigate whether cultured microglia recapitulate in vivo microglial signatures, we used single-cell RNA sequencing (scRNAseq) to compare microglia freshly isolated from the brain to primary microglial cultures.
The repetitive nature and complexity of some medically relevant genes pose a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly 400 medically relevant genes due to their repetitiveness or polymorphic complexity. Here, we characterize 273 of these 395 challenging autosomal genes using a haplotype-resolved whole-genome assembly.
The Alzheimer's Disease Neuroimaging Initiative (ADNI) contains extensive patient measurements (e.g., magnetic resonance imaging [MRI], biometrics, RNA expression, etc.
Polygenic scores (or genetic risk scores) quantify the aggregate of small effects from many common genetic loci that have been associated with a trait through genome-wide association. Polygenic scores were first used successfully in schizophrenia and have since been applied to multiple phenotypes including multiple sclerosis, rheumatoid arthritis, and height. Because human height is an easily measured and complex polygenic trait, polygenic height scores provide exciting insights into the predictability of aggregate common variant effect on the phenotype.
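To make the notion of aggregating many small effects concrete, here is a minimal sketch of the common additive form of a polygenic score: each variant's reported effect size is multiplied by the number of effect alleles the individual carries, and the products are summed. The function name `polygenic_score`, the variant identifiers, and the effect sizes are hypothetical values chosen for illustration, not taken from any study or tool described here.

```python
from typing import Mapping


def polygenic_score(effect_sizes: Mapping[str, float],
                    dosages: Mapping[str, int]) -> float:
    """Additive polygenic score: sum of effect size x effect-allele count.

    Hypothetical helper for illustration. `effect_sizes` maps a variant ID to
    its per-allele effect; `dosages` maps a variant ID to the number of effect
    alleles (0, 1, or 2) the individual carries. Variants missing from
    `dosages` are skipped.
    """
    score = 0.0
    for variant_id, beta in effect_sizes.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            continue  # variant not genotyped in this individual
        if dosage not in (0, 1, 2):
            raise ValueError(f"invalid dosage for {variant_id}: {dosage}")
        score += beta * dosage
    return score


# Made-up effect sizes and genotypes, for illustration only.
example_effects = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.125}
example_dosages = {"rs1": 2, "rs2": 1}  # rs3 not genotyped, so it is skipped
example_score = polygenic_score(example_effects, example_dosages)
```

In practice, scores are usually restricted to variants passing a significance threshold and pruned for linkage disequilibrium; those steps are left out of this sketch.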
Background: Many neurodegenerative diseases are caused by nucleotide repeat expansions, but most expansions, like the C9orf72 'GGGGCC' (G4C2) repeat that causes approximately 5-7% of all amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) cases, are too long to sequence using short-read sequencing technologies. It is unclear whether long-read sequencing technologies can traverse these long, challenging repeat expansions. Here, we demonstrate that two long-read sequencing technologies, Pacific Biosciences' (PacBio) and Oxford Nanopore Technologies' (ONT), can sequence through disease-causing repeats cloned into plasmids, including the FTD/ALS-causing G4C2 repeat expansion.
Alzheimer's disease (AD) is an age-associated neurodegenerative disease characterized by amyloidosis, tauopathy, and activation of microglia, the brain resident innate immune cells. We show that a RiboTag translational profiling approach can bypass biases due to cellular enrichment/cell sorting. Using this approach in models of amyloidosis, tauopathy, and aging, we revealed a common set of alterations and identified a central APOE-driven network that converged on CCL3 and CCL4 across all conditions.
The major genetic cause of frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS) is a C9orf72 G4C2 repeat expansion. Proposed mechanisms by which the expansion causes c9FTD/ALS include toxicity from repeat-containing RNA and from dipeptide repeat proteins translated from these transcripts. To investigate the contribution of poly(GR) dipeptide repeat proteins to c9FTD/ALS pathogenesis in a mammalian in vivo model, we generated mice that expressed GFP-(GR) in the brain.
Corticobasal degeneration (CBD) is a clinically heterogeneous tauopathy, which has overlapping clinicopathologic and genetic characteristics with progressive supranuclear palsy (PSP). This study aimed to elucidate whether transactive response DNA-binding protein of 43 kDa (TDP-43) pathology contributes to clinicopathologic heterogeneity of CBD. Paraffin-embedded sections of the midbrain, pons, subthalamic nucleus, and basal forebrain from 187 autopsy-confirmed CBD cases were screened with immunohistochemistry for phospho-TDP-43.
Amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) are two devastating and lethal neurodegenerative diseases seen comorbidly in up to 15% of patients. Despite several decades of research, no effective treatment or disease-modifying strategies have been developed. We now understand more than before about the genetics and biology behind ALS and FTD, but the genetic etiology for the majority of patients is still unknown and the phenotypic variability observed across patients, even those carrying the same mutation, is enigmatic.
Background: While age and the APOE ε4 allele are major risk factors for Alzheimer's disease (AD), a small percentage of individuals with these risk factors exhibit AD resilience by living well beyond 75 years of age without any clinical symptoms of cognitive decline.
Methods: We used over 200 "AD resilient" individuals and an innovative, pedigree-based approach to identify genetic variants that segregate with AD resilience. First, we performed linkage analyses in pedigrees with resilient individuals and a statistical excess of AD deaths.
We previously found that C9orf72-associated (c9ALS) and sporadic amyotrophic lateral sclerosis (sALS) brain transcriptomes comprise thousands of defects, some of which are likely key contributors to ALS pathogenesis. We have now generated complementary methylome data and combined these two data sets to perform a comprehensive "multi-omic" analysis to clarify the molecular mechanisms initiating RNA misregulation in ALS. We found that c9ALS and sALS patients have generally distinct but overlapping methylome profiles, and that the c9ALS- and sALS-affected genes and pathways have similar biological functions, indicating conserved pathobiology in disease.