AIDE: annotation-assisted isoform discovery with high precision.

Genome Res

Department of Statistics, University of California, Los Angeles, California 90095, USA.

Published: December 2019

Genome-wide accurate identification and quantification of full-length mRNA isoforms is crucial for investigating transcriptional and posttranscriptional regulatory mechanisms of biological phenomena. Despite continuing efforts to develop effective computational tools that identify or assemble full-length mRNA isoforms from second-generation RNA-seq data, accurately identifying mRNA isoforms from short sequence reads remains a challenge owing to the substantial information loss in RNA-seq experiments. Here, we introduce a novel statistical method, annotation-assisted isoform discovery (AIDE), the first approach that directly controls false isoform discoveries by implementing the testing-based model selection principle. Solving the isoform discovery problem in a stepwise and conservative manner, AIDE prioritizes the annotated isoforms and precisely identifies novel isoforms whose addition significantly improves the explanation of observed RNA-seq reads. We evaluate the performance of AIDE on multiple simulated and real RNA-seq data sets, followed by PCR-Sanger sequencing validation. Our results show that AIDE effectively leverages the annotation information to compensate for the information loss owing to short read lengths. AIDE achieves the highest precision in isoform discovery and the lowest error rates in isoform abundance estimation compared with three state-of-the-art methods: Cufflinks, SLIDE, and StringTie. As a robust bioinformatics tool for transcriptome analysis, AIDE enables researchers to discover novel transcripts with high confidence.
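The testing-based, stepwise selection idea described in the abstract can be illustrated with a toy sketch. This is not AIDE's actual implementation: the read-bin mixture model, the EM fit, the chi-square (1 degree of freedom) likelihood ratio test, the 1e-12 probability floor, and the names `fit_mixture` and `stepwise_select` are all assumptions made for illustration. Annotated isoforms are always kept, and a novel candidate is added only if it significantly improves the likelihood of the observed read counts.

```python
import math

def fit_mixture(counts, profiles, iters=200):
    """Fit mixture weights by EM; return (weights, log_likelihood).

    counts[b]      : number of reads observed in bin b
    profiles[j][b] : probability that isoform j emits a read in bin b
    """
    k = len(profiles)
    w = [1.0 / k] * k
    for _ in range(iters):
        new_w = [0.0] * k
        for b, n in enumerate(counts):
            mix = sum(w[j] * profiles[j][b] for j in range(k))
            if n == 0 or mix <= 0.0:
                continue  # empty bin, or no isoform in the model explains it
            for j in range(k):
                new_w[j] += n * w[j] * profiles[j][b] / mix
        total = sum(new_w)
        if total == 0.0:
            break
        w = [x / total for x in new_w]
    ll = 0.0
    for b, n in enumerate(counts):
        if n > 0:
            mix = sum(w[j] * profiles[j][b] for j in range(k))
            # Assumed floor so unexplained reads give a large finite penalty.
            ll += n * math.log(max(mix, 1e-12))
    return w, ll

def stepwise_select(counts, annotated, novel, alpha=0.01):
    """Forward selection: start from annotated isoforms, then add novel
    candidates one at a time while the likelihood ratio test is significant."""
    selected = dict(annotated)
    remaining = dict(novel)
    _, ll = fit_mixture(counts, list(selected.values()))
    while remaining:
        best_name, best_stat, best_ll = None, -1.0, ll
        for name, profile in remaining.items():
            _, ll_new = fit_mixture(counts, list(selected.values()) + [profile])
            stat = max(0.0, 2.0 * (ll_new - ll))
            if stat > best_stat:
                best_name, best_stat, best_ll = name, stat, ll_new
        # Survival function of a chi-square with 1 df (assumed null distribution).
        p_value = math.erfc(math.sqrt(best_stat / 2.0))
        if p_value >= alpha:
            break  # no remaining candidate is significant: stop conservatively
        selected[best_name] = remaining.pop(best_name)
        ll = best_ll
    return list(selected)

# Hypothetical toy data: four read bins, one annotated and two novel candidates.
counts = [300, 500, 200, 0]
annotated = {"A": [0.5, 0.5, 0.0, 0.0]}
novel = {
    "N1": [0.0, 0.5, 0.5, 0.0],    # explains reads in bin 2 that A cannot
    "N2": [0.45, 0.55, 0.0, 0.0],  # nearly redundant with A
}
result = stepwise_select(counts, annotated, novel)
print(result)
```

In this toy setup the first test favors the candidate that explains otherwise unexplained reads; once it is in the model, the nearly redundant candidate no longer yields a significant improvement and is left out, which mirrors the conservative behavior the abstract describes.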


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6886511
DOI: http://dx.doi.org/10.1101/gr.251108.119


