Augmenting Transcriptome Annotations through the Lens of Splicing Evolution.

bioRxiv

Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.

Published: November 2024

Alternative splicing (AS) is a ubiquitous mechanism in eukaryotes; an estimated 90% of human genes are alternatively spliced. Despite enormous efforts, transcriptome annotations remain incomplete. Conventional annotation has been driven largely by experimental data such as RNA-seq and protein sequences, with little attention paid to understanding transcriptomes and alternative splicing from an evolutionary perspective. This study addresses this gap by presenting TENNIS (Transcript EvolutioN for New Isoform Splicing), an evolution-based model that predicts unannotated isoforms and refines existing annotations without requiring additional data. TENNIS rests on two minimal premises: AS isoforms evolve sequentially from existing isoforms, and each evolutionary step involves a single AS event. We formulate the identification of missing transcripts as an optimization problem and parsimoniously find the minimal number of novel transcripts. Our analysis shows that approximately 80% of multi-transcript groups from six transcriptome annotations satisfy our evolutionary model. At a high confidence level, 40% of isoforms predicted by TENNIS were validated by deep long-read RNA-seq. In a simulated incomplete-annotation scenario, TENNIS outperforms two randomized baseline approaches by a 2.25- to 3-fold change in precision at the baselines' level of recall, or by a 3.5- to 3.9-fold change in recall at the baselines' level of precision. These results demonstrate that TENNIS effectively identifies missing transcripts while relying only on minimal premises, offering a powerful approach to transcriptome augmentation through the lens of alternative splicing evolution. TENNIS is freely available at https://github.com/Shao-Group/tennis.
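The two premises can be illustrated with a toy sketch. This is not the TENNIS implementation or its optimization formulation: it assumes each isoform is a set of integer exon IDs, treats the inclusion or skipping of one exon as the only kind of AS event, and uses hypothetical function names (`single_event_apart`, `satisfies_model`, `bridging_candidates`). Under those assumptions, a transcript group fits the model when its isoforms form one connected graph of single-event steps, and a missing isoform is one whose addition connects an otherwise disconnected group.

```python
from itertools import combinations

def single_event_apart(a: frozenset, b: frozenset) -> bool:
    """Toy AS event (assumption): two isoforms differ by exactly one exon."""
    return len(a ^ b) == 1

def components(isoforms):
    """Connected components of the graph whose edges are single AS events."""
    isoforms = list(isoforms)
    parent = list(range(len(isoforms)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(isoforms)), 2):
        if single_event_apart(isoforms[i], isoforms[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, iso in enumerate(isoforms):
        groups.setdefault(find(i), []).append(iso)
    return list(groups.values())

def satisfies_model(isoforms) -> bool:
    """A group fits the toy model if every isoform is reachable by single events."""
    return len(components(isoforms)) == 1

def bridging_candidates(isoforms):
    """Single unannotated isoforms whose addition makes the group connected."""
    known = set(isoforms)
    exons = frozenset().union(*known)
    candidates = set()
    for iso in known:
        for exon in exons:
            cand = iso ^ {exon}  # toggle one exon in or out
            if cand and cand not in known and satisfies_model(known | {cand}):
                candidates.add(cand)
    return candidates

# A complete chain: each isoform is one exon-skipping step from the previous one.
complete = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 4}), frozenset({1, 4})]
# The same group with the intermediate isoform missing from the annotation.
incomplete = [frozenset({1, 2, 3, 4}), frozenset({1, 4})]

print(satisfies_model(complete))
print(satisfies_model(incomplete))
print(bridging_candidates(incomplete))
```

In the incomplete group, the two annotated isoforms differ by two exons, so the toy model asks for one intermediate isoform. Two candidates are equally parsimonious here, which hints at why a real method needs further criteria to rank predictions.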

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11580973 (PMC)
http://dx.doi.org/10.1101/2024.11.04.621892 (DOI)
