Designing effective and accurate tools for identifying the functional and structural elements in a genome remains at the frontier of genome annotation, owing to incompleteness and inaccuracy of the data, limitations in the computational models, and shifting paradigms in genomics, such as alternative splicing. We present a methodology for the automated annotation of genes and their alternatively spliced mRNA transcripts based on existing cDNA and protein sequence evidence from the same species or projected from a related species using syntenic mapping information. At the core of the method is the splice graph, a compact representation of a gene, its exons, its introns, and its alternatively spliced isoforms. The putative transcripts are enumerated from the graph and assigned confidence scores based on the strength of sequence evidence, and a subset of the high-scoring candidates is selected and promoted into the annotation. The method is highly selective, eliminating the unlikely candidates while retaining 98% of the high-quality mRNA evidence in well-formed transcripts, and produces annotation that is measurably more accurate than some evidence-based gene sets. The process is fast, accurate, and fully automated, and it combines the traditionally distinct gene annotation and alternative splicing detection processes in a comprehensive and systematic way, thus considerably aiding the ensuing manual curation efforts.
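
The splice-graph idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the exon coordinates, and the scoring rule (a transcript scores as its weakest intron, counted by the number of supporting evidence alignments) are assumptions chosen only to make the enumerate, score, and select steps concrete.

```python
from collections import defaultdict


def build_splice_graph(evidence):
    """Build a splice graph from aligned evidence (hypothetical input format).

    evidence: list of alignments, each a list of (start, end) exons in
    genomic order. Nodes are exons; edges are introns joining consecutive
    exons, weighted by how many alignments support them.
    """
    nodes = set()
    edges = defaultdict(int)
    for exons in evidence:
        nodes.update(exons)
        for a, b in zip(exons, exons[1:]):
            edges[(a, b)] += 1
    return nodes, dict(edges)


def enumerate_transcripts(nodes, edges):
    """Enumerate every path from an exon with no predecessor to one with no successor."""
    succ = defaultdict(list)
    has_pred = set()
    for a, b in edges:
        succ[a].append(b)
        has_pred.add(b)
    paths = []

    def walk(path):
        nxt = succ.get(path[-1])
        if not nxt:  # terminal exon: the path is a complete candidate
            paths.append(tuple(path))
            return
        for n in sorted(nxt):
            walk(path + [n])

    for source in sorted(n for n in nodes if n not in has_pred):
        walk([source])
    return paths


def score(path, edges):
    """Assumed scoring rule: support of the weakest intron (0 for single-exon paths)."""
    return min((edges[(a, b)] for a, b in zip(path, path[1:])), default=0)


def select(paths, edges, min_score):
    """Keep candidates whose score reaches the (hypothetical) threshold."""
    return [p for p in paths if score(p, edges) >= min_score]


# Toy evidence: two alignments include the middle exon, one skips it.
A, B, C = (100, 200), (300, 400), (500, 600)
evidence = [[A, B, C], [A, C], [A, B, C]]
nodes, edges = build_splice_graph(evidence)
candidates = enumerate_transcripts(nodes, edges)
promoted = select(candidates, edges, min_score=2)
```

In this toy case the graph yields two candidate isoforms, the full three-exon transcript and the exon-skipping one, and only the former survives a threshold of 2. The number of paths can grow exponentially with the number of alternative exons, which is one reason a selection step is needed after enumeration.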

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC540277 (PMC)
http://dx.doi.org/10.1101/gr.2889405 (DOI)
