Hidden treasures in unspliced EST data.

Theory Biosci

Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center of Bioinformatics, University of Leipzig, Härtelstrasse 16-18, 04107 Leipzig, Germany.

Published: May 2012

Several classes of exclusively, or at least predominantly, unspliced non-coding RNAs have been described in recent years, including totally and partially intronic transcripts and long intergenic RNAs. Functionally, they appear to be involved in regulating gene expression, at least in part by associating with chromatin. Intron-less transcripts have received little attention, even though recent findings indicate that intron-less protein-coding genes have several features that set them apart from the more abundant and much better understood spliced mRNAs. Even less is known about unspliced non-coding transcripts. We therefore systematically analyze the distribution of unspliced ESTs in the human genome. These ESTs form a large source of transcriptomic data that is almost always excluded from detailed studies. Most unspliced ESTs appear in clusters that overlap, or lie in close vicinity of, annotated RefSeq genes. Partially intronic unspliced ESTs show complex patterns of overlap with the intron/exon structure of the RefSeq gene. Distinctive patterns of CAGE tags indicate that a large class of unspliced EST clusters forms long extensions of 3'UTRs, at least several hundred of which probably also appear as independent 3'UTR-associated RNAs.
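As a rough illustration of the kind of analysis the abstract describes, the following minimal sketch groups EST alignments into clusters and labels each cluster by its position relative to one annotated gene. It is not the authors' pipeline: the function names, the half-open coordinate convention, the `max_gap` and `vicinity` parameters, and the category rules are all assumptions made for this example, and strand and chromosome handling are omitted.

```python
from typing import List, Tuple

# Assumption: half-open [start, end) coordinates on a single chromosome and strand.
Interval = Tuple[int, int]


def cluster_intervals(ests: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge EST alignments that overlap, touch, or lie within max_gap bases."""
    clusters: List[Interval] = []
    for start, end in sorted(ests):
        if clusters and start <= clusters[-1][1] + max_gap:
            prev_start, prev_end = clusters[-1]
            clusters[-1] = (prev_start, max(prev_end, end))
        else:
            clusters.append((start, end))
    return clusters


def classify(cluster: Interval, gene: Interval, exons: List[Interval],
             vicinity: int = 1000) -> str:
    """Label a cluster relative to one gene (hypothetical categories)."""
    c_start, c_end = cluster
    g_start, g_end = gene
    # No overlap with the gene body: decide by distance to the nearest gene end.
    if c_end <= g_start or c_start >= g_end:
        distance = g_start - c_end if c_end <= g_start else c_start - g_end
        return "vicinity" if distance <= vicinity else "intergenic"
    hit = [(s, e) for s, e in exons if c_start < e and s < c_end]
    if not hit:
        # Overlaps the gene body but no exon, so it lies inside an intron
        # (assumes the gene starts and ends with an exon).
        return "totally intronic"
    if any(s <= c_start and c_end <= e for s, e in hit):
        return "exonic"
    if g_start <= c_start and c_end <= g_end:
        return "partially intronic"
    # Overlaps an exon and runs past a gene boundary, e.g. a UTR extension.
    return "gene-extending"


if __name__ == "__main__":
    ests = [(100, 200), (150, 300), (5000, 5200),
            (9000, 9100), (9050, 9400), (20000, 20100)]
    gene = (1000, 10000)
    exons = [(1000, 1500), (4000, 4500), (9200, 10000)]
    for c in cluster_intervals(ests):
        print(c, classify(c, gene, exons))
```

A real analysis would additionally need per-chromosome and per-strand indexing and an interval tree or sorted sweep over all RefSeq genes; the linear scan here only keeps the example short.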

Source
http://dx.doi.org/10.1007/s12064-012-0151-6

Similar Publications

Evolution of the unspliced transcriptome.

BMC Evol Biol

August 2015

Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Haertelstraße 16-18, Leipzig, D-04107, Germany.

Background: Despite their abundance, unspliced EST data have received little attention as a source of information on non-coding RNAs. Very little is known, therefore, about the genomic distribution of unspliced non-coding transcripts and their relationship with the much better studied regularly spliced products. In particular, their evolution has remained virtually unstudied.

Background: Expressed sequences (e.g. ESTs) are a strong source of evidence to improve gene structures and predict reliable alternative splicing events.

Identification of novel exons and transcribed regions by chimpanzee transcriptome sequencing.

Genome Biol

December 2010

Department of Genetics and Pathology, Rudbeck laboratory, Uppsala University, SE-751 85 Uppsala, Sweden.

Background: We profile the chimpanzee transcriptome by using deep sequencing of cDNA from brain and liver, aiming to quantify expression of known genes and to identify novel transcribed regions.

Results: Using stringent criteria for transcription, we identify 12,843 expressed genes, with a majority being found in both tissues. We further identify 9,826 novel transcribed regions that are not overlapping with annotated exons, mRNAs or ESTs.

We studied the genomic positions of 38,129 putative ncRNAs from the RIKEN dataset in relation to protein-coding genes. We found that the dataset has 41% sense, 6% antisense, 24% intronic and 29% intergenic transcripts. Interestingly, 17,678 (47%) of the FANTOM3 transcripts were found to potentially be internally primed from longer transcripts.
