Background: mRNA polyadenylation is an essential step of pre-mRNA processing in eukaryotes. Accurate prediction of the pre-mRNA 3'-end cleavage/polyadenylation sites is important for defining the gene boundaries and understanding gene expression mechanisms.

Results: 28,761 mapped human poly(A) sites were classified into three classes according to whether they contain different known forms of the polyadenylation signal (PAS) or none of them (PAS-strong, PAS-weak and PAS-less, respectively), and a new computer program, POLYAR, was developed to predict poly(A) sites of each class. Compared with polya_svm (to date the most accurate computer program for predicting poly(A) sites) in a search for PAS-strong poly(A) sites in human sequences, POLYAR had significantly higher prediction sensitivity (80.8% versus 65.7%) and specificity (66.4% versus 51.7%). However, when a similar search was conducted for PAS-weak and PAS-less poly(A) sites, both programs had very low prediction accuracy, which indicates that our knowledge of the factors that determine poly(A) sites is not sufficient to identify such polyadenylation regions.
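
The three-class split described above can be illustrated with a minimal sketch. The abstract does not list which hexamers count as strong or weak signals, nor how far upstream of the cleavage site a signal is searched for, so the hexamer sets, the 40-nt window and the function name below are illustrative assumptions rather than POLYAR's actual parameters.

```python
# Hypothetical sketch: the hexamer sets and the window size are assumptions
# made for illustration; they are not taken from the POLYAR abstract.
STRONG_PAS = {"AATAAA", "ATTAAA"}  # assumed "strong" signal forms
WEAK_PAS = {                       # assumed "weak" single-nucleotide variants
    "AGTAAA", "TATAAA", "CATAAA", "GATAAA", "AATATA",
    "AATACA", "AATAGA", "ACTAAA", "AAGAAA", "AATGAA",
}


def classify_poly_a_site(sequence, cleavage_index, window=40):
    """Return 'PAS-strong', 'PAS-weak' or 'PAS-less' for one poly(A) site.

    sequence: DNA or RNA string containing the site.
    cleavage_index: 0-based position of the cleavage site in `sequence`.
    window: number of upstream nucleotides to scan (assumed value).
    """
    start = max(0, cleavage_index - window)
    # Normalise to upper-case DNA so RNA input (U) is handled as well.
    upstream = sequence[start:cleavage_index].upper().replace("U", "T")
    # Collect every hexamer that occurs in the upstream window.
    hexamers = {upstream[i:i + 6] for i in range(len(upstream) - 5)}
    if hexamers & STRONG_PAS:
        return "PAS-strong"
    if hexamers & WEAK_PAS:
        return "PAS-weak"
    return "PAS-less"
```

A strong signal takes precedence over a weak one in this sketch, so a site with both kinds of hexamer in its window is reported as PAS-strong; whether POLYAR resolves such cases the same way is not stated in the abstract.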

Conclusions: We present a new classification of polyadenylation sites into three classes and a novel computer program, POLYAR, for predicting poly(A) sites/regions of each class. In tests, POLYAR predicts PAS-strong poly(A) sites with high accuracy; its efficiency in searching for PAS-weak and PAS-less poly(A) sites is not very high, but is comparable to that of other available programs. These findings suggest that additional characteristics of such poly(A) sites remain to be elucidated. POLYAR, including a stand-alone version for download, is available at http://cub.comsats.edu.pk/polyapredict.htm.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3053588
DOI: http://dx.doi.org/10.1186/1471-2164-11-646

Publication Analysis

Top Keywords

polya sites: 40
computer program: 16
prediction polya: 16
sites: 12
pas-weak pas-less: 12
polya: 11
prediction: 8
program prediction: 8
sites human: 8
human sequences: 8

Similar Publications

Background: In trypanosomatids, a group of unicellular eukaryotes that includes numerous important human parasites, cis-splicing has been previously reported for only two genes: a poly(A) polymerase and an RNA helicase. Conversely, trans-splicing, which involves the attachment of a spliced leader sequence, is observed for nearly every protein-coding transcript. So far, our understanding of splicing in this protistan group has stemmed from the analysis of only a few medically relevant species.

In humans, misprocessed mRNAs containing intact 5' splice site (5'SS) motifs are nuclear retained and targeted for decay by ZFC3H1, a component of the Poly(A) Exosome Targeting complex, and U1-70K, a component of the U1 snRNP. In Schizosaccharomyces pombe, the ZFC3H1 homolog, Red1, binds to the YTH domain-containing protein Mmi1 and targets certain RNA transcripts to nuclear foci for nuclear retention and decay. Here we show that YTHDC1 and YTHDC2, two YTH domain-containing proteins that bind to N6-methyladenosine (m6A) modified RNAs, interact with ZFC3H1 and U1-70K, and are required for the nuclear retention of mRNAs with intact 5'SS motifs.

Background: Long non-coding RNAs (lncRNAs) play important roles in various biological processes, including stage development in plants. N6-methyladenosine (m6A) modification and polyadenylation are noteworthy regulatory processes that impact transcript functions by modulating their abundance. However, the specific landscapes of m6A modification and polyadenylation on lncRNAs remain largely unexplored.

RNA isoform diversity, produced via alternative splicing and alternative usage of transcription start and poly(A) sites, results in varied transcripts being derived from the same gene. Distinct isoforms can play important biological roles, including by changing the sequences or expression levels of protein products. The first single-cell approaches to RNA sequencing (and later, spatial approaches), which are now widely used for the identification of differentially expressed genes, rely on short reads and offer the ability to transcriptomically compare different cell types but are limited in their ability to measure differential isoform expression.

Post-transcriptional processing and modification of viral RNA, including alternative splicing, polyadenylation, and methylation, play crucial roles in regulating viral gene expression, enhancing genomic stability, and increasing replication efficiency. These processes have significant implications for viral biology and antiviral therapies. In this study, using Oxford Nanopore Technology (ONT) direct RNA sequencing (DRS), we provided a comprehensive analysis of the transcriptome and epitranscriptome features of the HIV-1 B (NL4-3) subtype strain and, for the first time, characterized these features in the CRF01_AE (GX2005002) subtype strain.
