Annotation of polyadenylation sites from short-read RNA sequencing alone is a challenging computational task. Other algorithms rooted in DNA sequence predict potential polyadenylation sites; however, in vivo expression of a particular site varies with many conditions. Here, we introduce aptardi (alternative polyadenylation transcriptome analysis from RNA-Seq data and DNA sequence information), which leverages both DNA sequence and RNA sequencing in a machine learning paradigm to predict expressed polyadenylation sites. Specifically, aptardi takes as input DNA nucleotide sequence, genome-aligned RNA-Seq data, and an initial transcriptome. The program evaluates these initial transcripts to identify expressed polyadenylation sites in the biological sample and refines transcript 3'-ends accordingly. The average precision of the aptardi model is twice that of a standard transcriptome assembler. In particular, the recall of the aptardi model (the proportion of true polyadenylation sites detected by the algorithm) is improved by more than threefold. The model, trained using the Human Brain Reference RNA commercial standard, also performs well when applied to RNA-sequencing samples from different tissues and different mammalian species. Finally, aptardi's input is simple to compile, and its output is easily amenable to downstream analyses such as quantitation and differential expression.
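
As a minimal sketch of what the precision and recall figures above measure, the following Python function scores a set of predicted polyadenylation site coordinates against a set of reference sites. The function name `precision_recall`, the 50-base `tolerance` window, and the rule that a site counts as matched when any site from the other set lies within that window are all illustrative assumptions; this is not aptardi's evaluation procedure, and it does not compute the average precision reported above.

```python
from bisect import bisect_left


def precision_recall(predicted, truth, tolerance=50):
    """Score predicted site coordinates against reference coordinates.

    Hypothetical matching rule (an assumption, not taken from aptardi):
    a site is matched if a site from the other set lies within
    `tolerance` bases of it on the same chromosome and strand.
    """
    pred_sorted = sorted(predicted)
    truth_sorted = sorted(truth)

    def has_neighbor(pos, sorted_positions):
        # Find the first position >= pos - tolerance, then check it is <= pos + tolerance.
        i = bisect_left(sorted_positions, pos - tolerance)
        return i < len(sorted_positions) and sorted_positions[i] <= pos + tolerance

    # Precision: fraction of predicted sites that fall near a true site.
    supported = sum(has_neighbor(p, truth_sorted) for p in pred_sorted)
    # Recall: fraction of true sites that were detected by some prediction.
    detected = sum(has_neighbor(t, pred_sorted) for t in truth_sorted)

    precision = supported / len(pred_sorted) if pred_sorted else 0.0
    recall = detected / len(truth_sorted) if truth_sorted else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy coordinates for a single chromosome and strand (made-up values).
    predicted_sites = [100, 480, 2000]
    true_sites = [120, 500, 900, 1500]
    print(precision_recall(predicted_sites, true_sites))
```

With these toy coordinates, two of the three predictions fall near a reference site and two of the four reference sites are detected, so recall is the quantity that rises when an annotation method recovers more of the expressed sites.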

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7955126
DOI: http://dx.doi.org/10.1038/s41467-021-21894-x

Similar Publications

Objective: The Heat Shock Protein 70 (HSP70) family is a highly conserved group of molecular chaperones essential for maintaining cellular homeostasis. These proteins are necessary for protein folding, assembly, and degradation and are involved in cell recovery from stress conditions. HSP70 proteins are upregulated in response to heat shock, oxidative stress, and pathogenic infections.

CPSF1 inhibition promotes widespread use of intergenic polyadenylation sites and impairs glycolysis in prostate cancer cells.

Cell Rep

January 2025

Masonic Cancer Center, University of Minnesota, Minneapolis, MN 55455, USA; Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA.

Localized prostate cancer can be cured by radiation or surgery, but advanced prostate cancer continues to be a clinical challenge. Altered alternative polyadenylation occurs in numerous cancers and can downregulate tumor-suppressor genes and upregulate oncogenes. We found that the cleavage and polyadenylation specificity factor (CPSF) complex factor CPSF1 is upregulated in patients with advanced prostate cancer, with high CPSF1 expression correlating with worse progression-free survival.

A splice donor in influences keratinocyte immortalization by beta-HPV49.

J Virol

January 2025

Institute for Medical Virology and Epidemiology of Viral Diseases, University of Tuebingen, Tuebingen, Germany.

Human papillomaviruses (HPV) from the genus beta have been implicated in the development of cutaneous squamous cell cancer in and organ transplant patients. In contrast to alpha-high-risk HPV, which cause ano-genital and oropharyngeal cancers, beta-HPV replication is not well understood. The beta-HPV49 transcriptome was analyzed by RNA sequencing using stable keratinocyte cell lines maintaining high levels of extrachromosomally replicating E8- genomes, which can be established due to a lack of the viral E8^E2 repressor protein.

Background: Glioblastoma multiforme (GBM) is characterized by its cellular complexity, with a microenvironment consisting of diverse cell types, including oligodendrocyte precursor cells (OPCs) and neoplastic CD133+ radial glia-like cells. This study focuses on exploring the distinct cellular transitions in GBM, emphasizing the role of alternative polyadenylation (APA) in modulating microRNA-binding and post-transcriptional regulation.

Results: Our research identified unique APA profiles that signify the transitional phases between neoplastic cells and OPCs, underscoring the importance of APA in cellular identity and transformation in GBM.

Plasmodium falciparum is an obligate human parasite of the phylum Apicomplexa and is the causative agent of the most lethal form of human malaria. Although N6-methyladenosine (m6A) modification is thought to be one of the major post-transcriptional regulatory mechanisms for stage-specific gene expression in apicomplexan parasites, the precise base position of m6A in mRNAs or noncoding RNAs in these parasites remains unknown. Here, we report global nucleotide-resolution mapping of m6A residues in P. falciparum using DART-seq technology, which quantitatively displayed a stage-specific, dynamic distribution pattern with enrichment near mRNA 3' ends.
