Readon: a novel algorithm to identify read-through transcripts with long-read sequencing data.

Bioinformatics

Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China.

Published: June 2024

Motivation: The human genome contains many clustered, transcriptionally active regions in which the transcription complex fails to terminate at the upstream gene's termination site and instead continues through the intergenic region into downstream genes, producing read-through transcripts. Several studies have demonstrated regulatory roles for read-through transcripts in tumorigenesis and development. However, the short read length of next-generation sequencing has slowed their discovery. To exploit third-generation sequencing data, which are long but error-prone, this study developed a novel minimizer sketch algorithm that identifies read-through transcripts accurately and quickly.

Results: Readon first splits the reference sequence into distinct active regions. Within each region, it slides a window along the sequence, calculates minimizers, and builds specialized structured arrays for query indexing. Candidate read-through transcripts are first screened by their alignment anchors and then passed through further confirmation steps. In comparisons with existing software, Readon performs better on both simulated and validated real data. Two downstream tools are also provided: one predicts whether a read-through transcript is likely to undergo nonsense-mediated decay or to encode a protein, and the other visualizes splicing patterns.
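
The indexing idea described above (a sliding window, minimizers, and a lookup structure per active region) can be illustrated with a minimal sketch. This is not Readon's implementation: the function names `minimizers`, `build_index`, and `regions_hit`, the parameters `k` and `w`, the toy sequences, and the use of lexicographic order in place of a hash-based order are all assumptions made for illustration, and a plain dictionary stands in for the specialized structured arrays mentioned in the abstract.

```python
from collections import defaultdict


def minimizers(seq, k=5, w=4):
    """Return sorted (position, k-mer) pairs: for every window of w
    consecutive k-mers, keep the lexicographically smallest k-mer
    (leftmost on ties). Lexicographic order is an illustrative stand-in
    for the hash ordering real tools typically use."""
    n_kmers = len(seq) - k + 1
    picked = set()
    for start in range(n_kmers - w + 1):
        window = [(seq[i:i + k], i) for i in range(start, start + w)]
        kmer, pos = min(window)
        picked.add((pos, kmer))
    return sorted(picked)


def build_index(regions, k=5, w=4):
    """Map each minimizer k-mer to the (region name, position) pairs where
    it occurs. `regions` is a hypothetical dict of name -> sequence."""
    index = defaultdict(list)
    for name, seq in regions.items():
        for pos, kmer in minimizers(seq, k, w):
            index[kmer].append((name, pos))
    return index


def regions_hit(read, index, k=5, w=4):
    """Count, per region, how many index entries the read's minimizers
    match. A read with anchors in two adjacent regions would be a
    candidate for further checks (the checks themselves are not modelled)."""
    hits = defaultdict(int)
    for _, kmer in minimizers(read, k, w):
        for name, _ in index.get(kmer, []):
            hits[name] += 1
    return dict(hits)


if __name__ == "__main__":
    # Toy sequences, not real genes.
    regions = {"geneA": "ACGTTGCAAGGCTA", "geneB": "TTTCCCGGGAAATC"}
    index = build_index(regions, k=3, w=2)
    read = regions["geneA"][-8:] + regions["geneB"][:8]
    print(regions_hit(read, index, k=3, w=2))
```

In this toy run the read is built from the end of one region and the start of the other, so its minimizers find anchors in both. A real tool would additionally have to tolerate sequencing errors and confirm candidates by alignment, which the sketch does not attempt.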

Availability And Implementation: Readon is freely available on GitHub (https://github.com/Bulabula45/Readon).

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11162696
DOI: http://dx.doi.org/10.1093/bioinformatics/btae336
