TALC: Transcript-level Aware Long-read Correction.

Bioinformatics

Department of Genome Dynamics, Institut de Génétique Humaine, Centre National de la Recherche Scientifique (CNRS), Université de Montpellier, Montpellier 34396, France.

Published: December 2020

Motivation: Long-read sequencing technologies are invaluable for determining complex RNA transcript architectures but are error-prone. Numerous 'hybrid correction' algorithms have been developed for genomic data that correct long reads by exploiting the accuracy and depth of short reads sequenced from the same sample. These algorithms are not suited for correcting more complex transcriptome sequencing data.

Results: We have created a novel reference-free algorithm called Transcript-level Aware Long-Read Correction (TALC) which models changes in RNA expression and isoform representation in a weighted De Bruijn graph to correct long reads from transcriptome studies. We show that transcript-level aware correction by TALC improves the accuracy of the whole spectrum of downstream RNA-seq applications and is thus necessary for transcriptome analyses that use long read technology.

Availability And Implementation: TALC is implemented in C++ and available at https://github.com/lbroseus/TALC.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Download full-text PDF

Source
http://dx.doi.org/10.1093/bioinformatics/btaa634DOI Listing

Publication Analysis

Top Keywords

transcript-level aware
12
aware long-read
8
long-read correction
8
correct long
8
long reads
8
correction talc
8
talc
4
talc transcript-level
4
correction motivation
4
motivation long-read
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!