Matataki: an ultrafast mRNA quantification method for large-scale reanalysis of RNA-Seq data.

BMC Bioinformatics

Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi, Japan.

Published: July 2018

Background: Data generated by RNA sequencing (RNA-Seq) is now accumulating in vast amounts in public repositories, especially for human and mouse genomes. Reanalyzing these data has emerged as a promising approach to identify gene modules or pathways. Although meta-analyses of gene expression data are frequently performed using microarray data, meta-analyses using RNA-Seq data are still rare. This lag is partly due to the limitations in reanalyzing RNA-Seq data, which requires extensive computational resources. Moreover, it is nearly impossible to calculate the gene expression levels of all samples in a public repository using currently available methods. Here, we propose a novel method, Matataki, for rapidly estimating gene expression levels from RNA-Seq data.

Results: The proposed method maps fragments to genes using k-mers that are unique to each gene. Because aligning fragments to reference sequences is computationally expensive, our method reduces the calculation cost by focusing on k-mers that are unique to each gene and by skipping uninformative regions. Indeed, Matataki outperformed conventional methods with regard to speed while demonstrating sufficient accuracy.
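
The idea of gene-unique k-mers can be illustrated with a minimal Python sketch. This is not the Matataki implementation; the function names, the toy sequences, and the `step` parameter (a crude stand-in for skipping positions in a read) are all hypothetical and chosen only for illustration. It assumes an index that keeps a k-mer only if it occurs in exactly one gene, and assigns a read to a gene only when every indexed k-mer found in the read points to the same gene.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_unique_index(genes, k):
    """Map each k-mer that occurs in exactly one gene to that gene.

    `genes` is a dict of gene name -> sequence (hypothetical input format).
    """
    owners = defaultdict(set)
    for gene, seq in genes.items():
        for km in kmers(seq, k):
            owners[km].add(gene)
    # Keep only k-mers owned by a single gene; shared k-mers are dropped.
    return {km: next(iter(g)) for km, g in owners.items() if len(g) == 1}


def count_reads(reads, index, k, step=1):
    """Count reads per gene using exact k-mer lookups instead of alignment.

    A read is counted only if all indexed k-mers it contains agree on one
    gene. `step` > 1 samples fewer positions per read (an assumption made
    here to mimic skipping positions, not the published strategy).
    """
    counts = defaultdict(int)
    for read in reads:
        hits = set()
        for i in range(0, len(read) - k + 1, step):
            gene = index.get(read[i:i + k])
            if gene is not None:
                hits.add(gene)
        if len(hits) == 1:
            counts[hits.pop()] += 1
    return dict(counts)


# Toy data: the two sequences share the 4-mer "GTTT".
genes = {"geneA": "ACGTACGTTT", "geneB": "GTTTGGGGCC"}
index = build_unique_index(genes, k=4)
reads = ["CGTACG", "TTGGGG", "AAAAAA", "CGTTTG"]
counts = count_reads(reads, index, k=4)
```

In this toy run, the first read is assigned to `geneA`, the second to `geneB`, the third matches nothing, and the fourth is discarded because its k-mers point to both genes. Each lookup is a hash-table access, which is where the saving over alignment would come from in a scheme like this.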

Conclusions: Matataki can overcome current limitations in reanalyzing RNA-Seq data, improving the potential for discovering genes and pathways associated with disease at reduced computational cost. Thus, the main bottleneck of RNA-Seq analyses has shifted to the decompression of sequenced data. The implementation of Matataki is available at https://github.com/informationsea/Matataki.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6048772
DOI: http://dx.doi.org/10.1186/s12859-018-2279-y
