Next-generation sequencing technologies are poised to revolutionize the field of biomedical research. The increased resolution of these data promises to provide a greater understanding of the molecular processes that control the morphology and behavior of a cell. However, the increased amounts of data require innovative statistical procedures that are powerful while still being computationally feasible. In this article, we present a method for identifying miRNAs, small RNA molecules that regulate genes by targeting their mRNAs for degradation or translational repression. In the first step of our modeling procedure, we apply an innovative dynamic linear model that identifies candidate miRNA genes in high-throughput sequencing data. The model is flexible and can accurately identify interesting biological features while accounting for read count, read spacing, and sequencing depth. Additionally, miRNA candidates are processed using a modified Smith-Waterman sequence alignment that scores the regions for potential RNA hairpins, one of the defining features of miRNAs. We illustrate our method on simulated datasets as well as on a small RNA Caenorhabditis elegans dataset from the Illumina sequencing platform. These examples show that our method is highly sensitive for identifying known and novel miRNA genes.
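The hairpin-scoring idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' modifications to Smith-Waterman, so the code below is not their method: it uses the standard Smith-Waterman local alignment of a candidate region against its own reverse complement, on the assumption that a high-scoring local alignment suggests two complementary arms that could form a stem. The function names and the scoring parameters (`match`, `mismatch`, `gap`) are hypothetical choices made for illustration, and G-U wobble pairs are ignored.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (standard Smith-Waterman,
    linear gap penalty). Scoring values are illustrative assumptions."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def hairpin_score(seq: str, **scoring) -> int:
    """Crude hairpin potential: align the region to its reverse complement.
    This is an assumed proxy, not the modified alignment from the article."""
    rna = seq.upper().replace("T", "U")  # accept DNA-style reads as well
    return smith_waterman_score(rna, reverse_complement(rna), **scoring)


if __name__ == "__main__":
    # A toy stem-loop (GGGG stem, AAAA loop, CCCC stem) versus a homopolymer.
    print(hairpin_score("GGGGAAAACCCC"))
    print(hairpin_score("AAAAAAAAAAAA"))
```

In this sketch a region with self-complementary arms scores higher than one without, which is the only property it is meant to demonstrate. A realistic scorer would also need to enforce a minimum loop length and allow wobble pairing.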

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3116054
DOI: http://dx.doi.org/10.1111/j.1541-0420.2010.01570.x
