A computational system for identifying operons based on RNA-seq data.

Methods

Department of Computer Science, Wellesley College, Wellesley, MA 02481, USA.

Published: April 2020

An operon is a set of neighboring genes in a genome that is transcribed as a single polycistronic message. Genes that are part of the same operon often have related functional roles or participate in the same metabolic pathways. The majority of all bacterial genes are co-transcribed with one or more other genes as part of a multi-gene operon. Thus, accurate identification of operons is important in understanding co-regulation of genes and their functional relationships. Here, we present a computational system that uses RNA-seq data to determine operons throughout a genome. The system takes the name of a genome and one or more files of RNA-seq data as input. Our method combines primary genomic sequence information with expression data from the RNA-seq files in a unified probabilistic model in order to identify operons. We assess our method's ability to accurately identify operons in a range of species through comparison to external databases of operons, both experimentally confirmed and computationally predicted, and through focused experiments that confirm new operons identified by our method. Our system is freely available at https://cs.wellesley.edu/~btjaden/Rockhopper/.
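The idea of combining sequence-derived and expression-derived evidence in one probabilistic model can be illustrated with a minimal sketch. The abstract does not specify the model's form, so everything below is an assumption made for illustration: a naive Bayes score over adjacent same-strand gene pairs, using intergenic distance as the genomic feature and Pearson correlation of expression across RNA-seq experiments as the expression feature. The `Gene` class, the Gaussian parameters, the prior, and the 0.5 threshold are all hypothetical and are not taken from Rockhopper.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    name: str
    start: int               # 1-based start coordinate
    end: int                 # 1-based end coordinate (inclusive)
    strand: str              # "+" or "-"
    expression: List[float]  # one normalized value per RNA-seq experiment


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def gaussian_pdf(x: float, mean: float, sd: float) -> float:
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


# Hypothetical parameters (mean, sd) per class; a real system would estimate
# these from known operons rather than hard-code them.
PRIOR_OPERON = 0.5
DISTANCE_PARAMS = {True: (20.0, 60.0), False: (200.0, 150.0)}
CORRELATION_PARAMS = {True: (0.8, 0.2), False: (0.1, 0.4)}


def posterior_same_operon(g1: Gene, g2: Gene) -> float:
    """Naive Bayes posterior that adjacent genes g1, g2 are co-transcribed."""
    if g1.strand != g2.strand:
        return 0.0  # genes on opposite strands cannot share a transcript
    distance = g2.start - g1.end - 1
    corr = pearson(g1.expression, g2.expression)
    joint_op = (PRIOR_OPERON
                * gaussian_pdf(distance, *DISTANCE_PARAMS[True])
                * gaussian_pdf(corr, *CORRELATION_PARAMS[True]))
    joint_not = ((1 - PRIOR_OPERON)
                 * gaussian_pdf(distance, *DISTANCE_PARAMS[False])
                 * gaussian_pdf(corr, *CORRELATION_PARAMS[False]))
    return joint_op / (joint_op + joint_not)


def predict_operons(genes: List[Gene], threshold: float = 0.5) -> List[List[str]]:
    """Chain adjacent genes into operons when the posterior meets the threshold."""
    ordered = sorted(genes, key=lambda g: g.start)
    if not ordered:
        return []
    operons = [[ordered[0].name]]
    for prev, cur in zip(ordered, ordered[1:]):
        if posterior_same_operon(prev, cur) >= threshold:
            operons[-1].append(cur.name)
        else:
            operons.append([cur.name])
    return operons


# Toy data, invented for this example only.
example_genes = [
    Gene("A", 1, 1000, "+", [10.0, 50.0, 20.0]),
    Gene("B", 1010, 2000, "+", [12.0, 55.0, 22.0]),
    Gene("C", 2500, 3500, "+", [40.0, 5.0, 30.0]),
    Gene("D", 3600, 4500, "-", [40.0, 5.0, 30.0]),
]
print(predict_operons(example_genes))  # [['A', 'B'], ['C'], ['D']]
```

In this toy input, genes A and B are close together and their expression rises and falls together, so they are grouped; C is distant from B and anti-correlated with it, and D lies on the opposite strand, so each forms a single-gene unit.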


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6776731
DOI: http://dx.doi.org/10.1016/j.ymeth.2019.03.026

