An operon is a set of neighboring genes in a genome that is transcribed as a single polycistronic message. Genes in the same operon often have related functional roles or participate in the same metabolic pathways. The majority of bacterial genes are co-transcribed with one or more other genes as part of a multi-gene operon. Accurate identification of operons is therefore important for understanding the co-regulation of genes and their functional relationships. Here, we present a computational system that uses RNA-seq data to determine operons throughout a genome. The system takes as input the name of a genome and one or more files of RNA-seq data. Our method combines primary genomic sequence information with expression data from the RNA-seq files in a unified probabilistic model to identify operons. We assess our method's ability to accurately identify operons in a range of species through comparison to external databases of operons, both experimentally confirmed and computationally predicted, and through focused experiments that confirm new operons identified by our method. Our system is freely available at https://cs.wellesley.edu/~btjaden/Rockhopper/.
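To make the idea of combining sequence information with expression data more concrete, the following is a minimal sketch, not the system's actual model. It assumes a naive Bayes style combination of two features for each pair of adjacent same-strand genes: intergenic distance and the correlation of expression across RNA-seq samples. The `Gene` class, the function names, and every numeric parameter (Gaussian means, standard deviations, prior, threshold) are hypothetical values chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    name: str
    start: int               # 1-based start coordinate
    end: int                 # 1-based end coordinate (inclusive)
    strand: str              # "+" or "-"
    expression: List[float]  # one normalized value per RNA-seq sample


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def gaussian_pdf(x: float, mean: float, sd: float) -> float:
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def operon_pair_probability(g1: Gene, g2: Gene, prior: float = 0.5) -> float:
    """Posterior probability that adjacent genes g1, g2 are co-transcribed.

    All distribution parameters below are illustrative assumptions.
    """
    if g1.strand != g2.strand:
        return 0.0  # genes on opposite strands cannot share a transcript
    distance = g2.start - g1.end - 1
    corr = pearson(g1.expression, g2.expression)
    # Assumed class-conditional densities: operon pairs are close together
    # and strongly co-expressed; non-operon pairs are farther apart and
    # uncorrelated on average.
    like_operon = gaussian_pdf(distance, 20, 50) * gaussian_pdf(corr, 0.8, 0.3)
    like_other = gaussian_pdf(distance, 200, 150) * gaussian_pdf(corr, 0.0, 0.5)
    numerator = prior * like_operon
    return numerator / (numerator + (1 - prior) * like_other)


def predict_operons(genes: List[Gene], threshold: float = 0.5) -> List[List[str]]:
    """Chain adjacent gene pairs whose posterior meets the threshold."""
    if not genes:
        return []
    ordered = sorted(genes, key=lambda g: g.start)
    operons = []
    current = [ordered[0].name]
    for prev, nxt in zip(ordered, ordered[1:]):
        if operon_pair_probability(prev, nxt) >= threshold:
            current.append(nxt.name)
        else:
            operons.append(current)
            current = [nxt.name]
    operons.append(current)
    return operons


# Toy data: A and B are close, same strand, and co-expressed; C is not.
genes = [
    Gene("A", 1, 1000, "+", [10.0, 20.0, 30.0, 40.0]),
    Gene("B", 1010, 2000, "+", [11.0, 19.0, 33.0, 41.0]),
    Gene("C", 2500, 3500, "-", [40.0, 10.0, 30.0, 5.0]),
]
print(predict_operons(genes))  # [['A', 'B'], ['C']]
```

In this sketch the two features are treated as conditionally independent given the class, which keeps the posterior a simple product of likelihoods. A real system would estimate the distributions from the genome and the RNA-seq data rather than fixing them by hand.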
Full text | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6776731 | PMC |
http://dx.doi.org/10.1016/j.ymeth.2019.03.026 | DOI |