RNA plays important roles in almost every aspect of biology, and every aspect of RNA biology is influenced by its folding. This is a particularly important consideration in the era of high-throughput sequencing, when the discovery of novel transcripts far outpaces our knowledge of their functions. Gaining a comprehensive picture of biology requires a structural framework for making functional inferences about RNA. To this end, we have developed the RNA Structurome Database (https://structurome.bb.iastate.edu), a comprehensive repository of RNA secondary structural information that spans the entire human genome. Here, we compile folding information for every base pair of the genome that may be transcribed: coding, noncoding, and intergenic regions, as well as repetitive elements, telomeres, etc. This was done by fragmenting the GRCh38 reference genome into 154,414,320 overlapping sequence fragments and, for each fragment, calculating a set of metrics based on the sequence's folding properties. These data will facilitate a wide array of investigations, for example the discovery of structured regulatory elements in differential gene expression data or the discovery of noncoding RNAs, and will allow genome-scale analyses of RNA folding.
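The fragmentation step described above can be illustrated with a minimal sliding-window sketch. It is not the authors' pipeline: the function names, the toy sequence, and the `window` and `step` values are hypothetical, and the GC fraction is only a stand-in for the folding-based metrics, which would require an RNA folding program.

```python
from typing import Iterator, Tuple


def overlapping_fragments(sequence: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) pairs for every full-length window.

    Fragments overlap whenever step < window. Both parameters are
    illustrative assumptions, not values taken from the abstract.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, len(sequence) - window + 1, step):
        yield start, sequence[start:start + window]


def gc_fraction(fragment: str) -> float:
    """Placeholder per-fragment metric (a sequence property, not a folding metric)."""
    if not fragment:
        return 0.0
    gc = sum(1 for base in fragment.upper() if base in "GC")
    return gc / len(fragment)


# Toy sequence standing in for a stretch of the genome (hypothetical).
toy = "GGGAAACCCUUUGGGAAACCC"
fragments = list(overlapping_fragments(toy, window=9, step=3))
metrics = [(start, frag, gc_fraction(frag)) for start, frag in fragments]

for start, frag, gc in metrics:
    print(f"{start:>3}  {frag}  GC={gc:.2f}")
```

In a genome-scale setting, `gc_fraction` would be replaced by calls to a folding tool, and the per-fragment results would be stored by genomic coordinate so they can be queried by region.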
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5722888 | PMC |
| http://dx.doi.org/10.1038/s41598-017-17510-y | DOI Listing |