Motivation: Many long non-coding RNAs (lncRNAs) have been identified by deep sequencing, but molecular and cellular functions are known for only a small fraction of them. Current lncRNA databases serve mostly as catalogs and do not provide the in-depth information required to infer function. A comprehensive resource on lncRNA function is urgently needed.
Results: We present a database for functional investigation of lncRNAs that encompasses annotation, sequence analysis, gene expression, protein binding and phylogenetic conservation. We compiled lncRNAs for six species (human, mouse, zebrafish, fruit fly, worm and yeast) from ENSEMBL, HGNC, MGI and lncRNAdb. Each lncRNA was analyzed for coding potential and for phylogenetic conservation in different lineages. Gene expression data from 208 RNA-Seq studies (4995 samples), collected from the GEO, ENCODE, modENCODE and TCGA databases, were used to provide expression profiles across tissues, diseases and developmental stages. Importantly, we analyzed the RNA-Seq data to identify mRNAs coexpressed with each lncRNA, which can give insight into lncRNA function. The resulting gene list can be subjected to enrichment analysis, for example against Gene Ontology terms or KEGG pathways. Furthermore, we compiled protein-lncRNA interactions by collecting and analyzing publicly available CLIP-seq and PAR-CLIP sequencing data. Finally, we explored evolutionarily conserved lncRNAs with correlated expression between human and the five other organisms to identify functional lncRNAs. All content is accessible through a user-friendly web interface.
Availability And Implementation: lncRNAtor is available at http://lncrnator.ewha.ac.kr/.
Supplementary Information: Supplementary data are available at Bioinformatics online.
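The coexpression step summarized under Results can be illustrated with a minimal sketch. It assumes Pearson correlation over per-sample expression values and an absolute-correlation cutoff; the abstract does not state which correlation measure or threshold lncRNAtor uses, so the function names, the cutoff of 0.9 and the gene identifiers below are hypothetical and for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sum(a * b for a, b in zip(dx, dy)) / denom


def coexpressed_mrnas(lncrna_expr, mrna_exprs, threshold=0.9):
    """Return (gene, r) pairs with |r| >= threshold, strongest first.

    The threshold is an assumed, illustrative value, not one taken
    from the lncRNAtor paper.
    """
    hits = []
    for gene, expr in mrna_exprs.items():
        r = pearson(lncrna_expr, expr)
        if abs(r) >= threshold:
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: -abs(hit[1]))


# Toy expression profiles across four samples (placeholder gene names).
lncrna = [1.0, 2.0, 3.0, 4.0]
mrnas = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],  # rises with the lncRNA
    "GENE_B": [4.0, 3.0, 2.0, 1.0],  # falls as the lncRNA rises
    "GENE_C": [1.0, 3.0, 1.0, 3.0],  # only weakly related
}
hits = coexpressed_mrnas(lncrna, mrnas)
for gene, r in hits:
    print(f"{gene}\t{r:+.3f}")
```

The gene names returned by such a filter are what would then be passed to a Gene Ontology or KEGG enrichment tool.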
DOI: http://dx.doi.org/10.1093/bioinformatics/btu325