Motivation: Long noncoding RNA (lncRNA) has been verified to interact with other biomolecules especially protein-coding genes (PCGs), thus playing essential regulatory roles in life activities and disease development. However, the inner mechanisms of most lncRNA-PCG relationships are still unclear. Our study investigated the characteristics of true lncRNA-PCG relationships and constructed a novel predictor with machine learning algorithms.
Results: We obtained the 307 true lncRNA-PCG pairs from database and found that there are significant differences in multiple characteristics between true and random lncRNA-PCG sets. Besides, 3-fold cross-validation and prediction results on independent test sets show the great AUC values of LR, SVM and RF, among which RF has the best performance with average AUC 0.818 for cross-validation, 0.823 and 0.853 for two independent test sets, respectively. In case study, some candidate lncRNA-PCG relationships in colorectal cancer were found and HOTAIR-COMP interaction was specially exemplified. The proportion of the reported pairs in the predicted positive results was significantly higher than that in negative results (P < 0.05).
Supplementary Information: Supplementary data are available at Bioinformatics online.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1093/bioinformatics/btz886 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!