Long non-coding RNA transcriptome of uncharacterized samples can be accurately imputed using protein-coding genes.

Brief Bioinform

Department of Experimental and Clinical Pharmacology, University of Minnesota, Minneapolis, MN, USA.

Published: March 2020

Long non-coding RNAs (lncRNAs) play an important role in gene regulation and are increasingly being recognized as crucial mediators of disease pathogenesis. However, the vast majority of published transcriptome datasets lack high-quality lncRNA profiles compared to protein-coding genes (PCGs). Here we propose a framework that harnesses the correlative expression patterns between lncRNAs and PCGs to impute unknown lncRNA profiles. The lncRNA expression imputation (LEXI) framework enables characterization of the lncRNA transcriptome of samples lacking any lncRNA data using only their PCG profiles. We compare various machine learning and missing value imputation algorithms to implement LEXI and demonstrate the feasibility of this approach for imputing the lncRNA transcriptome of normal and cancer tissues. Additionally, we determine the factors that influence imputation accuracy and provide guidelines for implementing this approach.
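
The core idea, predicting a sample's lncRNA profile from its PCG profile by borrowing from reference samples that have both, can be illustrated with a minimal sketch. The abstract does not say which algorithms were compared, so the k-nearest-neighbour averaging used here is only a stand-in chosen for brevity, and the use of Pearson correlation as the similarity measure is likewise an assumption. The function names (`pearson`, `impute_lncrna`) and the toy expression values are hypothetical and are not taken from LEXI.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sx * sy)


def impute_lncrna(query_pcg, ref_pcg, ref_lncrna, k=2):
    """Impute a lncRNA profile for one sample from its PCG profile.

    ref_pcg[i] and ref_lncrna[i] are the PCG and lncRNA expression
    vectors of reference sample i (hypothetical toy layout).
    """
    # Rank reference samples by how closely their PCG profile tracks the query.
    ranked = sorted(
        range(len(ref_pcg)),
        key=lambda i: pearson(query_pcg, ref_pcg[i]),
        reverse=True,
    )
    nearest = ranked[:k]
    n_lnc = len(ref_lncrna[0])
    # Average the lncRNA expression of the most similar reference samples.
    return [
        sum(ref_lncrna[i][j] for i in nearest) / len(nearest)
        for j in range(n_lnc)
    ]


# Toy reference panel: three samples, four PCGs, two lncRNAs (made-up values).
ref_pcg = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
]
ref_lncrna = [
    [10.0, 0.0],
    [12.0, 2.0],
    [0.0, 9.0],
]

# A sample with PCG data only; its lncRNA values are what we want to impute.
query_pcg = [1.5, 3.0, 4.5, 6.0]
imputed = impute_lncrna(query_pcg, ref_pcg, ref_lncrna, k=2)
print(imputed)  # [11.0, 1.0]
```

In this toy panel the query's PCG profile tracks the first two reference samples and is anti-correlated with the third, so the imputed lncRNA values are the average of the first two reference lncRNA profiles. A regression model trained per lncRNA on the reference PCG matrix would be another way to realize the same idea.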

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7299297
DOI: http://dx.doi.org/10.1093/bib/bby129
