The precise spatiotemporal expression of long noncoding RNAs (lncRNAs) plays a pivotal role in biological regulation, and aberrant expression of lncRNAs in different subcellular localizations has been intricately linked to the onset and progression of a variety of cancers. Computational methods provide effective means for predicting lncRNA subcellular localization, but current studies either ignore cell line and tissue specificity or the correlation and shared information among cell lines. In this study, we propose a novel approach, BiGM-lncLoc, treating the prediction of lncRNA subcellular localization across cell lines as a multi-graph meta-learning task. Our investigation involves two categories of data: the localization data of nucleotide sequences in different cell lines and cell line expression data. BiGM-lncLoc comprises a cell line-specific optimization network learning specific knowledge from cell line expression data and a graph neural network optimized across cell lines. Subsequently, the specific and shared knowledge acquired through bi-level optimization is applied to a new cell-line prediction task without the need for re-training or fine-tuning. Additionally, through key feature analysis of the impact of different nucleotide combinations on the model, we confirm the necessity of cell line-specific studies based on correlation analysis. Finally, experiments conducted on various cell lines with different data sizes indicate that BiGM-lncLoc outperforms other methods in terms of prediction accuracy, with an average accuracy of 97.7%. After removing overlapping samples to ensure data independence for each cell line, the accuracy ranged from 82.4% to 94.7%, still surpassing existing models. Our code can be found at https://github.com/BioCL1/BiGM-lncLoc .
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1007/s12539-024-00679-y | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!