Biomolecules such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs) play critical roles in diverse fundamental biological processes. Because their dysregulation can cause complex human diseases, they can serve as disease biomarkers. Identifying these biomarkers helps with the diagnosis, treatment, prognosis, and prevention of diseases. In this study, we propose DFMbpe, a factorization machine-based deep neural network with binary pairwise encoding, to identify disease-related biomarkers. First, to comprehensively consider the interdependence of features, a binary pairwise encoding method is designed to obtain the raw feature representation of each biomarker-disease pair. Second, the raw features are mapped into their corresponding embedding vectors. Then, a factorization machine is applied to capture the wide, low-order feature interdependence, while a deep neural network is applied to capture the deep, high-order feature interdependence. Finally, the two kinds of features are combined to produce the final prediction. Unlike other biomarker identification models, the binary pairwise encoding considers the interdependence of features even when they never appear in the same sample, and the DFMbpe architecture emphasizes low-order and high-order feature interactions simultaneously. The experimental results show that DFMbpe greatly outperforms state-of-the-art identification models in both cross-validation and independent dataset evaluation. In addition, three types of case studies further demonstrate the effectiveness of the model.
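To make the combination of the wide and deep components concrete, the following is a minimal NumPy sketch of a generic DeepFM-style forward pass, not the authors' implementation. It assumes that each biomarker-disease pair has already been encoded as a list of active feature indices; the binary pairwise encoding itself is not reproduced here because the abstract does not specify it. All names (`fm_second_order`, `mlp`, `deepfm_score`), the layer sizes, and the random parameters are hypothetical.

```python
import numpy as np


def fm_second_order(emb):
    """Sum of pairwise dot products between the rows of emb (num_fields, k).

    Uses the standard factorization machine identity
    sum_{i<j} <v_i, v_j> = 0.5 * (||sum_i v_i||^2 - sum_i ||v_i||^2).
    """
    square_of_sum = np.sum(emb, axis=0) ** 2
    sum_of_square = np.sum(emb ** 2, axis=0)
    return 0.5 * float(np.sum(square_of_sum - sum_of_square))


def mlp(x, weights, biases):
    """Small feed-forward network: ReLU hidden layers, one linear output unit."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)
    return float((h @ weights[-1] + biases[-1])[0])


def deepfm_score(field_ids, first_order_w, emb_table, weights, biases, bias=0.0):
    """Association probability for one pair (assumed input: active feature ids)."""
    emb = emb_table[field_ids]                      # shared embeddings, (num_fields, k)
    wide = bias + float(first_order_w[field_ids].sum()) + fm_second_order(emb)
    deep = mlp(emb.reshape(-1), weights, biases)    # high-order interactions
    return 1.0 / (1.0 + np.exp(-(wide + deep)))     # sigmoid over the combined logit


# Hypothetical sizes: 50 possible features, 4 active per pair, embedding size 3.
rng = np.random.default_rng(0)
emb_table = 0.1 * rng.normal(size=(50, 3))
first_order_w = 0.1 * rng.normal(size=50)
weights = [0.1 * rng.normal(size=(12, 8)), 0.1 * rng.normal(size=(8, 1))]
biases = [np.zeros(8), np.zeros(1)]
score = deepfm_score(np.array([3, 17, 29, 41]), first_order_w, emb_table, weights, biases)
```

In this sketch the factorization machine and the neural network read the same embedding vectors, so the low-order and high-order parts are computed from one shared representation and summed before the sigmoid. Whether DFMbpe combines them in exactly this way is an assumption.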
DOI: http://dx.doi.org/10.1109/TCBB.2023.3235299