RNA-protein interaction (RPI) is crucial to the life processes of diverse organisms. RPIs have traditionally been identified through time-consuming and costly biological experiments. Although numerous machine learning and deep learning-based methods for predicting RPI now exist, their robustness and generalizability leave significant room for improvement. To address these issues, this study proposes LPI-MFF, an RPI prediction model based on multi-source information fusion. LPI-MFF used protein-protein interaction features, sequence features, secondary structure features, and physicochemical properties as information sources, each with a corresponding coding scheme, and then applied the random forest algorithm for feature screening. Finally, all information was combined and classified with a method based on convolutional neural networks. In fivefold cross-validation, LPI-MFF achieved accuracies of 97.60% on RPI1807 and 97.67% on NPInter. In addition, its accuracy was 84.9% on the independent test set RPI1168 and 90.91% on the Mus musculus dataset. Accordingly, LPI-MFF demonstrated greater robustness and generalizability than other prevalent RPI prediction methods.
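The fivefold cross-validation accuracy reported above can be illustrated with a minimal, standard-library sketch. It is not the authors' code: the function names, the shuffling seed, and the `majority_class_baseline` stand-in (used here in place of the paper's convolutional neural network) are all assumptions made for illustration, and the abstract does not say whether the folds were stratified.

```python
import random
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs covering range(n_samples) once each."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def cross_validated_accuracy(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    fit_predict: Callable[[list, list, list], List[int]],
    k: int = 5,
    seed: int = 0,
) -> float:
    """Average test accuracy over k folds.

    fit_predict(train_x, train_y, test_x) is a hypothetical hook: it trains any
    classifier on the training fold and returns predictions for the test fold.
    """
    scores = []
    for train, test in k_fold_indices(len(labels), k, seed):
        train_x = [features[i] for i in train]
        train_y = [labels[i] for i in train]
        test_x = [features[i] for i in test]
        test_y = [labels[i] for i in test]
        scores.append(accuracy(test_y, fit_predict(train_x, train_y, test_x)))
    return sum(scores) / len(scores)


def majority_class_baseline(train_x: list, train_y: list, test_x: list) -> List[int]:
    """Placeholder classifier (assumption): always predicts the most common training label."""
    most_common = max(set(train_y), key=train_y.count)
    return [most_common] * len(test_x)
```

Calling `cross_validated_accuracy(features, labels, majority_class_baseline)` gives a trivial baseline score; a real model would be plugged in through the same `fit_predict` hook, with any feature screening fitted inside each training fold so that the test fold does not influence which features are kept.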

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10929084
DOI: http://dx.doi.org/10.1186/s12859-024-05727-4

Publication Analysis

Top Keywords

random forest: 8
forest algorithm: 8
rpi prediction: 8
accuracy rate: 8
rpi: 5
predicting lncrna-protein: 4
lncrna-protein interactions: 4
interactions deep: 4
deep learning: 4
learning framework: 4

Similar Publications

Background: Despite the adverse health outcomes associated with longer duration diarrhea (LDD), there are currently no clinical decision tools for timely identification and better management of children with increased risk. This study utilizes machine learning (ML) to derive and validate a predictive model for LDD among children presenting with diarrhea to health facilities.

Methods: LDD was defined as a diarrhea episode lasting ≥ 7 days.

Development of a disease diagnostic model to predict the occurrence of central precocious puberty of female.

J Pediatr Endocrinol Metab

January 2025

Department of Endocrinology, Genetics and Metabolism, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China.

Objectives: To develop a clinical model for predicting the occurrence of central precocious puberty based on breast development outcomes in Chinese girls.

Methods: This retrospective study included a total of 1,001 girls aged 6-9 years who visited the outpatient clinic of Beijing Children's Hospital from January 2017 to October 2022 for "breast development". Participants were categorized into a pubertal development (PD) cohort and a simple premature breast development (PT) cohort according to the criteria, and information was collected and tested for relevant indicators.

Preeclampsia (PE) is a major pregnancy-specific cardiovascular complication posing latent life-threatening risks to mothers and neonates. The contribution of immune dysregulation to PE is not fully understood, highlighting the need to explore molecular markers and their relationship with immune infiltration to potentially inform therapeutic strategies. We used bioinformatics tools to analyze gene expression data from the Gene Expression Omnibus (GEO) database using the GEOquery package in R.

Genomic prediction applies to any agro- or ecologically relevant traits, with distinct ontologies and genetic architectures. Selecting the most appropriate model for the distribution of genetic effects and their associated allele frequencies in the training population is crucial. Linear regression models are often preferred for genomic prediction.

Background: Primary testicular lymphoma (PTL) is a rare malignancy whose epidemiology and prognosis have not been studied.

Materials and Methods: PTL patient data were collected from the SEER online database and randomly divided into a training cohort and a validation cohort. The training cohort was subjected to univariate Cox regression analysis; variables with statistically significant differences were then included in a multivariate Cox regression analysis, which was used to construct nomograms.
