RNA methylation, particularly m6A modification, is a key epigenetic mechanism that regulates gene expression and influences a range of biological functions. Accurate identification of methylation sites is essential for understanding those functions. Traditional experimental methods, however, are often costly and sensitive to experimental conditions, which makes machine learning, especially deep learning, a vital tool for m6A site identification. Despite their utility, current machine learning models struggle with imbalanced datasets, a common issue in bioinformatics. This study addresses the data imbalance problem for RNA methylation sites from three perspectives: feature encoding, deep learning models, and data resampling strategies. Using a K-mer one-hot encoding strategy, we extracted RNA sequence features and built classification models based on long short-term memory (LSTM) networks and a variant, multiplicative LSTM (mLSTM). We further improved model performance with ensemble and weighted-strategy models. Additionally, we used the sequence generative adversarial network (SeqGAN) and the synthetic minority oversampling technique (SMOTE) to construct balanced datasets of RNA methylation sites. The prediction results were analyzed using the Wilcoxon test and multivariate linear regression to assess the effects of different K-mer values, model architectures, and sampling methods on classification outcomes. The analysis showed that feature selection, model architecture, and sampling technique all have a significant impact on classification under data imbalance. Notably, the best prediction performance was achieved with a K value of 5 using the mLSTM-ensemble model. These findings offer new insights and methods for RNA methylation site identification and provide guidance for addressing similar challenges in bioinformatics.
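
To make the K-mer one-hot encoding mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the window stride, the alphabet, or how ambiguous bases are handled, so the choices here are assumptions: overlapping windows with stride 1, the alphabet A/C/G/U (with T mapped to U), and an all-zero row for any K-mer containing an unknown base. The function names `kmer_vocabulary` and `kmer_one_hot` are hypothetical.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; T is mapped to U below


def kmer_vocabulary(k):
    """Map every possible K-mer over ALPHABET to a column index."""
    return {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}


def kmer_one_hot(seq, k):
    """Encode a sequence as one one-hot row per overlapping K-mer.

    Returns a list of len(seq) - k + 1 rows, each of length 4**k.
    K-mers containing a base outside ALPHABET (e.g. N) become
    all-zero rows; this is an assumption of this sketch.
    """
    vocab = kmer_vocabulary(k)
    seq = seq.upper().replace("T", "U")
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than k")
    rows = []
    for pos in range(n_windows):
        row = [0] * len(vocab)
        idx = vocab.get(seq[pos:pos + k])
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    return rows


# Usage: a 5-base sequence with k=2 gives 4 windows of width 4**2 = 16.
encoded = kmer_one_hot("GGACU", 2)
print(len(encoded), len(encoded[0]))
```

Each row can then be fed to a recurrent model as one time step. The feature width grows as 4^K, so K = 5 already yields 1024 columns per step, which is one reason the choice of K affects both model size and accuracy.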

Source
http://dx.doi.org/10.1016/j.compbiolchem.2025.108351
