iDNA6mA-Rice: A Computational Tool for Detecting N6-Methyladenine Sites in Rice.

Hao Lv, Fu-Ying Dao, Zheng-Xing Guan, Dan Zhang, Jiu-Xin Tan, Yong Zhang, Wei Chen, Hao Lin

Front Genet

Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.

Published: September 2019

DNA N6-methyladenine (6mA) is a dominant form of DNA modification and is involved in many biological functions. Accurate genome-wide identification of 6mA sites may improve understanding of these functions. Experimental methods for detecting 6mA in eukaryotic genomes are laborious and expensive. It is therefore necessary to develop computational methods that identify 6mA sites on a genomic scale, especially for plant genomes. With this in mind, this study aims to develop a machine learning-based method for predicting 6mA sites in the rice genome. We first used mono-nucleotide binary encoding to represent positive and negative samples. The Random Forest algorithm was then used to classify the samples and identify 6mA sites. In fivefold cross-validation, the proposed method achieved an area under the receiver operating characteristic curve of 0.964 and an overall accuracy of 0.917. An independent dataset was also established to assess the generalization ability of the method; on it, an area under the receiver operating characteristic curve of 0.981 was obtained, suggesting that the proposed method performs well at predicting 6mA sites in the rice genome. For convenient retrieval of 6mA sites, we built a freely accessible web server based on this method, named iDNA6mA-Rice, at http://lin-group.cn/server/iDNA6mA-Rice.
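
The two quantitative ideas in the abstract, mono-nucleotide binary encoding and the area under the receiver operating characteristic curve, can be illustrated with a minimal sketch. This is not the authors' implementation: the A/C/G/T bit ordering, the all-zero handling of unknown bases, and the function names `binary_encode` and `roc_auc` are assumptions made for illustration, and the sequences and scores are toy values. The resulting feature vectors could then be passed to any Random Forest implementation.

```python
from typing import List, Sequence

# Assumed mapping: each nucleotide becomes a 4-bit indicator vector.
# The A, C, G, T ordering is an illustrative choice, not taken from the paper.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def binary_encode(seq: str) -> List[int]:
    """Concatenate per-nucleotide indicator vectors (4 features per base).

    Bases outside A/C/G/T are encoded as four zeros (an assumption).
    """
    features: List[int] = []
    for base in seq.upper():
        features.extend(ONE_HOT.get(base, (0, 0, 0, 0)))
    return features


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive sample receives
    the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy sequence: 4 bases become 16 binary features.
    print(binary_encode("ACGT"))
    # Toy classifier scores for two positive and two negative samples.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

The pairwise AUC above is quadratic in the number of samples, which is fine for illustration; a rank-based computation would be preferable on genome-scale datasets.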

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6746913
DOI: http://dx.doi.org/10.3389/fgene.2019.00793

Similar Publications

Comprehensive genome annotation of the model ciliate Tetrahymena thermophila by in-depth epigenetic and transcriptomic profiling.

Nucleic Acids Res

January 2025

MOE Key Laboratory of Evolution & Marine Biodiversity and Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao 266003, China.

Fei Ye, Xiao Chen, Yuan Li, Aili Ju, Yalan Sheng

The ciliate Tetrahymena thermophila is a well-established unicellular model eukaryote, contributing significantly to foundational biological discoveries. Despite its acknowledged importance, current studies on Tetrahymena biology face challenges due to gene annotation inaccuracy, particularly the notable absence of untranslated regions (UTRs). To comprehensively annotate the Tetrahymena macronuclear genome, we collected extensive transcriptomic data spanning various cell stages.

Critical assessment of nanopore sequencing for the detection of multiple forms of DNA modifications.

bioRxiv

November 2024

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Yimeng Kong, Yanchun Zhang, Edward A Mead, Hao Chen, Christian E Loo

While nanopore sequencing is increasingly used for mapping DNA modifications, it is important to recognize false positive calls as they can mislead biological interpretations. To assist biologists and methods developers, we describe a framework for rigorous evaluation that highlights the use of false discovery rate with rationally designed negative controls capturing both general background and confounding modifications. Our critical assessment across multiple forms of DNA modifications highlights that while nanopore sequencing performs reliably for high-abundance modifications, including 5-methylcytosine (5mC) at CpG sites in mammalian cells and 5-hydroxymethylcytosine (5hmC) in mammalian brain cells, it makes a significant proportion of false positive detections for low-abundance modifications, such as 5mC at CpH sites, 5hmC and N6-methyldeoxyadenine (6mA) in most mammal cell types.

PSATF-6mA: an integrated learning fusion feature-encoded DNA-6 mA methylcytosine modification site recognition model based on attentional mechanisms.

Front Genet

November 2024

School of Cyber Science and Engineering, University of International Relations, Beijing, China.

Yanmei Kang, Hongyuan Wang, Yubo Qin, Guanlin Liu, Yi Yu

DNA methylation is of crucial importance for biological genetic expression, such as biological cell differentiation and cellular tumours. The identification of DNA-6mA sites using traditional biological experimental methods requires more cumbersome steps and a large amount of time. The advent of neural network technology has facilitated the identification of 6 mA sites on cross-species DNA with enhanced efficacy.

iDNA-ITLM: An interpretable and transferable learning model for identifying DNA methylation.

PLoS One

October 2024

School of Information and Communication Engineering, Hainan University, Haikou, Hainan, China.

Xia Yu, Cui Yani, Zhichao Wang, Haixia Long, Rao Zeng

In this study, from the perspective of image processing, we propose the iDNA-ITLM model, using a novel data enhance strategy by continuously self-replicating a short DNA sequence into a longer DNA sequence and then embedding it into a high-dimensional matrix to enlarge the receptive field, for identifying DNA methylation sites. Our model consistently outperforms the current state-of-the-art sequence-based DNA methylation site recognition methods when evaluated on 17 benchmark datasets that cover multiple species and include three DNA methylation modifications (4mC, 5hmC, and 6mA). The experimental results demonstrate the robustness and superior performance of our model across these datasets.

Deep learning based method for predicting DNA N6-methyladenosine sites.

Methods

October 2024

Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China.

Ke Han, Jianchun Wang, Ying Chu, Qian Liao, Yijie Ding

DNA N6 methyladenine (6mA) plays an important role in many biological processes, and accurately identifying its sites helps one to understand its biological effects more comprehensively. Previous traditional experimental methods are very labor-intensive and traditional machine learning methods also seem to be somewhat insufficient as the database of 6mA methylation groups becomes progressively larger, so we propose a deep learning-based method called multi-scale convolutional model based on global response normalization (CG6mA) to solve the prediction problem of 6mA site. This method is tested with other methods on three different kinds of benchmark datasets, and the results show that our model can get more excellent prediction results.
