6mA-StackingCV: an improved stacking ensemble model for predicting DNA N6-methyladenine site.

BioData Min

College of Information Science and Engineering, Shaoyang University, Shaoyang, Hunan, 422000, China.

Published: November 2023

DNA N6-adenine methylation (N6-methyladenine, 6mA) plays a key regulatory role in cellular processes. Precisely recognizing 6mA sites is important for further exploring its biological functions. Although many computational methods for 6mA site prediction have been developed over the past decades, there is still considerable room for improvement. We present a cross-validation-based stacking ensemble model for 6mA site prediction, called 6mA-StackingCV. 6mA-StackingCV is a meta-learning algorithm that uses the output of cross-validation as input to the final classifier. 6mA-StackingCV achieved state-of-the-art performance in the Rosaceae independent test. Extensive tests demonstrated the stability and flexibility of 6mA-StackingCV. We implemented 6mA-StackingCV as a user-friendly web application, which allows users to select which representations or learning algorithms to use. This application is freely available at http://www.biolscience.cn/6mA-stackingCV/ . The source code and experimental data are available at https://github.com/Xiaohong-source/6mA-stackingCV .
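The core idea in the abstract, feeding cross-validation output into a final classifier, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the synthetic sequences, the k-mer frequency representations, the toy `CentroidScorer` base learner, and every function name below are assumptions made purely for illustration.

```python
import math
import random
from itertools import product

def kmer_freq(seq, k):
    """Frequency vector over all 4**k k-mers of a DNA sequence."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return [windows.count(m) / max(len(windows), 1) for m in kmers]

class CentroidScorer:
    """Toy base learner (hypothetical): positive score = closer to the positive centroid."""
    def fit(self, X, y):
        self.pos = self._mean([x for x, t in zip(X, y) if t == 1])
        self.neg = self._mean([x for x, t in zip(X, y) if t == 0])
        return self

    @staticmethod
    def _mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def score(self, x):
        return math.dist(x, self.neg) - math.dist(x, self.pos)

def out_of_fold_scores(X, y, n_folds):
    """Score each sample with a model trained on the folds that exclude it."""
    oof = [0.0] * len(X)
    for f in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != f]
        held_out = [i for i in range(len(X)) if i % n_folds == f]
        model = CentroidScorer().fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            oof[i] = model.score(X[i])
    return oof

def logit(w, b, z):
    return sum(wi * zi for wi, zi in zip(w, z)) + b

def fit_logistic(Z, y, lr=0.5, epochs=500):
    """Final (meta) classifier: logistic regression trained by plain SGD."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, t in zip(Z, y):
            s = max(-30.0, min(30.0, logit(w, b, z)))  # clamp to avoid overflow
            g = 1.0 / (1.0 + math.exp(-s)) - t
            w = [wi - lr * g * zi for wi, zi in zip(w, z)]
            b -= lr * g
    return w, b

# Synthetic data (assumption): positives are A-rich, negatives are uniform.
random.seed(0)
y = [i % 2 for i in range(40)]
seqs = ["".join(random.choices("ACGT", weights=[6, 2, 2, 2] if t else [1, 1, 1, 1], k=41))
        for t in y]

# Two representations of the same sequences: 1-mer and 2-mer frequencies.
views = [[kmer_freq(s, k) for s in seqs] for k in (1, 2)]

# One out-of-fold score column per representation becomes the meta-feature matrix.
oof_columns = [out_of_fold_scores(X, y, n_folds=5) for X in views]
Z = [list(row) for row in zip(*oof_columns)]

w, b = fit_logistic(Z, y)
pred = [1 if logit(w, b, z) > 0 else 0 for z in Z]
# Accuracy on the meta-features used for fitting, not an independent test.
acc = sum(p == t for p, t in zip(pred, y)) / len(y)
print(f"meta-classifier accuracy on out-of-fold features: {acc:.2f}")
```

Because each meta-feature comes from a model that never saw the corresponding sample, the final classifier is trained on scores that resemble what the base learners would produce on unseen data. A real pipeline would swap in stronger base learners and evaluate on a held-out independent test set.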


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10680251
DOI: http://dx.doi.org/10.1186/s13040-023-00348-8

Publication Analysis

Top Keywords

stacking ensemble: 8
ensemble model: 8
6ma site: 8
site prediction: 8
6ma-stackingcv: 6
6ma-stackingcv improved: 4
improved stacking: 4
model predicting: 4
predicting dna: 4
dna n6-methyladenine: 4

Similar Publications

GradeDiff-IM: An Ensembles Model-based Grade Classification of Breast Cancer.

Biomed Phys Eng Express

January 2025

School of Engineering and Computing, University of the West of Scotland, Paisley Campus, Paisley, PA1 2BE, United Kingdom.

Cancer grade classification is a challenging task that relies on the cell structure of healthy and abnormal tissues. The practitioner learns about the malignant cells from the grading and plans the treatment strategy accordingly. Many researchers have used deep learning (DL) models for grade classification.


StackDILI: Enhancing Drug-Induced Liver Injury Prediction through Stacking Strategy with Effective Molecular Representations.

J Chem Inf Model

January 2025

Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172 Shenzhen, China.

Drug-induced liver injury (DILI) is a major challenge in drug development, often leading to clinical trial failures and market withdrawals due to liver toxicity. This study presents StackDILI, a computational framework designed to accelerate toxicity assessment by predicting DILI risk. StackDILI integrates multiple molecular descriptors to extract structural and physicochemical features, including the constitution, pharmacophore, MACCS, and E-state descriptors.


In-silico prediction of protein biophysical traits is often hindered by the limited availability of experimental data and their heterogeneity. Training on limited data can lead to overfitting and poor generalizability to sequences distant from those in the training set. Additionally, inadequate use of scarce and disparate data can introduce biases during evaluation, leading to unreliable model performances being reported.


LncSL: A Novel Stacked Ensemble Computing Tool for Subcellular Localization of lncRNA by Amino Acid-Enhanced Features and Two-Stage Automated Selection Strategy.

Int J Mol Sci

December 2024

School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou 213164, China.

Long non-coding RNA (lncRNA) is a non-coding RNA longer than 200 nucleotides, crucial for functions like cell cycle regulation and gene transcription. Accurate localization prediction from sequence information is vital for understanding lncRNA's biological roles. Computational methods offer an effective alternative to traditional experimental methods for annotating lncRNA subcellular positions.

Article Synopsis
  • Small cell lung cancer (SCLC) is a highly aggressive cancer with poor survival rates, and current diagnostic methods are invasive and limited.
  • This study introduces a new machine learning technique that uses metabolomics data to distinguish between SCLC, non-small cell lung cancer (NSCLC), and healthy individuals, achieving high accuracy in classification.
  • Key metabolites were identified as important predictors, and the stacking ensemble model effectively combines different classifiers, providing a promising non-invasive alternative for early lung cancer detection.
