Dual-stage optimizer for systematic overestimation adjustment applied to multi-objective genetic algorithms for biomarker selection.

Brief Bioinform

School of Medicine, Institute of Biomedicine, University of Eastern Finland, Yliopistonranta 1, PO Box 1627, 70211 Kuopio, Finland.

Published: November 2024

The selection of biomarker panels in omics data, challenged by numerous molecular features and limited samples, often requires machine learning methods paired with wrapper feature selection techniques, such as genetic algorithms. These techniques test various feature sets (potential biomarker solutions) to fine-tune a machine learning model's performance on supervised tasks, such as classifying cancer subtypes. The optimization uses validation sets to evaluate and identify the most effective feature combinations. These evaluations carry a performance estimation error, measurable as the discrepancy between validation and test set performance, and when the selection involves many models, the performance of the best ones is almost certainly overestimated. This issue is also relevant in multi-objective feature selection, where several characteristics of the biomarker panels are optimized, such as predictive performance and feature set size. Methods have been proposed to reduce the overestimation after a model has already been selected in single-objective problems, but no existing algorithm could reduce the overestimation during the optimization, improve model selection, or be applied in the more general multi-objective domain. We propose Dual-stage Optimizer for Systematic overestimation Adjustment in Multi-Objective problems (DOSA-MO), a novel multi-objective optimization wrapper algorithm that learns how the original estimation, its variance, and the feature set size of the solutions predict the overestimation. DOSA-MO adjusts the expectation of the performance during the optimization, improving the composition of the solution set. We verify that DOSA-MO improves the performance of a state-of-the-art genetic algorithm on left-out or external sample sets, when predicting cancer subtypes and/or patient overall survival, using three transcriptomics datasets for kidney and breast cancer.
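The overestimation described in the abstract, and the idea of predicting it from the original estimate, its variability, and the feature set size, can be illustrated with a minimal sketch. This is not the DOSA-MO implementation: the data are simulated, the function names (`best_of_many_overestimation`, `fit_overestimation_model`, `adjust`) are hypothetical, and an ordinary least-squares fit stands in for whatever regression model the authors actually use.

```python
import numpy as np


def best_of_many_overestimation(n_models=200, n_val=50, true_acc=0.7, seed=0):
    """Gap between the best validation accuracy and the true accuracy
    when all candidates are, by construction, equally good."""
    rng = np.random.default_rng(seed)
    # Every candidate shares the same true accuracy; only sampling noise differs.
    val_acc = rng.binomial(n_val, true_acc, size=n_models) / n_val
    return val_acc.max() - true_acc


def design_matrix(val_mean, val_std, n_features):
    # Predictors named in the abstract: estimate, its variability, panel size.
    return np.column_stack([np.ones(len(val_mean)), val_mean, val_std, n_features])


def fit_overestimation_model(val_mean, val_std, n_features, test_score):
    """Least-squares fit of overestimation (validation minus test score)."""
    X = design_matrix(val_mean, val_std, n_features)
    coef, *_ = np.linalg.lstsq(X, val_mean - test_score, rcond=None)
    return coef


def adjust(coef, val_mean, val_std, n_features):
    """Subtract the predicted overestimation from the validation estimate."""
    return val_mean - design_matrix(val_mean, val_std, n_features) @ coef


# Simulated candidate solutions (all distributions here are assumptions).
rng = np.random.default_rng(0)
n = 500
n_features = rng.integers(1, 51, size=n)
true_perf = 0.6 + 0.2 * rng.random(n)
val_std = 0.02 + 0.001 * n_features          # assume larger panels are noisier
val_mean = true_perf + rng.normal(0.0, val_std)
test_score = true_perf + rng.normal(0.0, 0.01, size=n)

selection_gap = best_of_many_overestimation()
coef = fit_overestimation_model(val_mean, val_std, n_features, test_score)
adjusted = adjust(coef, val_mean, val_std, n_features)
raw_mse = np.mean((val_mean - test_score) ** 2)
adj_mse = np.mean((adjusted - test_score) ** 2)
print(f"best-of-many gap: {selection_gap:.3f}")
print(f"MSE vs test score, raw: {raw_mse:.5f}, adjusted: {adj_mse:.5f}")
```

The first function shows why picking the top validation score among many candidates inflates the estimate even when no candidate is truly better. The second part shows one plausible way an adjustment could be learned; in a real wrapper the adjustment model would have to be fitted on data separate from the samples used for the final evaluation.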

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11684899
DOI: http://dx.doi.org/10.1093/bib/bbae674

Publication Analysis

Top Keywords (with counts)

dual-stage optimizer: 8
optimizer systematic: 8
systematic overestimation: 8
overestimation adjustment: 8
genetic algorithms: 8
biomarker panels: 8
machine learning: 8
feature selection: 8
cancer subtypes: 8
feature set: 8

Similar Publications

Dual-stage excitation source improves the analytical sensitivity of miniaturized optical emission spectrometer.

Talanta

January 2025

Department of Chemistry, School of Forensic Medicine, China Medical University, Shenyang, 110122, China.

Miniaturized optical emission spectrometric (OES) devices based on various microplasma excitation sources provide a reliable tool for in-situ elemental analysis. The key to improving analytical performance is enhancing the excitation capability of the microplasma source in these devices. Here, dielectric barrier discharge (DBD) and point discharge (PD) technologies are combined to construct an enhanced dual-stage excitation source (called DBD-PD), which improves the overall excitation efficiency and OES signal sensitivity.

This study proposes a dual-stage model for classifying Parkinson's disease severity through a detailed analysis of gait signals, using force sensors and machine learning approaches. Parkinson's disease is the primary neurodegenerative disorder that results in a gradual reduction in motor function. Early detection and monitoring of disease progression are highly challenging because symptoms progress gradually and conventional methods are inadequate for identifying subtle changes in mobility.

"Pseudosubstrate Envelope"/Free Energy Perturbation-Guided Design and Mechanistic Investigations of Benzothiazole HIV Capsid Modulators with High Ligand Efficiency.

J Med Chem

November 2024

Department of Medicinal Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Shandong University, 44 West Culture Road, Jinan, Shandong 250012, PR China.

Based on our proposed "pseudosubstrate envelope" concept, 25 benzothiazole-bearing HIV capsid protein (CA) modulators were designed and synthesized under the guidance of free energy perturbation technology. The most potent compound, , exhibited an EC of 2.69 nM against HIV-1, being 393 times more potent than the positive control PF74.

Dual-Stage Reduction Strategy of Tin Perovskite Enables High Performance Photovoltaics.

Angew Chem Int Ed Engl

January 2025

Institute of Functional Nano & Soft Materials (FUNSOM), Jiangsu Key Laboratory of Advanced Negative Carbon Technologies, Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, 215123, China.

Article Synopsis
  • The study addresses the issue of rapid oxidation of tin (Sn) in tin-based perovskite solar cells (TPSCs), which limits their efficiency and stability.
  • A novel method was developed using thiosulfate ions in the precursor solution to facilitate a dual-stage reduction process that minimizes Sn oxidation and defects, enhancing device stability.
  • The resulting thiosulfate-incorporated solar cells achieved a notable efficiency of 14.78% and maintained 90% of their initial performance after 628 hours of testing.
