Robust Random Forest-Based All-Relevant Feature Ranks for Trustworthy AI.

Stud Health Technol Inform

Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.

Published: May 2022

Feature selection is a fundamental challenge in machine learning. In bioinformatics, for instance, it is essential for detecting biomarkers. Tree-based methods are predominantly used for this purpose. In this paper, we study the stability of the feature selection methods BORUTA, VITA, and RRF (regularized random forest). In particular, we investigate the instability of the feature rankings produced by these stochastic algorithms. To stabilize the feature ranks, we propose computing consensus values from multiple feature selection runs using rank aggregation techniques. Our results show that these consolidated feature ranks are more accurate and robust, which helps to make practical machine learning applications more trustworthy.
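The consensus idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which rank aggregation technique is applied, so the mean-rank (Borda-style) rule below is an assumption, not the authors' method. The function names, the noise model in `simulate_run`, and the importance values are hypothetical; `simulate_run` only stands in for one stochastic run of a selector such as BORUTA, VITA, or RRF.

```python
import random
from statistics import mean


def ranks_from_scores(scores):
    """Convert importance scores to ranks (1 = most important).

    Tied scores receive the average of the positions they occupy.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next feature has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def aggregate_ranks(runs):
    """Mean-rank consensus over several runs of per-feature importance scores."""
    per_run = [ranks_from_scores(r) for r in runs]
    n_features = len(per_run[0])
    return [mean(run[j] for run in per_run) for j in range(n_features)]


def simulate_run(true_importance, noise, rng):
    """Hypothetical stand-in for one stochastic feature selection run."""
    return [s + rng.gauss(0.0, noise) for s in true_importance]


rng = random.Random(0)
true_importance = [0.9, 0.7, 0.5, 0.1, 0.05]  # made-up values for illustration
runs = [simulate_run(true_importance, 0.2, rng) for _ in range(25)]

consensus = aggregate_ranks(runs)
# Feature indices ordered from most to least important by consensus rank.
consensus_order = sorted(range(len(consensus)), key=lambda j: consensus[j])
print(consensus_order)
```

Any single run in `runs` may order the features differently because of the injected noise, while the averaged ranks vary less from one repetition to the next. Other aggregation rules, such as the median rank, could be swapped into `aggregate_ranks` without changing the rest of the sketch.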

Source
http://dx.doi.org/10.3233/SHTI220418
