Evaluating the impact of multivariate imputation by MICE in feature selection.

Maritza Mera-Gaona Ursula Neumann Rubiel Vargas-Canas Diego M López

PLoS One

University of Cauca, Popayán, Cauca, Colombia.

Published: November 2021

Background: Handling missing values is a crucial preprocessing step in machine learning. Most available algorithms for feature selection, classification, and estimation require complete datasets. Consequently, in many cases the strategy for dealing with missing values is to use only instances with full data or to replace missing values with a mean, mode, median, or constant value. Discarding incomplete samples or replacing missing values with such basic techniques usually biases subsequent analyses of the datasets.
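The two basic strategies mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the toy array `X` and the helper names `complete_cases` and `mean_impute` are assumptions made for the example. It shows one way bias can arise: mean imputation shrinks the variance of the imputed column.

```python
import numpy as np

def complete_cases(X):
    """Listwise deletion: keep only rows with no missing value."""
    return X[~np.isnan(X).any(axis=1)]

def mean_impute(X):
    """Replace each missing entry with the mean of the observed values in its column."""
    X_filled = X.copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X_filled[rows, cols] = col_means[cols]
    return X_filled

# Hypothetical toy data: 5 samples, 2 features, NaN marks a missing value
X = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [3.0, 30.0],
    [np.nan, 40.0],
    [5.0, 50.0],
])

X_cc = complete_cases(X)   # two of the five samples are discarded
X_mean = mean_impute(X)    # all samples kept, gaps filled with column means

# Variance of the second feature before and after mean imputation
var_observed = np.nanvar(X[:, 1])
var_imputed = np.var(X_mean[:, 1])
print(X_cc.shape, var_observed, var_imputed)
```

Listwise deletion loses samples, while mean imputation keeps them but places every filled value at the column center, so the spread of the feature is understated.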

Aim: To demonstrate the positive impact of multivariate imputation on feature selection in datasets with missing values.

Results: We compared feature selection on complete datasets, on incomplete datasets with missingness rates between 5% and 50%, and on datasets imputed by basic techniques and by multivariate imputation. The feature selection algorithms used are well-known methods. Datasets imputed by multivariate imputation yielded the best feature selection results compared to datasets imputed by basic techniques or non-imputed incomplete datasets.

Conclusions: The evaluation results indicate that applying multivariate imputation by MICE reduces bias in the feature selection process.
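The idea behind chained-equations imputation can be sketched as below. This is a simplified, hedged illustration and not the MICE procedure used in the study: it runs a single deterministic chain with ordinary least-squares regressions, whereas MICE proper draws imputations from a predictive distribution and produces multiple imputed datasets. The synthetic data, the 30% missingness rate, the function name `chained_equations_impute`, and the correlation-based filter score are all assumptions made for the example.

```python
import numpy as np

def chained_equations_impute(X, n_iter=10):
    """Simplified chained-equations imputation (single chain, no random draws).

    Each column with missing values is regressed on all other columns,
    and its missing entries are replaced by the regression predictions.
    """
    missing = np.isnan(X)
    X_imp = X.copy()
    # Initialize missing entries with the column means of the observed values
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(missing)
    X_imp[rows, cols] = col_means[cols]
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not missing[:, j].any():
                continue
            obs = ~missing[:, j]
            others = np.delete(X_imp, j, axis=1)
            A = np.column_stack([np.ones(len(others)), others])
            # Fit on rows where column j is observed, predict where it is missing
            coef, *_ = np.linalg.lstsq(A[obs], X_imp[obs, j], rcond=None)
            X_imp[~obs, j] = A[~obs] @ coef
    return X_imp

# Synthetic data (assumption): feature 1 depends on feature 0, feature 2 is noise
rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)
x1 = 2.0 * x0 + 0.1 * rng.normal(size=n)
x2 = rng.normal(size=n)
X_true = np.column_stack([x0, x1, x2])
y = x1 + 0.5 * rng.normal(size=n)  # target driven by feature 1

# Remove about 30% of feature 1 completely at random
mask = rng.random(n) < 0.3
X_missing = X_true.copy()
X_missing[mask, 1] = np.nan

# Basic technique: mean imputation
X_mean = X_missing.copy()
X_mean[mask, 1] = np.nanmean(X_missing[:, 1])

# Multivariate technique: chained regressions
X_chain = chained_equations_impute(X_missing)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

rmse_mean = rmse(X_mean[mask, 1], X_true[mask, 1])
rmse_chain = rmse(X_chain[mask, 1], X_true[mask, 1])

# A simple filter-style feature score: absolute correlation with the target
corr_mean = abs(np.corrcoef(X_mean[:, 1], y)[0, 1])
corr_chain = abs(np.corrcoef(X_chain[:, 1], y)[0, 1])
print(rmse_mean, rmse_chain, corr_mean, corr_chain)
```

In this toy setting the multivariate approach recovers the missing values more closely because it uses the relationship between features, and the filter score of the affected feature is attenuated less than under mean imputation.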

Download full-text PDF	Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8318311	PMC
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0254720	PLOS

Publication Analysis

Top Keywords

feature selection

multivariate imputation

missing values

selection process

impact multivariate

imputation mice

datasets

complete datasets

imputation feature

basic techniques

Similar Publications

A novel MRI-based radiomics for preoperative prediction of lymphovascular invasion in rectal cancer.

Abdom Radiol (NY)

January 2025

Department of Radiology, Taizhou Municipal Hospital, Taizhou, Zhejiang, China.

Xiaoxiang Ning Dengfa Yang Weiqun Ao Yuwen Guo Li Ding

Background: To develop and validate a clinical-radiomics model for preoperative prediction of lymphovascular invasion (LVI) in rectal cancer.

Methods: This retrospective study included data from 239 patients with pathologically confirmed rectal adenocarcinoma from two centers, all of whom underwent MRI examinations. Cases from the first center (n = 189) were randomly divided into a training set and an internal validation set at a 7:3 ratio, while cases from the second center (n = 50) constituted the external validation set.

nipalsMCIA: Flexible Multi-Block Dimensionality Reduction in R via Nonlinear Iterative Partial Least Squares.

Bioinformatics

January 2025

Department of Pathology and Department of Immunobiology, Yale School of Medicine.

Max Mattessich Joaquin Reyna Edel Aron Ferhat Ay Misha Kilmer

Summary: With the increased reliance on multi-omics data for bulk and single cell analyses, the availability of robust approaches to perform unsupervised learning for clustering, visualization, and feature selection is imperative. We introduce nipalsMCIA, an implementation of multiple co-inertia analysis (MCIA) for joint dimensionality reduction that solves the objective function using an extension to Non-linear Iterative Partial Least Squares (NIPALS). We applied nipalsMCIA to both bulk and single cell datasets and observed significant speed-up over other implementations for data with a large sample size and/or feature dimension.

A prediction study on the occurrence risk of heart disease in older hypertensive patients based on machine learning.

BMC Geriatr

January 2025

Department of Cardiology, The Second Hospital & Clinical Medical School, Lanzhou University, No. 82 Cuiyingmen, Lanzhou, 730000, China.

Fei Si Qian Liu Jing Yu

Objective: Constructing a predictive model for the occurrence of heart disease in elderly hypertensive individuals, aiming to provide early risk identification.

Methods: A total of 934 participants aged 60 and above from the China Health and Retirement Longitudinal Study with a 7-year follow-up (2011-2018) were included. Machine learning methods (logistic regression, XGBoost, DNN) were employed to build a model predicting heart disease risk in hypertensive patients.

Identification of biomarkers for knee osteoarthritis through clinical data and machine learning models.

Sci Rep

January 2025

Department of Orthopaedics, Traditional Chinese Medical Hospital of Gansu Province, Qilihe District, Guazhou Street 418, Lanzhou, 730050, Gansu, China.

Wei Chen Haotian Zheng Binglin Ye Tiefeng Guo Yude Xu

Knee osteoarthritis (KOA) represents a progressive degenerative disorder characterized by the gradual erosion of articular cartilage. This study aimed to develop and validate biomarker-based predictive models for KOA diagnosis using machine learning techniques. Clinical data from 2594 samples were obtained and stratified into training and validation datasets in a 7:3 ratio.

Harnessing advanced hybrid deep learning model for real-time detection and prevention of man-in-the-middle cyber attacks.

Sci Rep

January 2025

Department of Electronics and Communication Engineering, Panimalar Engineering College, Chennai, India.

V Kandasamy A Ameelia Roseline

The growing number of connected devices in smart home environments has amplified security risks, particularly from Man-in-the-Middle (MitM) attacks. These attacks allow cybercriminals to intercept and manipulate communication streams between devices, often remaining undetected. Traditional rule-based methods struggle to cope with the complexity of these attacks, creating a need for more advanced, adaptive intrusion detection systems.
