Building Highly Reliable Quantitative Structure-Activity Relationship Classification Models Using the Rivality Index Neighborhood Algorithm with Feature Selection.

Irene Luque Ruiz, Miguel Ángel Gómez-Nieto

J Chem Inf Model

Department of Computing and Numerical Analysis, University of Córdoba, Campus de Rabanales, Albert Einstein Building, E-14071 Córdoba, Spain.

Published: January 2020

Reducing the dimensionality of the data set representation used to build quantitative structure-activity relationship (QSAR) classification models is an important research subject, both for the interpretability of the models and for the computational cost of the classification algorithms. Feature selection techniques are appropriate because only a small number of relevant features should be used in the classification process; irrelevant and redundant features should be discarded, as should noninterpretable ones. In this paper, we propose an embedded feature selection technique for the construction of classification models using the rivality index neighborhood (RINH) algorithm. The technique applies a filter selection in the preprocessing stage, using the selectivity of the features as the selection criterion, and a wrapper technique in the processing stage, based on the improvement in accuracy and reliability of the models generated using the RINH algorithm with the LTN and GTN functions. Results obtained using the RINH algorithm with and without feature selection, compared with those obtained using 14 machine learning algorithms, demonstrate that the proposed feature selection technique builds clearly more accurate and reliable models, reduces the data dimensionality by around 90%, and generates highly robust and interpretable models.
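The abstract does not define the selectivity criterion or the internals of the RINH algorithm, so the following is only a minimal sketch of the general filter-then-wrapper pattern it describes, not the authors' method. Everything named here is an assumption made for illustration: `selectivity_score` is a hypothetical stand-in filter score (gap between class means in pooled standard deviation units), a leave-one-out 1-nearest-neighbour classifier stands in for RINH, and the data are synthetic.

```python
import numpy as np

def selectivity_score(X, y):
    """Hypothetical filter score: gap between class means in pooled-std units."""
    a, b = X[y == 0], X[y == 1]
    pooled = np.sqrt(0.5 * (a.var(axis=0) + b.var(axis=0))) + 1e-12
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / pooled

def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-NN classifier (stand-in for RINH)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample may not be its own neighbour
    return float(np.mean(y[d.argmin(axis=1)] == y))

def filter_then_wrapper(X, y, keep=10):
    # Filter stage: keep only the `keep` best-scoring features.
    order = np.argsort(selectivity_score(X, y))[::-1][:keep]
    candidates = [int(f) for f in order]
    # Wrapper stage: greedy forward selection; a feature is added only
    # if it improves the leave-one-out accuracy of the stand-in model.
    selected, best = [], 0.0
    improved = True
    while improved and candidates:
        improved = False
        scores = [loo_1nn_accuracy(X[:, selected + [f]], y) for f in candidates]
        i = int(np.argmax(scores))
        if scores[i] > best:
            best = scores[i]
            selected.append(candidates.pop(i))
            improved = True
    return selected, best

# Synthetic demo: 80 samples, 20 features, only features 0 and 1 carry signal.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 40)
X = rng.normal(size=(80, 20))
X[y == 1, :2] += 4.0  # shift the two informative features for class 1

selected, accuracy = filter_then_wrapper(X, y, keep=10)
print("selected features:", selected)
print("leave-one-out accuracy:", round(accuracy, 3))
```

The filter stage is cheap and prunes most features before the more expensive wrapper stage, which evaluates candidate subsets with the classifier itself. Swapping in a different score or classifier only requires replacing the two helper functions.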

DOI: http://dx.doi.org/10.1021/acs.jcim.9b00706

Similar Publications

BaNDyT: Bayesian Network Modeling of Molecular Dynamics Trajectories.

J Chem Inf Model

January 2025

Department of Computational and Quantitative Medicine, Beckman Research Institute of the City of Hope, 1218 S 5th Ave, Monrovia, California 91016, United States.

Elizaveta Mukhaleva, Babgen Manookian, Hanyu Chen, Indira R Sivaraj, Ning Ma

Bayesian network modeling (BN modeling, or BNM) is an interpretable machine learning method for constructing probabilistic graphical models from the data. In recent years, it has been extensively applied to diverse types of biomedical data sets. Concurrently, our ability to perform long-time scale molecular dynamics (MD) simulations on proteins and other materials has increased exponentially.

Evaluation of a Machine Learning-Guided Strategy for Elevated Lipoprotein(a) Screening in Health Systems.

Circ Genom Precis Med

January 2025

Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT (A.A., L.S.D., E.K.O., R.K.).

Arya Aminorroaya, Lovedeep S Dhingra, Evangelos K Oikonomou, Rohan Khera

Background: While universal screening for lipoprotein(a) (Lp(a)) is increasingly recommended, <0.5% of patients undergo Lp(a) testing. Here, we assessed the feasibility of deploying Algorithmic Risk Inspection for Screening Elevated Lp(a) (ARISE), a validated machine learning tool, to health system electronic health records to increase the yield of Lp(a) testing.

Construction of a poor prognosis prediction and visualization system for intracranial aneurysm endovascular intervention treatment based on an improved machine learning model.

Front Neurol

January 2025

Department of Neurology, The Third People's Hospital of Yibin, Yibin, China.

Chunyu Lei, Anhui Fu, Bin Li, Shengfu Zhou, Jun Liu

Objective: To evaluate the clinical utility of improved machine learning models in predicting poor prognosis following endovascular intervention for intracranial aneurysms and to develop a corresponding visualization system.

Methods: A total of 303 patients with intracranial aneurysms treated with endovascular intervention at four hospitals (FuShun County Zigong City People's Hospital, Nanchong Central Hospital, The Third People's Hospital of Yibin, The Sixth People's Hospital of Yibin) from January 2022 to September 2023 were selected. These patients were divided into a good prognosis group (n = 207) and a poor prognosis group (n = 96).

Prognostic analysis of endovascular mechanical thrombectomy in stroke patients with acute internal carotid artery obstruction based on circle of Willis variation.

Front Neurol

January 2025

Department of Neurosurgery, Shaoxing People's Hospital, Shaoxing, China.

Tianlun Qiu, Huagang Luo, Wuqiao Bao

Objective: Endovascular mechanical thrombectomy (EVMT) is widely employed in patients with acute intracranial carotid artery occlusion (AIICAO). This study aimed to predict the outcomes of EVMT following AIICAO by utilizing anatomic classification of the circle of Willis and its relative position to the thrombus.

Methods: In this study, we retrospectively analyzed a cohort of 108 patients with AIICAO who underwent endovascular mechanical thrombectomy (EVMT) at Shaoxing People's Hospital.

Multimodal machine learning for analysing multifactorial causes of disease-The case of childhood overweight and obesity in Mexico.

Front Public Health

January 2025

Karolinska Institutet, Department of Medicine Solna, Division of Clinical Epidemiology, Stockholm, Sweden.

Rosario Silva Sepulveda, Magnus Boman

Background: Mexico has one of the highest global incidences of paediatric overweight and obesity. Public health interventions have shown only moderate success, possibly from relying on knowledge extracted using limited types of statistical data analysis methods.

Purpose: To explore if multimodal machine learning can enhance identifying predictive features from obesogenic environments and investigating complex disease or social patterns, using the Mexican National Health and Nutrition Survey.
