Study of the Applicability Domain of the QSAR Classification Models by Means of the Rivality and Modelability Indexes.

Irene Luque Ruiz, Miguel Ángel Gómez-Nieto

Molecules

Department of Computing and Numerical Analysis, Campus Universitario de Rabanales, Albert Einstein Building, University of Córdoba, E-14071 Córdoba, Spain.

Published: October 2018

The reliability of a QSAR classification model depends on its capacity to make confident predictions for new compounds that were not used to build the model. The results of this external validation process reveal the applicability domain (AD) of the QSAR model and, therefore, how robustly the model predicts the property/activity of new molecules. In this paper we propose using the rivality and modelability indexes to study whether a dataset can be correctly modeled by a QSAR algorithm and to predict how reliably the resulting model will predict the property/activity of new molecules. Both indexes have a very low computational cost and do not require a model to be built, which makes them good tools for analyzing datasets in the first stages of building QSAR classification models. In our study, we selected two benchmark datasets with a similar number of molecules but very different modelability, and we corroborated the predictive capacity of the rivality and modelability indexes against classification models built with Support Vector Machine and Random Forest algorithms, using 5-fold cross-validation and leave-one-out techniques. The results show the excellent ability of both indexes to predict outliers and the applicability domain of the QSAR classification models. In all cases, the index values accurately predicted the statistical parameters of the QSAR models generated by the algorithms.
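The two validation schemes named in the abstract, 5-fold cross-validation and leave-one-out, can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function names are hypothetical, the splits are unshuffled and unstratified for brevity, and a 1-nearest-neighbour rule stands in for the Support Vector Machine and Random Forest classifiers used in the study.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def k_fold_splits(n: int, k: int = 5) -> Iterator[Split]:
    """Yield k contiguous (train, test) splits; fold sizes differ by at most one."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def leave_one_out_splits(n: int) -> Iterator[Split]:
    """Leave-one-out is k-fold with k == n: every sample is the test set once."""
    return k_fold_splits(n, k=n)


def one_nn(X_train: Sequence[Sequence[float]], y_train: Sequence[int],
           X_test: Sequence[Sequence[float]]) -> List[int]:
    """Stand-in classifier (assumption): label of the nearest training point."""
    preds = []
    for x in X_test:
        nearest = min(
            range(len(X_train)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, X_train[j])),
        )
        preds.append(y_train[nearest])
    return preds


def cv_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                fit_predict: Callable, splits: Iterable[Split]) -> float:
    """Fraction of held-out samples predicted correctly over all splits."""
    correct = total = 0
    for train, test in splits:
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test])
        correct += sum(p == y[i] for p, i in zip(preds, test))
        total += len(test)
    return correct / total


if __name__ == "__main__":
    # Toy, made-up data: one descriptor per "molecule", two classes.
    X = [[0.0], [1.0], [0.1], [0.9], [0.2], [1.1], [0.3], [0.8], [0.15], [0.95]]
    y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
    print("5-fold accuracy:", cv_accuracy(X, y, one_nn, k_fold_splits(len(X), 5)))
    print("LOO accuracy:", cv_accuracy(X, y, one_nn, leave_one_out_splits(len(X))))
```

In practice the samples would normally be shuffled or stratified by class before splitting, since contiguous folds on class-sorted data can leave a class out of the training set.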

Download full-text PDF	Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6278359	PMC
http://dx.doi.org/10.3390/molecules23112756	DOI Listing

Similar Publications

Integrating machine learning and nano-QSAR models to predict the oxidative stress potential caused by single and mixed carbon nanomaterials in algal cells.

Environ Toxicol Chem

January 2025

School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing, PR China.

Qi Qi, Zhuang Wang

In silico methods are increasingly important in predicting the ecotoxicity of engineered nanomaterials (ENMs), encompassing both individual and mixture toxicity predictions. It is widely recognized that ENMs trigger oxidative stress by generating intracellular reactive oxygen species (ROS), a key mechanism of their cytotoxicity. However, existing in silico methods still face significant challenges in predicting the oxidative stress effects induced by ENMs.

Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs.

Sci Rep

January 2025

Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.

Arkaprava Banerjee, Kunal Roy

We have adopted the classification Read-Across Structure-Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to as simply "descriptors" in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as "fingerprints" in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models.

QSAR Classification Modeling Using Machine Learning with a Consensus-Based Approach for Multivariate Chemical Hazard End Points.

ACS Omega

December 2024

Computational Medicine Lab, Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39177, Republic of Korea.

Yunendah Nur Fuadah, Muhammad Adnan Pramudito, Lulu Firdaus, Frederique J Vanheusden, Ki Moo Lim

This study introduces an innovative computational approach using hybrid machine learning models to predict toxicity across eight critical end points: cardiac toxicity, inhalation toxicity, dermal toxicity, oral toxicity, skin irritation, skin sensitization, eye irritation, and respiratory irritation. Leveraging advanced cheminformatics tools, we extracted relevant features from curated data sets, incorporating a range of descriptors such as Morgan circular fingerprints, MACCS keys, Mordred calculation descriptors, and physicochemical properties. The consensus model was developed by selecting the best-performing classifier for each descriptor, whether Random Forest (RF), eXtreme Gradient Boosting (XGBoost), or Support Vector Machines (SVM), optimizing predictive accuracy and robustness across the end points.

Comprehensive benchmarking of computational tools for predicting toxicokinetic and physicochemical properties of chemicals.

J Cheminform

December 2024

Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy.

Domenico Gadaleta, Eva Serrano-Candelas, Rita Ortega-Vallbona, Erika Colombo, Marina Garcia de Lomana

Ensuring the safety of chemicals for environmental and human health involves assessing physicochemical (PC) and toxicokinetic (TK) properties, which are crucial for absorption, distribution, metabolism, excretion, and toxicity (ADMET). Computational methods play a vital role in predicting these properties, given the current trends in reducing experimental approaches, especially those that involve animal experimentation. In the present manuscript, twelve software tools implementing Quantitative Structure-Activity Relationship (QSAR) models were selected for the prediction of 17 relevant PC and TK properties.

Evaluation of Machine Learning Based QSAR Models for the Classification of Lung Surfactant Inhibitors.

Environ Health (Wash)

December 2024

Department of Environmental Science, Baylor University, Waco, Texas 76798-7266, United States.

James Y Liu, Joshua Peeples, Christie M Sayes

Inhaled chemicals can cause dysfunction in the lung surfactant, a protein-lipid complex with critical biophysical and biochemical functions. This inhibition has many structure-related and dose-dependent mechanisms, making hazard identification challenging. We developed quantitative structure-activity relationships for predicting lung surfactant inhibition using machine learning.
