The reliability of a QSAR classification model depends on its capacity to make confident predictions for new compounds that were not used to build the model. The results of this external validation show the applicability domain (AD) of the QSAR model and, therefore, how robustly the model predicts the property/activity of new molecules. In this paper we propose using the rivality and modelability indexes to study whether a dataset can be correctly modeled by a QSAR algorithm and to predict how reliably the resulting model will predict the property/activity of new molecules. Calculating these indexes has a very low computational cost and does not require building a model, which makes them good tools for analyzing datasets in the first stages of building QSAR classification models. In our study, we selected two benchmark datasets with a similar number of molecules but very different modelability, and we corroborated the predictive capacity of the rivality and modelability indexes against classification models built with Support Vector Machine and Random Forest algorithms using 5-fold cross-validation and leave-one-out techniques. The results show the excellent ability of both indexes to predict outliers and the applicability domain of the QSAR classification models. In all cases, the index values accurately predicted the statistical parameters of the QSAR models generated by the algorithms.
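The abstract does not give formulas for the two indexes, so the following is only a minimal sketch under stated assumptions, not the authors' definition. It assumes a nearest-neighbour formulation: for each molecule, the rivality index compares the distance to its nearest neighbour of the same class with the distance to its nearest neighbour of a different class, and the modelability score is the class-averaged fraction of molecules with a negative rivality index. The function names, the Euclidean distance, and the toy descriptor values are all hypothetical. The sketch illustrates why no model needs to be trained: only pairwise distances are used.

```python
from math import dist


def rivality_index(points, labels):
    """Assumed formulation: (d_same - d_rival) / (d_same + d_rival), in [-1, 1].

    Negative values mean the nearest neighbour shares the molecule's class;
    positive values mean a molecule of another class is closer.
    """
    values = []
    for i, (p, label) in enumerate(zip(points, labels)):
        same = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
                if j != i and l == label]
        rival = [dist(p, q) for q, l in zip(points, labels) if l != label]
        if not same or not rival:
            raise ValueError("need two classes with at least two members each")
        d_same, d_rival = min(same), min(rival)
        total = d_same + d_rival
        # Coincident neighbours from both classes give no preference either way.
        values.append(0.0 if total == 0 else (d_same - d_rival) / total)
    return values


def modelability(points, labels):
    """Assumed score: per-class fraction of negative rivality values, averaged."""
    ri = rivality_index(points, labels)
    fractions = []
    for cls in set(labels):
        members = [r for r, l in zip(ri, labels) if l == cls]
        fractions.append(sum(r < 0 for r in members) / len(members))
    return sum(fractions) / len(fractions)


# Hypothetical one-dimensional descriptors for five molecules in two classes.
points = [(0.0,), (1.0,), (10.0,), (11.0,), (1.5,)]
labels = [0, 0, 1, 1, 1]
ri = rivality_index(points, labels)
score = modelability(points, labels)
```

In this toy data the last molecule sits inside the other class's region, so its rivality value is positive, which is the kind of molecule such an index would flag as a likely outlier.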
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6278359 | PMC |
| http://dx.doi.org/10.3390/molecules23112756 | DOI Listing |
Environ Toxicol Chem
January 2025
School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing, PR China.
In silico methods are increasingly important in predicting the ecotoxicity of engineered nanomaterials (ENMs), encompassing both individual and mixture toxicity predictions. It is widely recognized that ENMs trigger oxidative stress effects by generating intracellular reactive oxygen species (ROS), serving as a key mechanism in their cytotoxicity studies. However, existing in silico methods still face significant challenges in predicting the oxidative stress effects induced by ENMs.
Sci Rep
January 2025
Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.
We have adopted the classification Read-Across Structure-Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to as simply "descriptors" in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as "fingerprints" in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models.
ACS Omega
December 2024
Computational Medicine Lab, Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39177, Republic of Korea.
This study introduces an innovative computational approach using hybrid machine learning models to predict toxicity across eight critical end points: cardiac toxicity, inhalation toxicity, dermal toxicity, oral toxicity, skin irritation, skin sensitization, eye irritation, and respiratory irritation. Leveraging advanced cheminformatics tools, we extracted relevant features from curated data sets, incorporating a range of descriptors such as Morgan circular fingerprints, MACCS keys, Mordred calculation descriptors, and physicochemical properties. The consensus model was developed by selecting the best-performing classifier, whether Random Forest (RF), eXtreme Gradient Boosting (XGBoost), or Support Vector Machines (SVM), for each descriptor, optimizing predictive accuracy and robustness across the end points.
J Cheminform
December 2024
Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy.
Ensuring the safety of chemicals for environmental and human health involves assessing physicochemical (PC) and toxicokinetic (TK) properties, which are crucial for absorption, distribution, metabolism, excretion, and toxicity (ADMET). Computational methods play a vital role in predicting these properties, given the current trends in reducing experimental approaches, especially those that involve animal experimentation. In the present manuscript, twelve software tools implementing Quantitative Structure-Activity Relationship (QSAR) models were selected for the prediction of 17 relevant PC and TK properties.
Environ Health (Wash)
December 2024
Department of Environmental Science, Baylor University, Waco, Texas 76798-7266, United States.
Inhaled chemicals can cause dysfunction in the lung surfactant, a protein-lipid complex with critical biophysical and biochemical functions. This inhibition has many structure-related and dose-dependent mechanisms, making hazard identification challenging. We developed quantitative structure-activity relationships for predicting lung surfactant inhibition using machine learning.