Background: The virtual screening of large compound databases is an important application of structure-activity relationship models. Because these data sets are structurally highly diverse, machine-learning-based QSAR models, which rely on a specific training set, cannot give reliable results for all compounds. It is therefore important to consider the subset of chemical space in which the model is applicable. Most approaches to this problem published so far use vectorial descriptor representations to define this domain of applicability of the model. Unfortunately, these cannot be extended easily to structured kernel-based machine learning models. For this reason, we propose three approaches to estimate the domain of applicability of a kernel-based QSAR model.
Results: We evaluated three kernel-based applicability domain estimations using three different structured kernels on three virtual screening tasks. Each experiment consisted of training a kernel-based QSAR model using support vector regression and ranking a disjoint screening data set according to the predicted activity. For each prediction, the applicability of the model to the respective compound was described quantitatively by a score obtained from an applicability domain formulation. The suitability of each applicability domain estimation was evaluated by comparing the model performance on the subsets of the screening data sets obtained with different thresholds for the applicability scores. This comparison indicates that it is possible to separate the part of chemical space in which the model gives reliable predictions from the part consisting of structures too dissimilar to the training set for the model to be applied successfully. A closer inspection reveals that the virtual screening performance of the model improves considerably if the half of the molecules with the lowest applicability scores is omitted from the screening.
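The thresholding step described above can be illustrated with a small sketch. The abstract does not spell out the three applicability domain formulations, so the score used here (the mean of the k largest kernel values between a screening compound and the training set) is only a hypothetical stand-in, as are the names `tanimoto`, `applicability_score`, and `filter_by_applicability`. Compounds are represented as toy feature sets rather than molecular graphs, and a set-based Tanimoto similarity takes the place of a structured kernel.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Toy stand-in (assumption): a compound is a set of substructure keys.
Compound = FrozenSet[str]
Kernel = Callable[[Compound, Compound], float]


def tanimoto(a: Compound, b: Compound) -> float:
    """Tanimoto similarity on feature sets, used here in place of a structured kernel."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def applicability_score(x: Compound, train: Sequence[Compound],
                        kernel: Kernel = tanimoto, k: int = 3) -> float:
    """Hypothetical score: mean of the k largest kernel values to the training set."""
    sims = sorted((kernel(x, t) for t in train), reverse=True)
    top = sims[:k]
    return sum(top) / len(top)


def filter_by_applicability(screen: Sequence[Compound], train: Sequence[Compound],
                            keep_fraction: float = 0.5, kernel: Kernel = tanimoto,
                            k: int = 3) -> Tuple[List[int], List[float]]:
    """Return the indices of the highest-scoring compounds and all scores."""
    scores = [applicability_score(x, train, kernel, k) for x in screen]
    order = sorted(range(len(screen)), key=lambda i: scores[i], reverse=True)
    n_keep = max(1, int(keep_fraction * len(screen)))
    return sorted(order[:n_keep]), scores


train = [frozenset("abc"), frozenset("abd"), frozenset("bcd")]
screen = [frozenset("abc"), frozenset("xyz"), frozenset("abx"), frozenset("wxyz")]

# Keep the half of the screening set that is most similar to the training data.
kept, scores = filter_by_applicability(screen, train, keep_fraction=0.5, k=2)
print(kept, scores)
```

In a screening setting, the model's ranking would then be evaluated only on the compounds whose indices appear in `kept`, and the result compared with the ranking over the full screening set.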
Conclusion: The proposed applicability domain formulations for kernel-based QSAR models can successfully identify compounds for which no reliable predictions can be expected from the model. The resulting reduction of the search space and the elimination of some of the active compounds should not be considered a drawback, because the results indicate that, in most cases, the model would not have found these omitted ligands anyway.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2851576 | PMC |
| http://dx.doi.org/10.1186/1758-2946-2-2 | DOI Listing |