The One-Class Classification Approach to Data Description and to Models Applicability Domain.

Mol Inform

Laboratoire d'Infochimie, UMR 7177 CNRS, Université de Strasbourg, 4, rue B. Pascal, Strasbourg 67000, France.

Published: September 2010

In this paper, we associate the applicability domain (AD) of QSAR/QSPR models with the region of the input (descriptor) space in which the density of training data points exceeds a certain threshold. It can be shown that the predictive performance of models built on the training set is higher for test compounds inside the high-density region than for those outside it. Instead of searching for a decision surface separating high- and low-density regions in the input space, the one-class classification 1-SVM approach looks for a hyperplane in the associated feature space. Unlike other AD definitions reported in the literature, this approach: (i) is purely "data-based", i.e., it assigns the same AD to all models built on the same training set; (ii) provides results that depend only on the initial pool of descriptors generated for the training set; (iii) can be used with a very large number of descriptors, as well as in the framework of structured kernel-based approaches, e.g., chemical graph kernels. The developed approach has been applied to improve the performance of QSPR models for the stability constants of complexes of organic ligands with alkaline-earth metals in water.
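The density-threshold definition of the AD can be illustrated with a minimal sketch. It is not the 1-SVM method of the paper: it uses a plain Gaussian kernel (Parzen-window) density estimate as a simpler stand-in, and the descriptor values, the `gamma` width, and all function names are hypothetical choices made for illustration only.

```python
import math

def gaussian_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two descriptor vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def density_score(x, training, gamma):
    """Average kernel similarity of x to the training points (unnormalised density)."""
    return sum(gaussian_kernel(x, t, gamma) for t in training) / len(training)

def fit_threshold(training, gamma, outlier_fraction=0.0):
    """Pick a threshold so that roughly outlier_fraction of training points fall below it."""
    scores = sorted(density_score(t, training, gamma) for t in training)
    index = min(int(outlier_fraction * len(scores)), len(scores) - 1)
    return scores[index]

def in_domain(x, training, gamma, threshold):
    """A compound is inside the AD if its density score reaches the threshold."""
    return density_score(x, training, gamma) >= threshold

# Hypothetical 2-D descriptor vectors for a small training set (assumed values).
training_set = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (-0.1, 0.1), (0.0, -0.2), (0.15, -0.05),
]
gamma = 1.0  # assumed kernel width parameter
threshold = fit_threshold(training_set, gamma)

near_compound = (0.05, 0.05)  # close to the training data
far_compound = (5.0, 5.0)     # far from the training data

near_inside = in_domain(near_compound, training_set, gamma, threshold)
far_inside = in_domain(far_compound, training_set, gamma, threshold)
```

Because the threshold depends only on the training descriptors, the same domain applies to every model built on that training set, which mirrors point (i) above. The 1-SVM described in the abstract instead learns a hyperplane in the kernel feature space rather than thresholding an equally weighted kernel sum.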

Source
http://dx.doi.org/10.1002/minf.201000063
