Objectives: The aim of this study was to investigate how to improve predictions from Gaussian Process models by optimising the model hyperparameters.

Methods: Optimisation methods, including Grid Search, Conjugate Gradient, Random Search, Evolutionary Algorithm and Hyper-prior, were evaluated and applied to previously published data. Data sets were also reduced in size in a structured manner that retained the range, or 'chemical space', of the key descriptors, in order to assess the effect of the data range on model quality.
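As an illustration of two of the simpler strategies named above, the following is a minimal sketch of Grid Search and Random Search over Gaussian Process hyperparameters, scored by the log marginal likelihood. It is not the authors' implementation: the squared-exponential kernel, the one-dimensional toy data, the search bounds and all function names are assumptions chosen for the example, and the Conjugate Gradient, Evolutionary Algorithm and Hyper-prior approaches are not shown.

```python
import math
import random


def rbf_kernel(xs, length_scale, noise_var, signal_var=1.0):
    """Squared-exponential covariance matrix with noise added on the diagonal."""
    n = len(xs)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = xs[i] - xs[j]
            K[i][j] = signal_var * math.exp(-0.5 * d * d / length_scale ** 2)
        K[i][i] += noise_var
    return K


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def log_marginal_likelihood(xs, ys, length_scale, noise_var, signal_var=1.0):
    """log p(y | X, hyperparameters) for a zero-mean Gaussian Process."""
    n = len(xs)
    L = cholesky(rbf_kernel(xs, length_scale, noise_var, signal_var))
    # Forward substitution: solve L z = y, so that z.z equals y^T K^-1 y.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)


def grid_search(xs, ys, length_scales, noise_vars):
    """Score every grid combination; return (best score, length_scale, noise_var)."""
    return max(
        (log_marginal_likelihood(xs, ys, ls, nv), ls, nv)
        for ls in length_scales
        for nv in noise_vars
    )


def random_search(xs, ys, n_trials, rng):
    """Draw hyperparameters log-uniformly from assumed bounds and keep the best."""
    best = None
    for _ in range(n_trials):
        ls = math.exp(rng.uniform(math.log(1e-2), math.log(1e1)))
        nv = math.exp(rng.uniform(math.log(1e-3), math.log(1.0)))
        candidate = (log_marginal_likelihood(xs, ys, ls, nv), ls, nv)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical toy data, standing in for a single molecular descriptor.
    xs = [0.5 * i for i in range(12)]
    ys = [math.sin(x) for x in xs]
    print(grid_search(xs, ys, [0.1, 0.3, 1.0, 3.0], [1e-3, 1e-2, 1e-1]))
    print(random_search(xs, ys, 50, random.Random(0)))
```

Grid Search evaluates a fixed lattice of candidate values, whereas Random Search samples the same space stochastically. Both return the candidate with the highest log marginal likelihood, which here stands in for whichever model-quality criterion a real study would use.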

Key Findings: The Hyper-prior Smoothbox kernel resulted in the best models for the majority of data sets, and these models performed significantly better than benchmark quantitative structure-permeability relationship (QSPR) models. When the data sets were systematically reduced in size, the models from the different optimisation methods generally retained their statistical quality, whereas benchmark QSPR models performed poorly.

Conclusions: The design of the data set, and possibly also the approach to validation of the model, is critical in the development of improved models. The size of the data set, if carefully controlled, was not generally a significant factor for these models, and models of excellent statistical quality could be produced from substantially smaller data sets.
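One possible way to shrink a data set while keeping the range of each descriptor is sketched below. The abstract does not specify the reduction scheme used in the study, so this is only an assumed approach: the rows holding the minimum and maximum of every descriptor are always kept, and the remainder is filled by random sampling. The function name and the toy data are hypothetical.

```python
import random


def reduce_preserving_range(rows, target_size, rng):
    """Return sorted indices of a subset that keeps each descriptor's min and max.

    rows: list of equal-length tuples, one tuple of descriptor values per compound.
    """
    n_descriptors = len(rows[0])
    keep = set()
    for d in range(n_descriptors):
        column = [row[d] for row in rows]
        keep.add(column.index(min(column)))  # row holding the lowest value
        keep.add(column.index(max(column)))  # row holding the highest value
    if target_size < len(keep):
        raise ValueError("target_size is too small to retain every descriptor range")
    remaining = [i for i in range(len(rows)) if i not in keep]
    n_extra = min(target_size - len(keep), len(remaining))
    keep.update(rng.sample(remaining, n_extra))
    return sorted(keep)


if __name__ == "__main__":
    # Hypothetical data: 50 compounds described by three descriptors.
    data_rng = random.Random(1)
    rows = [(data_rng.random(), 10 * data_rng.random(), data_rng.gauss(0, 1))
            for _ in range(50)]
    subset = reduce_preserving_range(rows, 15, random.Random(2))
    print(len(subset), subset)
```

Because the extreme rows are fixed before sampling, the reduced set spans the same interval on every descriptor as the full set, which is the property the conclusion relies on when it refers to size being "carefully controlled".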

Source
http://dx.doi.org/10.1111/jphp.12863
