We recently published a high-quality validation set for testing conformer generators, consisting of structures from both the PDB and the CSD (Hawkins, P. C. D. et al. J. Chem. Inf. Model. 2010, 50, 572), and tested the performance of our conformer generator, OMEGA, on these data sets. In the present publication, we focus on understanding the suitability of those data sets for validation and on identifying and learning from OMEGA's failures. We compare, for the first time to our knowledge, the coverage of the applicable property spaces by our validation data sets with that of their parent compound sets, to determine whether our data sets adequately sample these property spaces. We also introduce the concept of torsion fingerprinting and compare this method of dissimilation to the more traditional graph-centric diversification methods we used in our previous publication. To improve our ability to programmatically identify cases where the crystallographic conformation is not well reproduced computationally, we introduce a new metric for comparing conformations, RMSTanimoto. This new metric is used alongside those from our previous publication to efficiently identify reproduction failures. We find RMSTanimoto to be particularly effective in identifying failures for the smallest molecules in our data sets. Analysis of the nature of these failures, particularly those for the CSD, sheds further light on the issue of strain in crystallographic structures. Some of the residual failure cases, those not resolved by simple changes in OMEGA's defaults, present significant challenges to conformer generation engines like OMEGA and suggest new avenues for improving their performance, while others illustrate the pitfalls of validating against crystallographic ligand conformations, particularly those from the PDB.
DOI: http://dx.doi.org/10.1021/ci300314k
Sci Data
January 2025
Department of Ophthalmology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
This study presents TOM500, a comprehensive multi-organ annotated orbital magnetic resonance imaging (MRI) dataset. It includes clinical data, T2-weighted MRI scans, and corresponding segmentations from 500 patients with thyroid eye disease (TED) during their initial visit. TED is a common autoimmune disorder with distinct orbital MRI features.
Sci Data
January 2025
DiSTAR, University of Naples "Federico II", 80126, via Vicinale Cupa Cintia 26, Naples, Italy.
We present a new database, EutherianCoP, of fossil mammals which lived globally from the Late Pleistocene to the Holocene. The database includes 13,972 fossil occurrences of 786 extant or recently extinct placental mammal species, plus 155,198 current occurrences for those of them which survived to the present. The occurrences are correlated with radiometric age information.
Acta Crystallogr F Struct Biol Commun
February 2025
Institute for Biochemistry and Biology, University of Potsdam, Am Neuen Palais 10, 14469 Potsdam, Germany.
Screening of cryo-EM samples is essential for the generation of high-resolution cryo-EM structures. Often, it is cumbersome to correlate the appearance of specific grid squares and micrograph quality. Here, CryoCrane (Correlate atlas and exposures), a visualization tool for cryo-EM screening data, is presented.
Sleep Breath
January 2025
Department of Respiratory and Critical Care Medicine, Medical School of Nantong University, Nantong Key Laboratory of Respiratory Medicine, Affiliated Hospital of Nantong University, Nantong, 226001, China.
Background: The pathophysiology of obstructive sleep apnea (OSA) and diabetes mellitus (DM) is still unknown, despite clinical reports linking the two conditions. After investigating potential roles for DM-related genes in the pathophysiology of OSA, our goal is to investigate the molecular significance of the condition. Machine learning is a useful approach to understanding complex gene expression data to find biomarkers for the diagnosis of OSA.
J Pers Soc Psychol
January 2025
Department of Psychology, University of Zurich.
Self-esteem and depressive symptoms are important predictors of a range of societally relevant outcomes and are theorized to influence each other reciprocally over time. However, existing research offers only a limited understanding of how their dynamics unfold across different timescales. Using three data sets with different temporal resolutions, we aimed to advance our understanding of the temporal unfolding of the reciprocal dynamics between self-esteem and depressive symptoms.