Descriptor generation from Morgan fingerprint using persistent homology.

SAR QSAR Environ Res

Research Division of Polymer Functional Materials, Osaka Research Institute of Industrial Science and Technology, Izumi, Osaka, Japan.

Published: January 2024

In cheminformatics, molecular fingerprints (FPs) are used in various tasks such as regression and classification. However, predictive models often underutilize Morgan FPs for regression and related machine learning tasks. This study introduced descriptors derived from reshaped Morgan FPs using persistent homology to improve predictive accuracy. On the solvation free energy (FreeSolv) and water solubility (ESOL) datasets, persistent homology enhanced predictive accuracy compared with using Morgan FPs alone. Notably, generating descriptors from the first-order persistence diagram (PD1) yielded larger improvements than generating them from the zeroth-order persistence diagram (PD0). Combining 4096-bit Morgan FPs with PD1-generated descriptors increased the average coefficient of determination in Gaussian process regression from 0.597 to 0.667 for FreeSolv and from 0.629 to 0.654 for ESOL. Adjusting the grid size parameter during PD-based descriptor generation is crucial, because finer grids, especially with PD0, generate more descriptors but reduce predictive accuracy. Coarsening the grid or applying principal component analysis (PCA) mitigates overfitting and enhances accuracy. When descriptors were generated from Morgan FPs with randomly shuffled bit positions, coarsening the grid and/or applying PCA achieved accuracy improvements similar to those obtained with the persistent homology of the original Morgan FPs.
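
The abstract does not spell out how descriptors are built from a persistence diagram, so the following is only a minimal sketch of one plausible scheme, assuming each (birth, death) pair is counted in a cell of a square grid. The function name `grid_descriptor`, the parameters `n_bins` and `max_value`, and the toy diagram are all hypothetical and are not taken from the paper. It illustrates why the grid size matters: the descriptor length grows as the square of the number of bins per axis, so a finer grid produces many more, mostly empty, descriptors.

```python
from typing import List, Sequence, Tuple


def grid_descriptor(
    diagram: Sequence[Tuple[float, float]],
    n_bins: int,
    max_value: float,
) -> List[int]:
    """Count (birth, death) pairs per cell of an n_bins x n_bins grid.

    Assumed scheme for illustration only; the paper's actual
    descriptor construction may differ.
    """
    if n_bins <= 0 or max_value <= 0:
        raise ValueError("n_bins and max_value must be positive")
    cell = max_value / n_bins
    counts = [0] * (n_bins * n_bins)
    for birth, death in diagram:
        # Clamp indices so values on or beyond the edges land in a border cell.
        i = max(0, min(int(birth / cell), n_bins - 1))
        j = max(0, min(int(death / cell), n_bins - 1))
        counts[i * n_bins + j] += 1
    return counts


# Toy persistence diagram (made-up values, not data from the study).
toy_pd1 = [(0.0, 1.0), (1.0, 3.0), (2.0, 3.5), (0.5, 4.0)]

coarse = grid_descriptor(toy_pd1, n_bins=2, max_value=4.0)  # 4 descriptors
fine = grid_descriptor(toy_pd1, n_bins=8, max_value=4.0)    # 64 descriptors

print(len(coarse), len(fine))
```

In this toy case the same four points are spread over 4 descriptors on the coarse grid and 64 on the fine grid, which is consistent with the abstract's observation that finer grids yield more descriptors and can encourage overfitting.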

Source
http://dx.doi.org/10.1080/1062936X.2023.2301327
