Chemistrees: Data-Driven Identification of Reaction Pathways via Machine Learning.

J Chem Theory Comput

Department of Informatics, UiO, Gaustadalléen 23B, 0373 Oslo, Norway.

Published: October 2021

We propose to analyze molecular dynamics (MD) output via a supervised machine learning (ML) algorithm, the decision tree. The approach aims to identify the predominant geometric features that correlate with trajectories transitioning between two arbitrarily defined states. Because it is data-driven, the algorithm aims to identify these features without the bias of human "chemical intuition". We demonstrate the method by analyzing the proton exchange reactions in formic acid solvated in small water clusters. The simulations were performed with MD combined with path sampling, a method for efficiently sampling rare events. Our ML analysis identified relevant geometric variables involved in the proton transfer reaction and how they may change as the number of solvating water molecules changes.
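As a rough illustration of the idea, and not the authors' implementation, the sketch below finds the single best split of one decision-tree node using Gini impurity on synthetic data. The feature names `d_OH` and `d_OO`, the numeric ranges, and the toy labeling rule are all hypothetical assumptions chosen only to show how a tree can single out the geometric variable that separates reactive from non-reactive trajectories.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels, names):
    """Return (feature name, threshold, weighted impurity) of the best single split."""
    n = len(rows)
    best = (None, None, gini(labels))
    for j, name in enumerate(names):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = 0.5 * (lo + hi)
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (name, t, score)
    return best


# Synthetic stand-in for per-trajectory geometric descriptors (hypothetical).
rng = random.Random(0)
names = ["d_OH", "d_OO"]  # hypothetical distances, arbitrary units
rows, labels = [], []
for _ in range(200):
    d_oh = rng.uniform(0.9, 1.8)
    d_oo = rng.uniform(2.3, 3.2)
    rows.append((d_oh, d_oo))
    # Toy rule (assumption): a trajectory is "reactive" when d_OH exceeds 1.3.
    labels.append(1 if d_oh > 1.3 else 0)

feature, threshold, impurity = best_split(rows, labels, names)
print(feature, round(threshold, 2), round(impurity, 3))
```

A full decision tree applies this split search recursively to each resulting subset; in practice a library implementation would be used on descriptors computed from the sampled paths.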


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8515787
DOI: http://dx.doi.org/10.1021/acs.jctc.1c00458

