Descriptor-augmented machine learning for enzyme-chemical interaction predictions.

Synth Syst Biotechnol

Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China.

Published: June 2024

Descriptors play a pivotal role in enzyme design for the greener synthesis of biochemicals, as they can characterize enzymes and chemicals from physicochemical and evolutionary perspectives. This study examined the effects of various descriptors on the performance of a Random Forest model used for predicting enzyme-chemical relationships. We curated activity data for seven specific enzyme families from the literature and developed a pipeline for evaluating machine learning model performance using 10-fold cross-validation. The influence of protein and chemical descriptors was assessed in three scenarios: predicting the activity of unknown relations between known enzymes and known chemicals (new relationship evaluation), predicting the activity of novel enzymes on known chemicals (new enzyme evaluation), and predicting the activity of new chemicals on known enzymes (new chemical evaluation). The results showed that protein descriptors significantly enhanced the classification performance of the model in the new enzyme evaluation in the three of the seven datasets that had the greatest number of enzymes, whereas chemical descriptors appeared to have no effect. A variety of sequence-based and structure-based protein descriptors were constructed, among which the ESM-2 descriptor achieved the best results. Using enzyme families as labels showed that the descriptors could cluster proteins well, which could explain their contribution to the machine learning model. Conversely, in the new chemical evaluation, chemical descriptors yielded significant improvement in four out of the seven datasets, while protein descriptors appeared to have no effect. We attempted to evaluate the generalization ability of the model by correlating the statistics of the datasets with the performance of the models. The results showed that datasets with higher sequence similarity were more likely to achieve better results in the new enzyme evaluation, and datasets with more enzymes were more likely to benefit from the protein descriptor strategy. This work provides guidance for the development of machine learning models for specific enzyme families.
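The three evaluation scenarios differ only in how enzyme-chemical pairs are assigned to cross-validation folds. The following is a minimal sketch of that idea, not the authors' pipeline: the function names (`group_key`, `make_folds`, `train_test_splits`), the round-robin fold assignment, and the representation of the data as `(enzyme, chemical)` tuples are all assumptions made for illustration. Activity labels, descriptors, and the Random Forest model itself are omitted.

```python
import random


def group_key(index, pair, scenario):
    """Return the unit that must stay together in one fold (hypothetical helper)."""
    enzyme, chemical = pair
    if scenario == "relationship":
        return index      # every pair is its own unit
    if scenario == "enzyme":
        return enzyme     # all pairs of one enzyme share a fold
    if scenario == "chemical":
        return chemical   # all pairs of one chemical share a fold
    raise ValueError(f"unknown scenario: {scenario}")


def make_folds(pairs, scenario, k=10, seed=0):
    """Assign pair indices to k folds so that no unit is split across folds."""
    keys = [group_key(i, p, scenario) for i, p in enumerate(pairs)]
    units = list(dict.fromkeys(keys))  # unique units, in first-seen order
    random.Random(seed).shuffle(units)
    # Round-robin assignment of shuffled units to folds (an assumed choice).
    fold_of = {unit: j % k for j, unit in enumerate(units)}
    folds = [[] for _ in range(k)]
    for i, key in enumerate(keys):
        folds[fold_of[key]].append(i)
    return folds


def train_test_splits(pairs, scenario, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    for test_idx in make_folds(pairs, scenario, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(pairs)) if i not in held_out]
        yield train_idx, test_idx
```

With the "enzyme" scenario, every enzyme in a test fold is absent from the corresponding training set, so the model is scored on enzymes it has never seen; "chemical" does the same for chemicals. With "relationship", only the specific pair is held out, and both its enzyme and its chemical may still appear in training through other pairs.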

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10915406
DOI: http://dx.doi.org/10.1016/j.synbio.2024.02.006
