Enzymes are molecular machines optimized by nature to allow otherwise impossible chemical processes to occur. Their design is a challenging task due to the complexity of the protein space and the intricate relationships between sequence, structure, and function. Recently, large language models (LLMs) have emerged as powerful tools for modeling and analyzing biological sequences, but their application to protein design is limited by the high cardinality of the protein space. This study introduces a framework that combines LLMs with genetic algorithms (GAs) to optimize enzymes. LLMs are trained on a large dataset of protein sequences to learn relationships between amino acid residues linked to structure and function. GAs then leverage this knowledge to search efficiently for sequences with improved catalytic performance. We focused on two optimization tasks: improving the feasibility of biochemical reactions and increasing their turnover rate. Systematic evaluations on 105 biocatalytic reactions demonstrated that the LLM-GA framework generated mutants outperforming the wild-type enzymes in terms of feasibility in 90% of the instances. Further in-depth evaluation of seven reactions revealed the power of this methodology to get "the best of both worlds" and create mutants with structural features and flexibility comparable with the wild types. Our approach advances the state of the art in the computational design of biocatalysts, ultimately opening opportunities for more sustainable chemical processes.
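The search loop described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over amino acid sequences. This is not the authors' implementation: the function names, the parameter defaults, and especially `toy_fitness` (a hypothetical stand-in for an LLM-derived score of feasibility or turnover rate) are assumptions made purely for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_fitness(sequence, target):
    """Hypothetical scorer: fraction of positions matching a made-up target.

    A real pipeline would call a learned model here instead.
    """
    return sum(a == b for a, b in zip(sequence, target)) / len(target)


def mutate(sequence, rate, rng):
    """Replace each residue with a random amino acid with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else residue
        for residue in sequence
    )


def crossover(parent_a, parent_b, rng):
    """Single-point crossover between two equal-length sequences."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def evolve(wild_type, fitness, generations=50, population_size=40,
           elite=4, rate=0.05, seed=0):
    """Evolve mutants of `wild_type`, keeping the `elite` best each generation."""
    rng = random.Random(seed)
    # Seed the population with the wild type plus random point mutants of it.
    population = [wild_type] + [
        mutate(wild_type, rate, rng) for _ in range(population_size - 1)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: population_size // 2]  # top half may reproduce
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rate, rng)
            for _ in range(population_size - elite)
        ]
        # Elitism: the best sequences survive unchanged, so the top score
        # never decreases from one generation to the next.
        population = ranked[:elite] + children
    return max(population, key=fitness)
```

A call such as `evolve("MKTAYIAKQR", lambda s: toy_fitness(s, "MKTWYIAKQL"))` returns a sequence of the same length that scores at least as well as the starting one under the toy scorer. Both sequences in that call are invented for the example.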
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11711099 | PMC |
http://dx.doi.org/10.1093/bib/bbae675 | DOI Listing |
Medicine (Baltimore)
January 2025
Department of Otolaryngology, Hangzhou Red Cross Hospital (Zhejiang Hospital of Integrated Traditional Chinese and Western Medicine), Hangzhou, Zhejiang, China.
T-helper 17 (Th17) cells significantly influence the onset and advancement of malignancies. This study focused on delineating molecular classifications and developing a prognostic signature grounded in Th17 cell differentiation-related genes (TCDRGs) using machine learning algorithms in head and neck squamous cell carcinoma (HNSCC). A consensus clustering approach was applied to The Cancer Genome Atlas-HNSCC cohort based on TCDRGs, followed by an examination of differential gene expression using the limma package.
PLoS One
January 2025
Faculty of Engineering, Free University of Bozen-Bolzano, Bolzano, South Tyrol, Italy.
Appraisal models, such as Scherer's Component Process Model (CPM), represent an elegant framework for the interpretation of emotion processes, advocating for computational models that capture emotion dynamics. Today's emotion recognition research, however, typically classifies discrete qualities or categorised dimensions, neglecting the dynamic nature of emotional processes and thus limiting interpretability based on appraisal theory. In our research, we estimate emotion intensity from multiple physiological features associated with the CPM's neurophysiological component using dynamical models, with the aim of bringing insights into the relationship between physiological dynamics and perceived emotion intensity.
Hypertension is a critical risk factor and cause of mortality in cardiovascular diseases, and it remains a global public health issue. Therefore, understanding its mechanisms is essential for treating and preventing hypertension. Gene expression data is an important source for obtaining hypertension biomarkers.
Immun Inflamm Dis
January 2025
Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China.
Background: Long COVID, a heterogeneous condition characterized by a range of physical and neuropsychiatric presentations, can present in a proportion of COVID-19-infected individuals.
Methods: Transcriptomic data sets containing gene expression profiles of COVID-19, long COVID, and healthy controls were retrieved from the GEO database. Differentially expressed genes (DEGs) in COVID-19 and long COVID were identified with R packages, and module detection was performed in parallel with the Modular Pharmacology Platform (http://112.
J Clin Hypertens (Greenwich)
January 2025
College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fuzhou, Fujian, China.
Preeclampsia (PE) is a pregnancy-specific disorder with a poorly understood pathogenesis that poses a great threat to maternal and fetal safety. Cuproptosis, a novel form of cellular death, has been implicated in the advancement of various diseases. However, the role of cuproptosis and immune-related genes in PE is unclear.