Large language models have greatly enhanced our ability to understand biology and chemistry, yet robust methods for structure-based drug discovery, quantum chemistry and structural biology remain scarce. Such models urgently need precise biomolecule-ligand interaction datasets. To address this need, we present MISATO, a dataset that combines quantum mechanical properties of small molecules with associated molecular dynamics simulations of ~20,000 experimental protein-ligand complexes, with extensive validation of experimental data. Starting from the existing experimental structures, we used semi-empirical quantum mechanics to refine them systematically. The dataset includes a large collection of molecular dynamics traces of protein-ligand complexes in explicit water, accumulating over 170 μs. We give examples of machine learning (ML) baseline models that show improved accuracy when they employ our data. An easy entry point for ML experts is provided to enable the next generation of drug discovery artificial intelligence models.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11136668
DOI: http://dx.doi.org/10.1038/s43588-024-00627-2

Similar Publications

Protein catalysis and allostery require the atomic-level orchestration and motion of residues and ligand, solvent and protein effector molecules. However, the ability to design protein activity through precise protein-solvent cooperative interactions has not yet been demonstrated. Here we report the design of 14 membrane receptors that catalyse G protein nucleotide exchange through diverse engineered allosteric pathways mediated by cooperative networks of intraprotein, protein-ligand and -solvent molecule interactions.

Deep learning methods for proteome-scale interaction prediction.

Curr Opin Struct Biol

January 2025

Department of Biological Sciences, Seoul National University, Seoul 08826, Republic of Korea.

Proteome-scale interaction prediction is essential for understanding protein functions and disease mechanisms. Traditional experimental methods are often limited by scale and complexity, driving the need for computational approaches. Deep learning has emerged as a powerful tool, enabling high-throughput, accurate predictions of protein interactions.

Background: Cadaverine and hydrocinnamic acid are frequent metabolites in inflamed periodontal areas. Their role as metabolites in plant growth inhibition has been established, but their relevance in humans has yet to be determined. Moreover, vascular endothelial growth factor (VEGF) is a consistent growth factor in neo-angiogenesis in periodontal regeneration.

Accurate prediction of ligand-receptor binding affinity is crucial in structure-based drug design, significantly impacting the development of effective drugs. Recent advances in machine learning (ML)-based scoring functions have improved these predictions, yet challenges remain in modeling complex molecular interactions. This study introduces the AGL-EAT-Score, a scoring function that integrates extended atom-type multiscale weighted colored subgraphs with algebraic graph theory.

The COVID-19 pandemic caused by SARS-CoV-2 continues to pose a major challenge to global health. Targeting the main protease of the virus (Mpro), which is essential for viral replication and transcription, offers a promising approach for therapeutic intervention. In this study, advanced computational techniques such as molecular docking and molecular dynamics simulations were used to screen a series of antiviral compounds for their potential inhibitory effect on the SARS-CoV-2 Mpro.

