In medicinal chemistry, the primary objective is to rapidly optimize many chemical properties of a set of compounds to yield a candidate ready for clinical trials. In recent years, two computational techniques, machine learning (ML) and physics-based methods, have advanced substantially and are now frequently used by medicinal chemists to improve the efficiency of both hit optimization and candidate design. Each comes with its own limitations, and the two are often used independently of each other. ML can screen extensive compound libraries quickly, but it depends on high-quality data, which can be scarce, especially during early-stage optimization. Conversely, physics-based approaches such as free energy perturbation (FEP) can make highly accurate binding affinity predictions, but they are constrained by comparatively low throughput and high cost. In this study, we used FEP to overcome data paucity in ML by generating virtual activity data sets that then inform the training of ML algorithms. We show that ML algorithms trained on an FEP-augmented data set could achieve predictive accuracy comparable to that of algorithms trained on experimental data from biological assays. Throughout the paper, we emphasize key mechanistic considerations that must be taken into account when augmenting data sets, and we lay the groundwork for successful implementation. Ultimately, the study advocates combining physics-based methods and ML to expedite lead optimization. We believe that physics-based augmentation of ML will significantly benefit drug discovery as these techniques continue to evolve.
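The augmentation idea described above can be illustrated with a minimal sketch. It is not the study's pipeline: the one-dimensional `descriptor`, the linear model, the `fep_weight` down-weighting of FEP-derived labels, and all function names are hypothetical choices made only to show how measured and FEP-predicted activities might be merged into one training set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    descriptor: float  # hypothetical 1-D molecular descriptor
    activity: float    # measured or FEP-predicted activity (arbitrary units)
    weight: float      # trust placed in this label during fitting


def augment(experimental, fep_predicted, fep_weight=0.5):
    """Merge measured and FEP-predicted labels, keyed by compound ID.

    Measured labels get weight 1.0 and take precedence when a compound
    appears in both sources. FEP labels get `fep_weight`, an assumed
    down-weighting that is not taken from the paper.
    """
    merged = {cid: Sample(x, y, 1.0) for cid, (x, y) in experimental.items()}
    for cid, (x, y) in fep_predicted.items():
        merged.setdefault(cid, Sample(x, y, fep_weight))
    return list(merged.values())


def fit_weighted_line(samples):
    """Weighted least squares for activity = slope * descriptor + intercept."""
    total = sum(s.weight for s in samples)
    mean_x = sum(s.weight * s.descriptor for s in samples) / total
    mean_y = sum(s.weight * s.activity for s in samples) / total
    sxx = sum(s.weight * (s.descriptor - mean_x) ** 2 for s in samples)
    sxy = sum(
        s.weight * (s.descriptor - mean_x) * (s.activity - mean_y)
        for s in samples
    )
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    experimental = {"cpd1": (1.0, 2.0), "cpd2": (2.0, 4.0)}
    fep_predicted = {"cpd2": (2.0, 9.9), "cpd3": (3.0, 6.0)}
    training_set = augment(experimental, fep_predicted)
    print(fit_weighted_line(training_set))
```

In practice the linear fit would be replaced by whatever ML model is in use, and the relative weight of virtual data would need to be chosen with the accuracy of the FEP predictions in mind.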
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11094716 | PMC |
http://dx.doi.org/10.1021/acs.jcim.4c00071 | DOI Listing |
PLoS Comput Biol
December 2024
School of Biological Sciences (SBS), Nanyang Technological University, Singapore, Singapore.
The 3D structure of RNA critically influences its functionality, and understanding this structure is vital for deciphering RNA biology. Experimental methods for determining RNA structures are labour-intensive, expensive, and time-consuming. Computational approaches have emerged as valuable tools, leveraging physics-based principles and machine learning to predict RNA structures rapidly.
PLoS One
January 2025
Geosciences Department, King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, KSA.
Estimating seismic anisotropy parameters, such as Thomsen's parameters, is crucial for investigating fractured and finely layered geological media. However, many inversion methods rely on complex physical models with initial assumptions, leading to non-reproducible estimates and subjective fracture interpretation. To address these limitations, this study utilizes machine learning methods: support vector regression, extreme gradient boosting, a multi-layer perceptron, and a convolutional neural network.
Langmuir
January 2025
Materials Science and Engineering, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States.
The functional performance of a particulate thin film depends greatly on the particle distribution that forms during drying. In situ methods for monitoring the impact of different processing parameters on the distribution of particles currently require expensive and specialized equipment. This work addresses this gap by miniaturizing a geophysical prospecting method to thin-film applications.
Wiley Interdiscip Rev Comput Stat
May 2024
Department of Mathematics and Statistics, University of Central Oklahoma.
The discrete empirical interpolation method (DEIM) is well-established as a means of performing model order reduction in approximating solutions to differential equations, but it has also more recently demonstrated potential in performing data class detection through subset selection. Leveraging the singular value decomposition for dimension reduction, DEIM uses interpolatory projection to identify the representative rows and/or columns of a data matrix. This approach has been adapted to develop additional algorithms, including a CUR matrix factorization for performing dimension reduction while preserving the interpretability of the data.
Proteins
December 2024
Bijvoet Centre for Biomolecular Research, Faculty of Science-Chemistry, Utrecht University, Utrecht, The Netherlands.
The HADDOCK team participated in CAPRI rounds 47-55 as a server, manual predictor, and scorer. Throughout these CAPRI rounds, we used a plethora of computational strategies to predict the structure of protein complexes. Of the 10 targets comprising 24 interfaces, we achieved acceptable or better models for 3 targets in the human category and 1 in the server category.