Matcher: An Open-Source Application for Translating Large Structure/Property Data Sets into Insights for Drug Design.

J Chem Inf Model

R&D Informatics Solutions, MSD Czech Republic s.r.o., Prague 150 00, Czech Republic.

Published: April 2023

To solve recurring problems in drug discovery, matched molecular pair (MMP) analysis is used to understand relationships between chemical structure and function. For the MMP analysis of large data sets (>10,000 compounds), available tools lack flexible search and visualization functionality and require computational expertise. Here, we present Matcher, an open-source application for MMP analysis, with novel search algorithms and fully automated querying-to-visualization that requires no programming expertise. Matcher enables unprecedented control over the search and clustering of MMP transformations based on both variable fragment and constant environment structure, which is critical for separating data that are relevant to a given problem from data that are not. Users can exert such control through a built-in chemical sketcher and with a few mouse clicks can navigate between resulting MMP transformations, statistics, property distribution graphs, and structures with raw experimental data, for confident and accelerated decision making. Matcher can be used with any collection of structure/property data; here, we demonstrate usage with a public ChEMBL data set of about 20,000 small molecules with CYP3A4 and/or hERG inhibition data. Users can reproduce all examples demonstrated herein via unique links within Matcher's interface, a functionality that anyone can use to preserve and share their own analyses. Matcher and all its dependencies are open-source, can be used for free, and are available with containerized deployment from code at https://github.com/Merck/Matcher. Matcher makes large structure/property data sets more transparent than ever before and accelerates the data-driven solution of common problems in drug discovery.
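The general idea behind MMP analysis can be illustrated with a minimal sketch. It is not Matcher's search algorithm, and it assumes the compounds have already been fragmented into a constant part and a variable part. The records, fragment strings, property values, and function names (`find_pairs`, `summarize`) are hypothetical and invented for illustration; they are not taken from ChEMBL or from the Matcher code base.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical, pre-fragmented records:
# (compound_id, constant_fragment, variable_fragment, property_value).
# All values are invented for illustration only.
RECORDS = [
    ("cpd1", "[*:1]c1ccccc1", "[*:1]Cl", 6.2),
    ("cpd2", "[*:1]c1ccccc1", "[*:1]F", 5.7),
    ("cpd3", "[*:1]c1ccncc1", "[*:1]Cl", 6.8),
    ("cpd4", "[*:1]c1ccncc1", "[*:1]F", 6.1),
    ("cpd5", "[*:1]c1ccncc1", "[*:1]OC", 5.0),
]


def find_pairs(records):
    """Pair compounds that share a constant fragment but differ in the variable one."""
    by_constant = defaultdict(list)
    for cid, constant, variable, value in records:
        by_constant[constant].append((cid, variable, value))

    pairs = []
    for constant, members in by_constant.items():
        # Sort by variable fragment so every transformation has one canonical direction.
        members = sorted(members, key=lambda m: m[1])
        for (id_a, var_a, val_a), (id_b, var_b, val_b) in combinations(members, 2):
            if var_a == var_b:
                continue  # identical variable part: not a transformation
            pairs.append({
                "constant": constant,
                "from": var_a,
                "to": var_b,
                "ids": (id_a, id_b),
                "delta": val_b - val_a,
            })
    return pairs


def summarize(pairs):
    """Aggregate property changes per transformation: (pair count, mean delta)."""
    deltas = defaultdict(list)
    for pair in pairs:
        deltas[(pair["from"], pair["to"])].append(pair["delta"])
    return {transform: (len(d), mean(d)) for transform, d in deltas.items()}


if __name__ == "__main__":
    for transform, (count, avg) in summarize(find_pairs(RECORDS)).items():
        print(transform, count, round(avg, 2))
```

Grouping by the constant fragment is what makes the pairs "matched": each property difference can be attributed to a single, well-defined structural change. Restricting or clustering by the constant environment, as the abstract describes, would correspond to filtering `pairs` on the `constant` field before summarizing.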


Source
http://dx.doi.org/10.1021/acs.jcim.3c00015


Similar Publications

Article Synopsis
  • Nano-self-assembly of natural organic matter (NOM) plays a critical role in affecting both NOM and pollutant dynamics in complex environments, highlighting the need for advanced analysis methods.
  • Machine learning (ML) is proposed as a valuable tool for interpreting NOM self-assembly processes by utilizing big data to explore structure-property relationships and environmental impacts.
  • The review emphasizes the importance of developing new ML algorithms and frameworks to address challenges in data interpretation, while also proposing an integrated research approach that combines ML, experiments, and theoretical models for better understanding NOM-related environmental issues.

Clustering Cu-S based compounds using periodic table representation and compositional Wasserstein distance.

Sci Rep

December 2024

Key Laboratory of Computing Power Network and Information Security, Shandong Computer Science Center (National Supercomputing Center in Jinan), Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250013, Shandong, P. R. China.

Crystal structure similarity is useful for the chemical analysis of today's large materials databases and for data mining new materials. Here we propose to use the two-dimensional Wasserstein distance (earth mover's distance) to measure the compositional similarity between different compounds, based on the periodic table representation of compositions. To demonstrate the effectiveness of our approach, 1586 Cu-S based compounds are taken from the Inorganic Crystal Structure Database (ICSD) to form a validation dataset.


Automatic Prediction of Molecular Properties Using Substructure Vector Embeddings within a Feature Selection Workflow.

J Chem Inf Model

December 2024

Cavendish Laboratory, Department of Physics, University of Cambridge, J. J. Thomson Avenue, Cambridge CB3 0HE, U.K.

Machine learning (ML) methods provide a pathway to accurately predict molecular properties, leveraging patterns derived from structure-property relationships within materials databases. This approach holds significant importance in drug discovery and materials design, where the rapid, efficient screening of molecules can accelerate the development of new pharmaceuticals and chemical materials for highly specialized target application. Unsupervised and self-supervised learning methods applied to graph-based or geometric models have garnered considerable traction.


A computational framework to optimize the mechanical behavior of synthetic vascular grafts.

J Mech Behav Biomed Mater

December 2024

Department of Biomedical Engineering, The University of Utah, 36 S Wasatch Dr, Salt Lake City, UT, 84112, USA; Department of Biomedical Engineering, Texas A&M University, 101 Bizzell St, College Station, TX, 77843, USA; Scientific Computing and Imaging Institute, The University of Utah, 72 Central Campus Dr, Salt Lake City, UT, 84112, USA; School of Engineering Medicine, Texas A&M University, 1020 Holcombe Blvd., Houston, TX, 77030, USA; Department of Multidisciplinary Engineering, Texas A&M University, 101 Bizzell St, College Station, TX, 77843, USA; Department of Cardiovascular Sciences, Houston Methodist Academic Institute, 6565 Fannin Street, Houston, TX, 77030, USA. Electronic address:

The failure of synthetic small-diameter vascular grafts has been attributed to a mismatch in the compliance between the graft and native artery, driving mechanisms that promote thrombosis and neointimal hyperplasia. Additionally, the buckling of grafts results in large deformations that can lead to device failure. Although design features can be added to lessen the buckling potential (e.


Machine Learning in Solid-State Hydrogen Storage Materials: Challenges and Perspectives.

Adv Mater

December 2024

State Key Laboratory of Silicon and Advanced Semiconductor Materials, School of Materials Science and Engineering, Zhejiang University, Hangzhou, Zhejiang, 310058, China.

Machine learning (ML) has emerged as a pioneering tool in advancing the research application of high-performance solid-state hydrogen storage materials (HSMs). This review summarizes the state-of-the-art research of ML in resolving crucial issues such as low hydrogen storage capacity and unfavorable de-/hydrogenation cycling conditions. First, the datasets, feature descriptors, and prevalent ML models tailored for HSMs are described.

