With the consolidation of deep learning in drug discovery, several novel algorithms for learning molecular representations have been proposed. Despite the community's interest in developing new methods for learning molecular embeddings, and despite their theoretical benefits, comparing molecular embeddings with each other and with traditional representations is not straightforward, which in turn hinders the choice of a suitable representation for Quantitative Structure-Activity Relationship (QSAR) modeling. One reason is the difficulty of conducting a fair and thorough comparison of the existing embedding approaches, which requires numerous experiments on various datasets and training scenarios. To close this gap, we reviewed the literature on methods for molecular embeddings and reproduced three unsupervised and two supervised molecular embedding techniques recently proposed in the literature. We compared these five methods with respect to their performance in QSAR scenarios using different classification and regression datasets. We also compared these representations to traditional molecular representations, namely molecular descriptors and fingerprints. Contrary to expectations, our experimental setup, comprising over 25,000 trained models and statistical tests, revealed that the predictive performance using molecular embeddings did not significantly surpass that of traditional representations. Although supervised embeddings yielded results competitive with those of traditional molecular representations, unsupervised embeddings tended to perform worse than traditional representations. Our results highlight the need for a careful comparison and analysis of the different embedding techniques before using them in drug design tasks, and they motivate a discussion about the potential of molecular embeddings in computer-aided drug design.
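
The abstract does not say which statistical tests were applied. As one illustration of how per-dataset scores obtained with two representations might be compared, the following is a minimal sketch that assumes an exact paired sign-flip permutation test. The function name and the scores are hypothetical placeholders, not the authors' method or results.

```python
from itertools import product
from statistics import mean


def paired_permutation_test(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired scores.

    scores_a[i] and scores_b[i] are assumed to be the scores of two
    representations on the same dataset i. Returns the p-value for the
    null hypothesis that the mean paired difference is zero.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(mean(diffs))
    as_extreme = 0
    total = 0
    # Enumerate every assignment of signs to the paired differences.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(mean(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point rounding.
        if flipped >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


if __name__ == "__main__":
    # Placeholder scores for illustration only (not taken from the paper).
    fingerprint_scores = [0.9, 0.8, 0.7, 0.6]
    embedding_scores = [0.5, 0.5, 0.5, 0.5]
    print(paired_permutation_test(fingerprint_scores, embedding_scores))
```

The exact enumeration grows as 2^n in the number of paired datasets, so this form is only practical for a small number of datasets; a sampled permutation test or a rank-based test would be the usual substitute for larger comparisons.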

Source
http://dx.doi.org/10.1093/bib/bbab365

Publication Analysis

Top Keywords

molecular embeddings: 24
molecular representations: 12
traditional representations: 12
molecular: 11
qsar modeling: 8
learning molecular: 8
embedding techniques: 8
traditional molecular: 8
drug design: 8
representations: 7

Similar Publications

High-quality RNA is crucial in clinical diagnostics and precision medicine. Formalin-fixed and paraffin-embedded (FFPE) tissues pose a challenge due to nucleic acid fragmentation and crosslinking. In this pilot study, various commercially available techniques for extracting RNA from small FFPE samples were compared.

Nanotechnological methods for creating multifunctional fabrics are attracting global interest. Incorporating nanoparticles into textiles enables the creation of multifunctional textiles with UV protection, antimicrobial and self-cleaning properties, and photocatalytic activity. Nanomaterial-loaded textiles have many innovative applications in pharmaceuticals, sports, the military, the textile industry, etc.

A hitchhiker's guide to deep chemical language processing for bioactivity prediction.

Digit Discov

December 2024

Eindhoven University of Technology, Institute for Complex Molecular Systems, Eindhoven AI Systems Institute, Dept. Biomedical Engineering, Eindhoven, Netherlands

Deep learning has significantly accelerated drug discovery, with 'chemical language' processing (CLP) emerging as a prominent approach. CLP approaches learn from molecular string representations (e.g., Simplified Molecular Input Line Entry Systems [SMILES] and Self-Referencing Embedded Strings [SELFIES]) with methods akin to natural language processing. Despite their growing importance, training predictive CLP models is far from trivial, as it involves many 'bells and whistles'.

Insights into the interaction between hemorphins and δ-opioid receptor from molecular modeling.

Front Mol Biosci

December 2024

Department of Biology, College of Science, United Arab Emirates University, Al Ain, United Arab Emirates.

Hemorphins are short atypical opioid peptide fragments embedded in the β-chain of hemoglobin. They have received considerable attention recently due to their interaction with opioid receptors. The affinity of hemorphins for the μ-opioid receptor (MOR), δ-opioid receptor (DOR), and κ-opioid receptor (KOR) has been well established.

Identification of key genes related to growth of largemouth bass based on comprehensive transcriptome analysis.

Front Mol Biosci

December 2024

State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-products, Institute of Hydrobiology, Zhejiang Academy of Agricultural Sciences, Hangzhou, China.

Introduction: Largemouth bass is an economically important farmed freshwater fish species that has delicious meat, no intermuscular thorns, and rapid growth rates. However, the molecular regulatory mechanisms underlying the different growth and developmental stages of this fish have not been reported.

Methods: In this study, we performed histological and transcriptomic analyses on the brain and dorsal muscles of largemouth bass at different growth periods.
