Deep learning has dramatically advanced molecular property prediction, a task crucial to drug discovery, using representations such as fingerprints, SMILES, and graphs. In particular, SMILES is used in various deep learning models via character-based approaches. However, SMILES strings alone do not readily reflect chemical properties. In this paper, we propose a new self-supervised method that learns SMILES and the chemical context of molecules simultaneously while pre-training a Transformer. The key to our model is learning structure through adjacency matrix embedding and learning logic that can infer descriptors through Quantitative Estimation of Drug-likeness (QED) prediction during pre-training. As a result, our method improves generalization and achieves the best average performance on benchmark downstream tasks. Moreover, we develop a web-based fine-tuning service so that our model can be applied to various tasks.
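To make the two inputs named in the abstract concrete, the following is a minimal sketch of character-level SMILES tokenization and of deriving an atom adjacency matrix from the same string. It is an illustration under stated assumptions, not the authors' implementation: the function names `tokenize` and `adjacency_matrix` are hypothetical, the parser covers only a small SMILES subset (atoms, branches, single-digit ring closures), and a real pipeline would more likely rely on a cheminformatics toolkit.

```python
import re

# Assumption: bracket atoms and the two-letter halogens are single tokens;
# every other character is its own token.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|.")
BOND_SYMBOLS = {"-", "=", "#", ":", "/", "\\"}


def tokenize(smiles):
    """Split a SMILES string into character-level tokens."""
    return SMILES_TOKEN.findall(smiles)


def adjacency_matrix(smiles):
    """Build an unweighted atom adjacency matrix from a simple SMILES string.

    Handles atoms, branches "(...)", single-digit ring closures, and bond
    symbols (bond order is ignored). This is not a full SMILES parser.
    """
    bonds = []         # pairs of bonded atom indices
    branch_stack = []  # atom to return to when a branch closes
    open_rings = {}    # ring-closure digit -> atom index that opened it
    prev = None        # index of the most recent atom on the current chain
    n_atoms = 0
    for tok in tokenize(smiles):
        if tok == "(":
            branch_stack.append(prev)
        elif tok == ")":
            prev = branch_stack.pop()
        elif tok.isdigit():
            if tok in open_rings:
                bonds.append((open_rings.pop(tok), prev))
            else:
                open_rings[tok] = prev
        elif tok in BOND_SYMBOLS:
            continue  # bond order is not tracked in this sketch
        elif tok == ".":
            prev = None  # start of a disconnected fragment
        else:
            # Any remaining token is treated as an atom.
            if prev is not None:
                bonds.append((prev, n_atoms))
            prev = n_atoms
            n_atoms += 1
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        matrix[i][j] = matrix[j][i] = 1
    return matrix


if __name__ == "__main__":
    print(tokenize("CC(=O)O"))          # acetic acid, heavy atoms only
    for row in adjacency_matrix("CC(=O)O"):
        print(row)
```

In a setup like the one the abstract describes, the token sequence would feed the Transformer's input embedding while the matrix supplies the structural signal. How the two are combined is specific to the paper and is not shown here.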

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8155205
DOI: http://dx.doi.org/10.1038/s41598-021-90259-7

Publication Analysis

Top Keywords

deep learning: 8
learning: 5
merged molecular: 4
molecular representation: 4
representation learning: 4
learning molecular: 4
molecular properties: 4
properties prediction: 4
prediction web-based: 4
web-based service: 4

Similar Publications

A multicenter study of neurofibromatosis type 1 utilizing deep learning for whole body tumor identification.

NPJ Digit Med

January 2025

Neurofibromatosis Type 1 Center and Laboratory for Neurofibromatosis Type 1 Research, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200011, China.

Deep-learning models have shown promise in differentiating between benign and malignant lesions. Previous studies have primarily focused on specific anatomical regions, overlooking tumors occurring throughout the body with highly heterogeneous whole-body backgrounds. Using neurofibromatosis type 1 (NF1) as an example, this study developed highly accurate MRI-based deep-learning models for the early automated screening of malignant peripheral nerve sheath tumors (MPNSTs) against a complex whole-body background.

We aimed to build a robust classifier for the MGMT methylation status of glioblastoma in multiparametric MRI, focusing on multi-habitat deep image descriptors. A subset of the BRATS 2021 MGMT methylation dataset containing both MGMT class labels and segmentation masks was used.

Exploring the potential of advanced artificial intelligence technology in predicting microsatellite instability (MSI) and Ki-67 expression of endometrial cancer (EC) is highly significant. This study aimed to develop a novel hybrid radiomics approach integrating multiparametric magnetic resonance imaging (MRI), deep learning, and multichannel image analysis for predicting MSI and Ki-67 status. A retrospective study included 156 EC patients who were subsequently categorized into MSI and Ki-67 groups.

To address the limitations of the flipped classroom in personalized teaching and interactivity, this paper designs a new Virtual Reality (VR)-based flipped classroom model for colleges and universities that incorporates the Contrastive Language-Image Pre-Training (CLIP) algorithm. Through cross-modal data fusion, the model closely links students' operation behavior with teaching content and improves teaching effectiveness through an intelligent feedback mechanism. The test data shows that the similarity between video and image modes reaches 0.

Patients with High-Grade Serous Ovarian Cancer (HGSOC) exhibit varied responses to treatment, with 20-30% showing de novo resistance to platinum-based chemotherapy. While hematoxylin-eosin (H&E)-stained pathological slides are used for routine diagnosis of cancer type, they may also contain diagnostically useful information about treatment response. Our study demonstrates that combining H&E-stained whole slide images (WSIs) with proteomic signatures using a multimodal deep learning framework significantly improves the prediction of platinum response in both discovery and validation cohorts.
