This work proposes data augmentation by molecular rotation, on the premise that protein-ligand binding events are rotation-variant. As a proof of concept, known active (i.e., 1-labeled) ligands of human β-secretase 1 (BACE-1) are rotated to generate 0-labeled data, and the rotation-dependent prediction accuracy of a 3D graph convolutional network (3DGCN) is investigated after data augmentation. The data augmentation significantly improves the orientation-recognizing ability of 3DGCN in the classification task for BACE-1/ligand binding. Furthermore, through this improved orientation recognition, the data-augmented 3DGCN is able to predict active ligands from a candidate dataset, which could be applied to virtual drug screening and discovery.
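
The rotation-based labeling described above can be illustrated with a minimal sketch. It assumes a ligand pose is given as an (N, 3) array of atomic coordinates and that rotated copies are produced by random rigid rotations about the ligand centroid. The function names, the use of random rotations, and the toy coordinates are illustrative assumptions; the abstract does not state which rotation axes or angles the authors actually used.

```python
import numpy as np


def random_rotation_matrix(rng):
    """Return a random proper rotation matrix (orthogonal, determinant +1)."""
    # QR decomposition of a Gaussian matrix yields a random orthogonal matrix.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # remove the sign ambiguity of the QR factors
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q


def augment_by_rotation(coords, n_rotations, rng):
    """Keep the original pose with label 1 and add rotated copies with label 0.

    Assumption: every rotated copy counts as a non-binding orientation.
    A random rotation can land close to the original pose, so a real
    pipeline would likely enforce a minimum rotation angle.
    """
    center = coords.mean(axis=0)
    samples = [(coords, 1)]
    for _ in range(n_rotations):
        rot = random_rotation_matrix(rng)
        rotated = (coords - center) @ rot.T + center  # rotate about the centroid
        samples.append((rotated, 0))
    return samples


# Toy example: four hypothetical atoms, not a real BACE-1 ligand.
rng = np.random.default_rng(0)
ligand = np.array([[0.0, 0.0, 0.0],
                   [1.5, 0.0, 0.0],
                   [1.5, 1.2, 0.0],
                   [0.0, 1.2, 0.9]])
samples = augment_by_rotation(ligand, n_rotations=3, rng=rng)
labels = [label for _, label in samples]
```

Because each copy is a rigid rotation, interatomic distances are unchanged and only the orientation relative to the (fixed) protein differs, which is the property a rotation-variant model is expected to pick up on.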

Source
http://dx.doi.org/10.1002/asia.202100789
