Interpretability and reliability of deep learning models are important for computer-based drug discovery. Aiming to understand feature perception by such a model, we investigate a graph neural network for affinity prediction of protein-ligand complexes. We assess a latent representation of ligand binding sites and investigate underlying geometric structure in this latent space and its relation to protein function. We introduce an automated computational pipeline for dimensionality reduction, clustering, hypothesis testing, and visualization of latent space. The results indicate that the learned protein latent space is inherently structured and not randomly distributed. Several of the identified protein binding site clusters in latent space correspond to functional protein families. Ligand size was found to be a determinant of cluster geometry. The computational pipeline proved applicable to latent space analysis and interpretation and can be adapted to work for different datasets and deep learning models.
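The pipeline steps named above (dimensionality reduction, clustering, hypothesis testing) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which algorithms they used, so PCA, k-means, and a permutation test on cluster purity are stand-ins chosen here for illustration. The latent vectors and family labels are synthetic, and every function name is hypothetical.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def purity(clusters, families):
    """Fraction of points that belong to the majority family of their cluster."""
    total = 0
    for c in np.unique(clusters):
        _, counts = np.unique(families[clusters == c], return_counts=True)
        total += counts.max()
    return total / len(clusters)


def permutation_pvalue(clusters, families, n_perm=1000, seed=0):
    """Test whether cluster purity exceeds what shuffled family labels give."""
    rng = np.random.default_rng(seed)
    observed = purity(clusters, families)
    null = np.array([
        purity(clusters, rng.permutation(families)) for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


# Synthetic stand-in for binding-site latent vectors (assumption: 16-D,
# three "protein families" of 40 sites each). Real data would come from
# the trained graph neural network.
rng = np.random.default_rng(42)
families = np.repeat(np.arange(3), 40)
family_means = rng.normal(scale=5.0, size=(3, 16))
latent = family_means[families] + rng.normal(size=(120, 16))

embedding = pca_project(latent, n_components=2)
clusters = kmeans(embedding, k=3)
observed, p_value = permutation_pvalue(clusters, families)
print(f"cluster purity = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

A small p-value here would indicate that the clusters align with family labels more strongly than expected by chance, which is the kind of evidence behind the statement that the latent space is structured rather than randomly distributed.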
Download full-text PDF | Source |
---|---|
http://dx.doi.org/10.1002/minf.202400205 | DOI Listing |
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11733832 | PMC |
Sci Rep
January 2025
College of Computer Science and Technology, China University of Petroleum (East China), No.66 Changjiang West Road, Huangdao, Qingdao, 266580, Shandong, China.
To address inadequate information exchange among subsequences in the operational time series of water injection pumps, which leads to low accuracy and high false alarm rates in anomaly detection, this paper proposes a multidimensional time series anomaly detection method for water injection pump operations. The method combines a Long Short-Term Memory Autoencoder augmented with an Attention Mechanism (LSTMA-AE) with mechanistic constraints. The LSTMA-AE framework comprises three primary modules: a Time Feature Extraction Module (Encoder), an Attention Layer, and a Data Reconstruction Module (Decoder). The Encoder captures temporal dependencies and features within the input sequences, mapping the input data into a higher-dimensional space.
Sci Rep
January 2025
Department of Basic Sciences, Faculty of Dentistry, Universidad de Antioquia U de A, Medellín, 050010, Colombia.
The NLRP3 inflammasome, regulated by TLR4, plays a pivotal role in periodontitis by mediating inflammatory cytokine release and bone loss induced by Porphyromonas gingivalis. Periodontal disease creates a hypoxic environment, favoring anaerobic bacteria survival and exacerbating inflammation. The NLRP3 inflammasome triggers pyroptosis, a form of programmed cell death that amplifies inflammation and tissue damage.
Brief Bioinform
November 2024
School of Computer Science and Technology, Harbin Institute of Technology, HIT Campus, Shenzhen University Town, Nanshan District, Shenzhen 518055, Guangdong, China.
Antimicrobial peptides (AMPs) have emerged as a promising class of therapeutic compounds that exhibit broad-spectrum antimicrobial activity with high specificity and good tolerability. Natural AMPs usually need further rational design to improve antimicrobial activity and decrease toxicity to human cells. Although several algorithms have been developed to optimize AMPs for desired properties, they explore variations of AMPs in a discrete amino acid sequence space and usually suffer from low efficiency, a lack of diversity, and convergence to local optima.
Brief Bioinform
November 2024
Department of Electronic Engineering, Tsinghua University, 100084 Beijing, China.
Single-cell multi-omics techniques, which enable the simultaneous measurement of multiple modalities such as RNA gene expression and Assay for Transposase-Accessible Chromatin (ATAC) within individual cells, have become a powerful tool for deciphering the intricate complexity of cellular systems. Most current methods rely on motif databases to establish cross-modality relationships between genes from RNA-seq data and peaks from ATAC-seq data. However, these approaches are constrained by incomplete database coverage, particularly for novel or poorly characterized relationships.
NPJ Digit Med
January 2025
Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA.
The 12-lead electrocardiogram (ECG) is inexpensive and widely available. Whether conditions across the human disease landscape can be detected using the ECG is unclear. We developed a deep learning denoising autoencoder and systematically evaluated associations between ECG encodings and ~1,600 Phecode-based diseases in three datasets separate from model development, and meta-analyzed the results.