PGAT-ABPp: harnessing protein language models and graph attention networks for antibacterial peptide identification with remarkable accuracy.

Bioinformatics

Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China.

Published: August 2024

Motivation: The emergence of drug-resistant pathogens represents a formidable challenge to global health. Using computational methods to identify antibacterial peptides (ABPs), an alternative class of antimicrobial agents, has demonstrated advantages for subsequent drug design studies. Most current approaches, however, rely on handcrafted features and underutilize structural information, which may limit prediction performance.

Results: To present an ultra-accurate model for ABP identification, we propose a novel deep learning approach, PGAT-ABPp. PGAT-ABPp uses structures predicted by AlphaFold2 together with a pretrained protein language model, ProtT5-XL-U50 (ProtT5), to construct graphs. A graph attention network (GAT) is then used to learn global discriminative features from these graphs. PGAT-ABPp outperforms fourteen other state-of-the-art models in accuracy, F1-score, and Matthews correlation coefficient on the independent test dataset. The results show that ProtT5 has significant advantages in identifying ABPs and that introducing spatial information further improves the model's prediction performance. An interpretability analysis of key residues in known active ABPs further underscores the superiority of PGAT-ABPp.
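The abstract does not say how the graphs are built or how the GAT is configured, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes residues are nodes, that per-residue language-model embeddings serve as node features (random placeholders of width 1024 stand in for ProtT5 output here), and that edges join residues whose C-alpha atoms lie within a hypothetical 8 Å cutoff. The coordinates are a synthetic helix rather than an AlphaFold2 prediction, all weights are random, and the names `contact_graph` and `gat_layer` are illustrative.

```python
import numpy as np

def contact_graph(ca_coords, cutoff=8.0):
    """Boolean adjacency: residues i and j are linked when their C-alpha
    atoms are within `cutoff` angstroms. The diagonal is True (self-loops)."""
    diff = ca_coords[:, None, :] - ca_coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist <= cutoff

def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention layer in plain NumPy.
    h: (N, F) node features, adj: (N, N) boolean adjacency,
    W: (F, F') projection, a: (2 * F',) attention vector."""
    z = h @ W                               # projected node features (N, F')
    f = z.shape[1]
    src = z @ a[:f]                         # contribution of the target node i
    dst = z @ a[f:]                         # contribution of the neighbour j
    e = src[:, None] + dst[None, :]         # e[i, j] = a . [z_i || z_j]
    e = np.where(e > 0, e, slope * e)       # LeakyReLU
    e = np.where(adj, e, -np.inf)           # attend only to graph neighbours
    e = e - e.max(axis=1, keepdims=True)    # numerically stable softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha                 # updated features, attention weights

rng = np.random.default_rng(0)
n_res, emb_dim, hid_dim = 12, 1024, 16

# Synthetic helix-like C-alpha trace (placeholder for a predicted structure).
t = np.arange(n_res)
angle = np.deg2rad(100.0) * t
ca = np.stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * t], axis=1)

# Random placeholder for per-residue language-model embeddings.
h = rng.normal(size=(n_res, emb_dim))

adj = contact_graph(ca)
W = rng.normal(scale=0.05, size=(emb_dim, hid_dim))
a = rng.normal(scale=0.05, size=2 * hid_dim)
out, alpha = gat_layer(h, adj, W, a)

# Graph-level readout: mean-pool the nodes, then a logistic output unit.
w_out = rng.normal(scale=0.1, size=hid_dim)
score = 1.0 / (1.0 + np.exp(-(out.mean(axis=0) @ w_out)))
print(f"edges (incl. self-loops): {int(adj.sum())}, untrained score: {score:.3f}")
```

Because every weight is random, the printed score carries no meaning. The sketch only shows how spatial proximity restricts which residues attend to one another, and how the per-residue attention weights in `alpha` could in principle be inspected for the kind of key-residue analysis the abstract mentions.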

Availability and Implementation: The datasets and source code for the PGAT-ABPp model are available at https://github.com/moonseter/PGAT-ABPp/.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11338452
DOI: http://dx.doi.org/10.1093/bioinformatics/btae497
