Unsupervised Machine Learning Organization of the Functional Dark Proteome of Gram-Negative "Superbugs": Six Protein Clusters Amenable for Distinct Scientific Applications.

Carlos Sicilia, Andrés Corral-Lugo, Pawel Smialowski, Michael J McConnell, Antonio J Martín-Galiano

ACS Omega

Intrahospital Infections Laboratory, National Centre for Microbiology, Instituto de Salud Carlos III (ISCIII), Majadahonda, 28220 Madrid, Spain.

Published: December 2022

Uncharacterized proteins remain underexplored as targets for new therapies against difficult-to-treat bacterial infections; this study examines 2819 predicted, uncharacterized proteins from reference strains of multidrug-resistant Gram-negative species.
An unsupervised machine learning algorithm grouped these proteins into six natural clusters using normalized protein length, pI, hydrophobicity, degree of conservation, structural disorder, and %AT of the coding gene; the clusters differed in operon membership, expression, and presence of unknown function domains.
Clusters 1, 3, and 6 lay closest to known antigens, antibiotic targets, and virulence factors, which makes them the most promising starting points for drug and vaccine development.

Uncharacterized proteins have been underutilized as targets for the development of novel therapeutics for difficult-to-treat bacterial infections. To facilitate the exploration of these proteins, 2819 predicted, uncharacterized proteins (19.1% of the total) from reference strains of three multidrug-resistant Gram-negative species were organized using an unsupervised machine learning algorithm. Classification using normalized values for protein length, pI, hydrophobicity, degree of conservation, structural disorder, and %AT of the coding gene rendered six natural clusters. Proteins in different clusters showed distinct trends regarding operon membership, expression, presence of unknown function domains, and interactomic relevance. Clusters 2, 4, and 5 were enriched with highly disordered proteins, nonworkable membrane proteins, and likely spurious proteins, respectively. Clusters 1, 3, and 6 showed closer distances to known antigens, antibiotic targets, and virulence factors. Up to 21.8% of proteins in these clusters were structurally covered by modeling, which allowed assessment of druggability and discontinuous B-cell epitopes. Five proteins (4 in Cluster 1) were potential druggable targets for antibiotherapy. Eighteen proteins (11 in Cluster 6) were strong B-cell and T-cell immunogen candidates for vaccine development. In conclusion, we provide a feature-based schema to fractionate the functional dark proteome of critical pathogens for fundamental and biomedical purposes.
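As a rough illustration of the feature-based classification described above, the following Python sketch computes three of the six features directly from sequence (protein length, mean Kyte-Doolittle hydropathy as a stand-in for hydrophobicity, and %AT of the coding gene), z-score normalizes them, and groups the proteins. It rests on several assumptions: the abstract does not name the clustering algorithm, so plain k-means is used here only as a placeholder; pI, conservation, and structural disorder are omitted because they need external tools; and the sequences, the function names, and the choice of k = 2 for the toy data are hypothetical.

```python
import math
import random

# Kyte-Doolittle hydropathy scale (used here as an assumed hydrophobicity measure).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def protein_features(protein, gene):
    """Return [length, mean hydropathy, %AT of the coding gene]."""
    hydropathy = sum(KD[aa] for aa in protein) / len(protein)
    at_percent = 100.0 * (gene.count("A") + gene.count("T")) / len(gene)
    return [float(len(protein)), hydropathy, at_percent]

def zscore_columns(rows):
    """Normalize each feature column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has sd 0; fall back to 1.0 to avoid division by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iters=50, seed=0):
    """Plain k-means (placeholder algorithm); returns one label per row."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
        new_centers = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append([sum(c) / len(c) for c in zip(*members)]
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

# Hypothetical toy input: (protein sequence, coding DNA) pairs.
toy = [
    ("MLLVIIAAVLLG", "ATGCTGCTGGTGATCATCGCGGCGGTGCTGCTGGGC"),
    ("MKKDEERRSNQK", "ATGAAAAAAGATGAAGAACGTCGTTCTAATCAAAAA"),
    ("MVVLLIIFFAAG", "ATGGTGGTGCTGCTGATCATCTTCTTCGCGGCGGGC"),
    ("MDEKRNQSTEKD", "ATGGATGAAAAACGTAATCAATCTACTGAAAAAGAT"),
]
features = [protein_features(p, g) for p, g in toy]
normalized = zscore_columns(features)
labels = kmeans(normalized, k=2)
print(labels)
```

Normalizing before clustering matters because the raw features live on very different scales (residue counts versus percentages versus hydropathy units), and without it a distance-based method would be dominated by whichever feature has the largest numeric range.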

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9774411	PMC
http://dx.doi.org/10.1021/acsomega.2c04076	DOI Listing
