Molecular structure prediction and homology detection offer promising paths to discovering protein function and evolutionary relationships. However, current approaches lack statistical reliability assurances, limiting their practical utility for selecting proteins for further experimental and in silico characterization. To address this challenge, we introduce a statistically principled approach to protein search that leverages principles from conformal prediction. The framework provides statistical guarantees at a user-specified risk level and yields calibrated probabilities (rather than raw ML scores) for any protein search model. Our method (1) lets users choose among many biologically relevant loss metrics (e.g., false discovery rate) and assigns reliable functional probabilities for annotating genes of unknown function; (2) achieves state-of-the-art performance in enzyme classification without training new models; and (3) robustly and rapidly pre-filters proteins for computationally intensive structural alignment algorithms. Our framework enhances the reliability of protein homology detection and enables the discovery of uncharacterized proteins with likely desirable functional properties.
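The idea of a guarantee at a user-specified risk level can be illustrated with a minimal sketch of threshold calibration in the style of conformal risk control. This is not the authors' implementation, which the abstract does not describe. The function names, the `Hit` type, and the toy scores are hypothetical, and the sketch assumes a labeled calibration set of queries exchangeable with future queries. It uses the false negative rate as the loss because that loss is monotone in the score threshold; the false discovery rate named in the abstract is not generally monotone and would need a different calibration procedure.

```python
from typing import List, Sequence, Tuple

# Hypothetical record: (similarity score from any search model, is a true homolog)
Hit = Tuple[float, bool]


def false_negative_rate(hits: Sequence[Hit], threshold: float) -> float:
    """Fraction of true homologs that a score cutoff would miss for one query."""
    positives = [score for score, is_match in hits if is_match]
    if not positives:
        return 0.0  # nothing to miss
    missed = sum(1 for score in positives if score < threshold)
    return missed / len(positives)


def calibrate_threshold(calibration: Sequence[Sequence[Hit]], alpha: float) -> float:
    """Largest observed cutoff whose inflated calibration risk stays <= alpha.

    The loss is bounded by 1, so the empirical risk over n calibration queries
    is inflated to (n * risk + 1) / (n + 1) before comparing with alpha.
    Returns -inf (retrieve everything) when no cutoff qualifies.
    """
    n = len(calibration)
    candidates = sorted({score for hits in calibration for score, _ in hits})
    best = float("-inf")
    for cutoff in candidates:
        risk = sum(false_negative_rate(hits, cutoff) for hits in calibration) / n
        if (n * risk + 1.0) / (n + 1) <= alpha:
            best = cutoff
        else:
            break  # the loss only grows as the cutoff rises
    return best


def retrieve(hits: Sequence[Hit], threshold: float) -> List[Hit]:
    """Keep the hits whose score reaches the calibrated cutoff."""
    return [hit for hit in hits if hit[0] >= threshold]


if __name__ == "__main__":
    # Toy calibration set: nine identical queries with made-up scores.
    toy_query = [(0.9, True), (0.8, True), (0.3, False), (0.2, False)]
    toy_calibration = [toy_query] * 9
    cutoff = calibrate_threshold(toy_calibration, alpha=0.2)
    print(cutoff, retrieve(toy_query, cutoff))
```

The cutoff depends only on the model's scores and the calibration labels, so a sketch like this can wrap any search model without retraining it, which is the property the abstract emphasizes.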

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11695924
DOI: http://dx.doi.org/10.1038/s41467-024-55676-y
