The sequence-structure-function relationships that ultimately generate the diversity of observed extant proteins are complex, as proteins bridge the gap between multiple informational and physical scales involved in nearly all cellular processes. One limitation of existing protein annotation databases such as UniProt is that fewer than 1% of proteins have experimentally verified functions, so computational methods are needed to fill in the missing information. Here, we demonstrate that a multi-aspect framework based on protein language models can learn sequence-structure-function representations of amino acid sequences and can provide the foundation for sensitive, sequence-structure-function-aware protein sequence search and annotation. Based on this model, we introduce Protein-Vec, a multi-aspect information retrieval system for proteins that covers sequence, structure, and function aspects and enables computational protein annotation and function prediction at tree-of-life scales.
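As a rough illustration of the retrieval idea described above, the sketch below transfers an annotation from the most similar entry in a small embedding database. It is not the Protein-Vec implementation or API: the function names `cosine` and `nearest_annotations`, the toy three-dimensional vectors, and the labels are all hypothetical, and cosine similarity is assumed here only as one common way to compare embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_annotations(query, database, k=1):
    """Return the k (label, score) pairs whose embeddings are most similar to the query."""
    scored = [(label, cosine(query, emb)) for label, emb in database]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy embeddings; a real protein language model would produce
# much higher-dimensional vectors from amino acid sequences.
database = [
    ("kinase", [0.9, 0.1, 0.0]),
    ("protease", [0.1, 0.9, 0.1]),
    ("transporter", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]  # embedding of an unannotated protein (made up)

top = nearest_annotations(query, database, k=1)
print(top[0][0])  # label of the closest annotated neighbor
```

At database sizes approaching tree-of-life scale, the exhaustive scan in `nearest_annotations` would normally be replaced by an approximate nearest-neighbor index; the linear search is kept here only for clarity.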
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10690258 | PMC |
http://dx.doi.org/10.1101/2023.11.26.568742 | DOI Listing |
J Med Chem
January 2025
Cardio-Vascular and Metabolism, Sanofi R&D, 13 quai Jules Guesde, Vitry sur Seine 94400, France.
Peptide , a C18 fatty acid-modified single-chain relaxin analogue, was recently identified as a potent, selective, and long-lasting relaxin family peptide receptor 1 (RXFP1) agonist. Further advanced pharmacokinetic profiling of this compound highlighted elevated levels of oxidative metabolism occurring in dogs and mini pigs but only marginally in rats. This study aimed to design long-lasting relaxin analogues with increased stability against metabolic oxidation while securing subnanomolar RXFP1 potency.
FEBS J
January 2025
Molecular Biology and Genetics Department, Ihsan Dogramaci Bilkent University, Ankara, Turkey.
Transcription, a crucial step in the regulation of gene expression, is tightly controlled and involves several essential processes, such as chromatin organization, recognition of the specific genomic sequences, DNA binding, and ultimately recruiting the transcriptional machinery to facilitate transcript synthesis. At the center of this regulation are transcription factors (TFs), which comprise at least one DNA-binding domain (DBD) and an effector domain (ED). Although the structure and function of DBDs have been well studied, our knowledge of the structure and function of effector domains is limited.
Angew Chem Int Ed Engl
January 2025
Kunming Institute of Botany, Chinese Academy of Sciences, State Key Laboratory of Phytochemistry and Plant Resources in West China, 132 Lanhei Road, 650201, Kunming, China.
The polysaccharide APS-1 II from the medicinal plant Angelica sinensis is an interesting therapeutic agent against leukemia. However, synthesizing the highly branched and complex APS-1 II polysaccharide, which contains multiple 1,2-cis-glycosidic linkages, remains difficult, impeding in-depth structure-activity relationship studies and the development of carbohydrate-based therapeutics against leukemia. Here, we report the first chemical synthesis of the tridecasaccharide repeating unit, together with the shorter 4-mer, 6-mer, and 9-mer sequences, from the APS-1 II polysaccharide via a one-pot orthogonal glycosylation strategy based on glycosyl ortho-(1-phenylvinyl)benzoates, which precluded potential issues such as the aglycone transfer associated with one-pot assembly using thioglycosides.
J Basic Microbiol
January 2025
School of Chemical and Environmental Engineering, China University of Mining and Technology-Beijing, Beijing, China.
Subsidence from coal mining is a major environmental issue, causing significant damage to soil structure. Soil microorganisms, highly sensitive to environmental changes, adapt accordingly. This study focused on four areas of the Burdai coal mine: a non-subsidence area (CK), half-yearly (HY), 1-year (OY), and 2-year (TY) subsidence areas.
Adv Sci (Weinh)
January 2025
Institute for Chemical Research (IIQ), Scientific Research Center "Isla de la Cartuja" (cicCartuja), University of Seville-CSIC, Avda. Americo Vespucio 49, Seville, 41092, Spain.
Gene duplication has allowed protein evolution toward novel functions and mechanisms. The differences between paralogous genes frequently rely on the sequence of disordered regions. For instance, in mammals, the chaperones ANP32A and ANP32B share a common evolutionary line and have some exchangeable functions based on their similar N-terminal domains.