Elucidating how protein sequence determines the properties of disordered proteins and their phase-separated condensates is a major challenge in computational chemistry, biology, and biophysics. Quantitative molecular dynamics simulations and the free energy values derived from them can, in principle, capture how a sequence encodes the chemical and biological properties of a protein. These calculations are, however, computationally demanding, even when the representation is reduced by coarse-graining, and exploring the large space of potentially relevant sequences remains a formidable task. We employ an "active learning" scheme introduced by Yang (2022, https://doi.org/10.1101/2022.08.05.502972) to reduce the number of labelled examples needed from simulations: a neural network-based model suggests the most useful examples for the next training cycle. Applying this Bayesian optimisation framework, we determine properties of protein sequences with coarse-grained molecular dynamics, which enables the network to establish sequence-property relationships for disordered proteins, covering both their self-interactions and their interactions in phase-separated condensates. We show how iterative training with second virial coefficients derived from simulations of disordered protein sequences leads to a rapid improvement in predicting peptide self-interactions. We employ this Bayesian approach to efficiently search for new sequences that bind to condensates of the disordered C-terminal domain (CTD) of RNA Polymerase II, by simulating the molecular recognition of phase-separated condensates by peptides in coarse-grained molecular dynamics. By searching for protein sequences that prefer to self-interact rather than interact with another protein sequence, we are able to shape the morphology of protein condensates and design multiphasic protein condensates.
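The abstract uses second virial coefficients as the training label for peptide self-interaction but does not say how they were extracted from the simulations. As a minimal sketch only, one common route is the standard statistical-mechanics relation B22 = -2π ∫ (exp(-W(r)/kT) - 1) r² dr, applied to a radially averaged potential of mean force W(r). The function name `second_virial_coefficient`, the grid, and the toy hard-sphere and square-well potentials below are assumptions chosen for illustration, not taken from the source.

```python
import numpy as np

def second_virial_coefficient(r, pmf_kT):
    """Estimate B22 = -2*pi * integral of (exp(-W(r)/kT) - 1) * r^2 dr.

    r      : 1-D array of separations in ascending order (e.g. in nm)
    pmf_kT : potential of mean force W(r) in units of kT at each r
             (np.inf is allowed for excluded separations)
    Returns B22 in units of r**3.
    """
    r = np.asarray(r, dtype=float)
    w = np.asarray(pmf_kT, dtype=float)
    mayer = np.exp(-w) - 1.0          # Mayer f-function at each separation
    integrand = mayer * r**2
    # Trapezoid rule written out explicitly over the (possibly uneven) grid.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))
    return -2.0 * np.pi * integral

# Toy check 1 (hypothetical potential): hard spheres of diameter sigma,
# for which the analytic result is 2*pi*sigma**3/3.
sigma = 1.0
r = np.linspace(0.0, 3.0, 30001)
w_hard = np.where(r < sigma, np.inf, 0.0)
b22_hard = second_virial_coefficient(r, w_hard)

# Toy check 2 (hypothetical potential): hard core plus an attractive
# square well of depth 1 kT extending to 1.5*sigma.
w_well = np.where(r < sigma, np.inf, np.where(r < 1.5 * sigma, -1.0, 0.0))
b22_well = second_virial_coefficient(r, w_well)

print("hard sphere B22:", b22_hard)
print("square well B22:", b22_well)
```

With these toy inputs the purely repulsive potential gives a positive B22 and the attractive well gives a negative one, which is the sign convention that makes B22 a convenient scalar label for net self-attraction.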
Source (DOI): http://dx.doi.org/10.1039/d4fd00099d
Sci Adv
December 2024
Department of Biochemistry and Molecular Pharmacology, Baylor College of Medicine, Houston, TX, USA.
Many viral proteins form biomolecular condensates via liquid-liquid phase separation (LLPS) to support viral replication and evade host antiviral responses, and thus, they are potential targets for designing antivirals. In the case of nonenveloped positive-sense RNA viruses, forming such condensates for viral replication is unclear and less understood. Human noroviruses (HuNoVs) are positive-sense RNA viruses that cause epidemic and sporadic gastroenteritis worldwide.
Plant Physiol
December 2024
School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore.
MYB family transcription factors (TFs) play crucial roles in plant development, metabolism, and responses to various stresses. However, whether MYB TFs are involved in regulating fatty acid biosynthesis in seeds remains largely elusive. Here, we demonstrated that transgenic Arabidopsis (Arabidopsis thaliana) plants overexpressing MYB73 exhibit altered FATTY ACID ELONGATION1 (FAE1) expression, seed oil content, and seed fatty acid composition.
Biomacromolecules
December 2024
School of Chemistry and the UNSW RNA Institute, UNSW Sydney, Sydney, NSW 2052, Australia.
Membraneless organelles, often referred to as condensates or coacervates, are liquid-liquid phase-separated systems formed between noncoding RNAs and intrinsically disordered proteins. While the importance of different amino acid residues in short peptide-based condensates has been investigated, the role of the individual nucleobases or the type of heterocyclic structures, the purine vs pyrimidine nucleobases, is less researched. The cell's crowded environment has been mimicked to demonstrate its ability to induce the formation of condensates, but more research in this area is required, especially with respect to RNA-facilitated phase separation and the properties of the crowding agent, poly(ethylene glycol) (PEG).
Adv Biol Regul
November 2024
Department of Biology of the Cell Nucleus, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic.
J Mol Biol
November 2024
Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France.