Protein design involves searching a vast sequence space for sequences that are compatible with a defined structure, which can pose significant computational challenges. Cluster expansion is a technique that can accelerate the evaluation of protein energies by generating a simple functional relationship between sequence and energy. The method consists of several steps. First, for a given protein structure, a training set of sequences with known energies is generated. Next, this training set is used to expand energy as a function of clusters consisting of single residues, residue pairs, and, if required, higher-order terms. The accuracy of the sequence-based expansion is monitored and improved using cross-validation testing and iterative inclusion of additional clusters. As a trade-off for evaluation speed, the cluster-expansion approximation introduces prediction errors, which can be reduced by including more training sequences, including higher-order terms in the expansion, and/or reducing the sequence space described by the cluster expansion. This article analyzes the sources of error and introduces a method whereby accuracy can be improved by judiciously reducing the described sequence space. The method is applied to describe the sequence-stability relationship for several protein structures: coiled-coil dimers and trimers, a PDZ domain, and T4 lysozyme as examples with computationally derived energies, and SH3 domains in amphiphysin-1 and endophilin-1 as examples where the expanded pseudo-energies are obtained from experiments. Our open-source software package Cluster Expansion Version 1.0 allows users to expand their own energy function of interest and thereby apply cluster expansion to custom problems in protein design.
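The steps described in the abstract (generate training sequences with known energies, expand the energy in single-residue and pair clusters, then check the fit by cross-validation) can be illustrated with a minimal sketch. This is not the Cluster Expansion Version 1.0 package or its API. The three-letter alphabet, the four sites, the random `toy_energy` function standing in for a structure-based energy, and all function names are assumptions made for illustration, and ordinary least squares is used as one plausible fitting choice.

```python
import itertools
import numpy as np

ALPHABET = "ALK"   # hypothetical 3-residue alphabet; residue 0 is the reference
N_SITES = 4        # hypothetical number of designable positions

rng = np.random.default_rng(0)
# Stand-in for an expensive structure-based energy: random one-body and
# pairwise tables (an assumption for illustration, not a real force field).
H = rng.normal(size=(N_SITES, len(ALPHABET)))
J = rng.normal(size=(N_SITES, N_SITES, len(ALPHABET), len(ALPHABET)))

def toy_energy(seq):
    """Energy of a sequence given as a tuple of residue indices."""
    e = sum(H[i, a] for i, a in enumerate(seq))
    e += sum(J[i, j, seq[i], seq[j]]
             for i, j in itertools.combinations(range(N_SITES), 2))
    return float(e)

def cluster_features(seq, max_order):
    """Indicator features for clusters of up to max_order sites.

    A feature is 1 when every site in the cluster carries the given
    non-reference residue, else 0. The leading 1 is the constant term.
    """
    feats = [1.0]
    nonref = range(1, len(ALPHABET))
    for order in range(1, max_order + 1):
        for sites in itertools.combinations(range(N_SITES), order):
            for residues in itertools.product(nonref, repeat=order):
                hit = all(seq[s] == r for s, r in zip(sites, residues))
                feats.append(float(hit))
    return feats

def fit(seqs, energies, max_order):
    """Least-squares estimate of the cluster coefficients."""
    X = np.array([cluster_features(s, max_order) for s in seqs])
    coef, *_ = np.linalg.lstsq(X, energies, rcond=None)
    return coef

def predict(seqs, coef, max_order):
    X = np.array([cluster_features(s, max_order) for s in seqs])
    return X @ coef

def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

def cv_rmse(seqs, energies, max_order, k=5, seed=1):
    """k-fold cross-validated RMSE of the truncated expansion."""
    order = np.random.default_rng(seed).permutation(len(seqs))
    sq_errors = []
    for held_out in np.array_split(order, k):
        train = np.setdiff1d(np.arange(len(seqs)), held_out)
        coef = fit([seqs[i] for i in train], energies[train], max_order)
        pred = predict([seqs[i] for i in held_out], coef, max_order)
        sq_errors.extend((pred - energies[held_out]) ** 2)
    return float(np.sqrt(np.mean(sq_errors)))

# Training set: here the full toy sequence space (3**4 = 81 sequences).
all_seqs = list(itertools.product(range(len(ALPHABET)), repeat=N_SITES))
energies = np.array([toy_energy(s) for s in all_seqs])

results = {}
for max_order in (1, 2):
    coef = fit(all_seqs, energies, max_order)
    results[max_order] = (
        len(coef),
        rmse(predict(all_seqs, coef, max_order), energies),
        cv_rmse(all_seqs, energies, max_order),
    )
    print(max_order, results[max_order])
```

Because the toy energy contains only one-body and pairwise contributions, the expansion truncated at pairs can reproduce it essentially exactly when fitted on the full sequence space, while the single-residue expansion leaves a residual. With a real energy function the pair-level fit would generally leave some error as well, which is where cross-validation and the inclusion of further clusters come in.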
DOI: http://dx.doi.org/10.1002/jcc.21585
Nanoscale
January 2025
Photon Science Research Center for Carbon Dioxide, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, China.
Oxygen vacancies (V's) are of paramount importance in influencing the properties and applications of ceria (CeO2). Yet, comprehending the distribution and nature of V's poses a significant challenge due to the vast number of electronic configurations and intricate many-body interactions among V's and polarons (Ce3+ ions). In this study, we established a cluster expansion model based on first-principles calculations and statistical learning to decouple the interactions among the Ce3+ ions and V's, thereby circumventing the limitations associated with sampling electronic configurations.
Sensors (Basel)
December 2024
Department of Applied Physics, National Defense Academy, Hashirimizu 1-10-20, Yokosuka 239-0802, Kanagawa, Japan.
Dielectrophoresis (DEP) cell separation technology is an effective means of separating target cells which are only marginally present in a wide variety of cells. To develop highly efficient cell separation devices, detailed analysis of the nonuniform electric field's intensity distribution within the device is needed, as it affects separation performance. Here we analytically expressed the distributions of the electric field and DEP force in a parallel-plate cell separation DEP device by employing electrostatic analysis through the Fourier series method.
Plants (Basel)
December 2024
Sustainable Perennial Crops Laboratory, United States Department of Agriculture, Agriculture Research Service, Beltsville, MD 2005, USA.
Coffea stenophylla is a rare Coffea species boasting a flavor profile comparable to Arabica coffee (Coffea arabica) and has good adaptability to lowland tropical climates. This species faces increasing threats from climate change, deforestation, and habitat fragmentation in its West African homeland. Using 1037 novel SNP markers derived from Genotyping-by-Sequencing (GBS), we revealed the presence of three distinct natural populations (mean Fst = 0.
J Appl Microbiol
January 2025
School of Computing, Engineering & Physical Sciences, University of the West of Scotland, Paisley PA1 2BE, U.K.
Expansion of the microbial drug discovery pipeline has been impeded by a limited and skewed appreciation of the microbial world and its full chemical capabilities and by an inability to induce silent biosynthetic gene clusters (BGCs). Typically, these silent genes are not expressed under standard laboratory conditions, instead requiring particular interventions to activate them. Genetic, physical, and chemical strategies have been employed to trigger these BGCs, and some have resulted in the induction of novel secondary metabolites.
J Environ Manage
January 2025
College of Forestry and Prataculture, Ningxia University, Yinchuan 750021, China.
The wind-blown sand protection system in the Shapotou section of the Baotou-Lanzhou Railway is a representative artificial ecosystem in a desert region. Over the past 70 years, this system has transformed mobile dunes into fixed dunes through vegetation succession, relying solely on natural rainfall without additional irrigation. However, ecosystem sustainability has been endangered by the emergence of numerous blowouts.