Publications by authors named "Justas Dauparas"

Recent advances in computational methods have led to considerable progress in the design of self-assembling protein nanoparticles. However, nearly all nanoparticles designed to date exhibit strict point group symmetry, with each subunit occupying an identical, symmetrically related environment. This property limits the structural diversity that can be achieved and precludes anisotropic functionalization.

Modeling the conformational heterogeneity of protein-small molecule systems is an outstanding challenge. We reasoned that while residue-level descriptions of biomolecules are efficient for de novo structure prediction, an entirely atomic-level description could have advantages in speed and generality for probing the heterogeneity of interactions with small molecules in the folded state. We developed a graph neural network called ChemNet, trained to recapitulate correct atomic positions from partially corrupted input structures from the Cambridge Structural Database and the Protein Data Bank; the nodes of the graph are the atoms in the system.

Article Synopsis
  • We developed a method to create small proteins that bind strongly to specific molecules, using deep learning to design their shapes from repeating structural units.
  • We tested these designs by docking various small molecules into the binding sites and then experimentally validated which designs had the highest binding affinity.
  • Our successful designs include binders for diverse molecules such as methotrexate and thyroxine; we also used them to build chemically induced dimerization systems and sensitive nanopore sensors that reassemble when a molecule is added.

De novo design of complex protein folds using solely computational means remains a substantial challenge. Here we use a robust deep learning pipeline to design complex folds and soluble analogues of integral membrane proteins. Unique membrane topologies, such as those from G-protein-coupled receptors, are not found in the soluble proteome, and we demonstrate that their structural features can be recapitulated in solution.

The design of protein-protein interfaces using physics-based design methods such as Rosetta requires substantial computational resources and manual refinement by expert structural biologists. Deep learning methods promise to simplify protein-protein interface design and enable its application to a wide variety of problems by researchers from various scientific disciplines. Here, we test the ability of a deep learning method for protein sequence design, ProteinMPNN, to design two-component tetrahedral protein nanomaterials and benchmark its performance against Rosetta.

Article Synopsis
  • Designing complex protein folds using only computation is tough, but researchers have utilized a deep learning pipeline to create soluble versions of integral membrane proteins.
  • They focused on unique structures, particularly from GPCRs, showing that these features can actually work outside of a cell membrane in a soluble form.
  • The results showed that these soluble proteins are not only stable but also maintain their functions, opening up new avenues for drug discovery and expanding the variety of functional protein designs.
Article Synopsis
  • Wooden house frames use simple geometric shapes for construction, while designing protein assemblies is more complex due to their irregular structures.
  • This research introduces extendable protein building blocks that follow specific geometric standards, allowing for modular assembly that can be adjusted in size and shape.
  • The team validates their protein nanomaterial designs through advanced imaging techniques, making it possible to construct large protein assemblies using straightforward architectural blueprints.

Natural proteins are highly optimized for function but are often difficult to produce at a scale suitable for biotechnological applications due to poor expression in heterologous systems, limited solubility, and sensitivity to temperature. Thus, a general method that improves the physical properties of native proteins while maintaining function could have wide utility for protein-based technologies. Here, we show that the deep neural network ProteinMPNN, together with evolutionary and structural information, provides a route to increasing protein expression, stability, and function.

Article Synopsis
  • The approach involves creating customizable binding pockets in pseudocyclic proteins whose size and shape can be adjusted to fit different small-molecule targets with high affinity.
  • The researchers designed protein binders for various molecules, including polar and flexible ones such as methotrexate and thyroxine, achieved strong binding affinities, and demonstrated the use of these designs in low-noise nanopore sensors.

Despite transformative advances in protein design with deep learning, the design of small-molecule-binding proteins and sensors for arbitrary ligands remains a grand challenge. Here we combine deep learning and physics-based methods to generate a family of proteins with diverse and designable pocket geometries, which we employ to computationally design binders for six chemically and structurally distinct small-molecule targets. Biophysical characterization of the designed binders revealed nanomolar to low micromolar binding affinities and atomic-level design accuracy.

Article Synopsis
  • Pseudocyclic proteins, like TIM barrels and β barrels, have a repeating subunit structure that creates a central cavity for binding ligands or facilitating enzymatic activity.
  • A new deep-learning approach was developed to explore a variety of closed repeat proteins based on specific parameters like repeat number and length.
  • Experimental data from diverse pseudocyclic designs agree with the design models, and crystal structures confirm the accuracy of the designs, suggesting potential for developing small-molecule binders and enzymes.
Article Synopsis
  • Proteins can change shapes in response to environmental signals, similar to how transistors manage information flow in computers.
  • Designing proteins with two stable shapes is complex, as it involves creating a specific energy landscape with two low-energy states.
  • The study presents "hinge" proteins that switch between two accurately designed states—one when a ligand is absent and one when it is present—validated through advanced imaging and spectroscopy techniques.
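
The two-state energy landscape described in this synopsis can be illustrated with Boltzmann populations. The sketch below is not taken from the study: the state energies and the size of the ligand-induced shift are hypothetical values chosen only to show how a few kcal/mol can flip which of two states dominates.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def state_populations(energies, temperature=298.15):
    """Boltzmann populations for discrete states with free energies in kcal/mol."""
    beta = 1.0 / (R_KCAL * temperature)
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical energies: without ligand, state X lies 2 kcal/mol below state Y.
apo = state_populations([0.0, 2.0])

# Hypothetical ligand effect: binding stabilizes state Y by 4 kcal/mol.
holo = state_populations([0.0, -2.0])

print("apo  populations (X, Y):", apo)
print("holo populations (X, Y):", holo)
```

Under these assumed numbers, state X dominates without ligand and state Y dominates with it, which is the switching behaviour a two-minimum landscape is meant to encode.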

The design of novel protein-protein interfaces using physics-based design methods such as Rosetta requires substantial computational resources and manual refinement by expert structural biologists. A new generation of deep learning methods promises to simplify protein-protein interface design and enable its application to a wide variety of problems by researchers from various scientific disciplines. Here we test the ability of a deep learning method for protein sequence design, ProteinMPNN, to design two-component tetrahedral protein nanomaterials and benchmark its performance against Rosetta.

Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale. However, the energetics driving folding are invisible in these structures and remain largely unknown. The hidden thermodynamics of folding can drive disease, shape protein evolution and guide protein engineering, and new approaches are needed to reveal these thermodynamics for every sequence and structure.

Article Synopsis
  • This work describes the construction of protein assemblies from extendable building blocks that follow specific geometric rules, similar to how a wooden house frame is built from regular lumber pieces.
  • It highlights the development and validation of various protein designs, from simple shapes to complex nanostructures, using techniques like X-ray crystallography and electron microscopy.
  • This approach allows for the deliberate assembly of large protein structures onto a 3D canvas, overcoming previous challenges related to the irregularity of protein shapes, and enables easier design of protein nanomaterials.

Recently it has become possible to design high-affinity protein-binding proteins de novo from target structural information alone. There is, however, considerable room for improvement, as the overall design success rate is low. Here, we explore the augmentation of energy-based protein binder design using deep learning.

De novo enzyme design has sought to introduce active sites and substrate-binding pockets that are predicted to catalyse a reaction of interest into geometrically compatible native scaffolds, but has been limited by a lack of suitable protein structures and the complexity of native protein sequence-structure relationships. Here we describe a deep-learning-based 'family-wide hallucination' approach that generates large numbers of idealized protein structures containing diverse pocket shapes and designed sequences that encode them. We use these scaffolds to design artificial luciferases that selectively catalyse the oxidative chemiluminescence of the synthetic luciferin substrates diphenylterazine and 2-deoxycoelenterazine.

Peptide-binding proteins play key roles in biology, and predicting their binding specificity is a long-standing challenge. While considerable protein structural information is available, the most successful current methods use sequence information alone, in part because it has been a challenge to model the subtle structural changes accompanying sequence substitutions. Protein structure prediction networks such as AlphaFold model sequence-structure relationships very accurately, and we reasoned that if it were possible to specifically train such networks on binding data, more generalizable models could be created.

Motivation: Multiple sequence alignments (MSAs) of homologous sequences contain information on structural and functional constraints and their evolutionary histories. Despite their importance for many downstream tasks, such as structure prediction, MSA generation is often treated as a separate pre-processing step, without any guidance from the application it will be used for.

Results: Here, we implement a smooth and differentiable version of the Smith-Waterman pairwise alignment algorithm that enables jointly learning an MSA and a downstream machine learning system in an end-to-end fashion.
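
A minimal sketch of the idea behind a smooth Smith-Waterman score follows. It is an illustration under stated assumptions, not the authors' implementation: every max in the classic local-alignment recursion is replaced by a temperature-scaled log-sum-exp, a linear gap penalty is assumed, and the function names and the match/mismatch scores are hypothetical. A real end-to-end system would compute this in an automatic differentiation framework so gradients can flow into the similarity matrix.

```python
import math

def logsumexp(values, temperature):
    """Smooth maximum: T * log(sum(exp(v / T))), computed stably."""
    m = max(values)
    return m + temperature * math.log(
        sum(math.exp((v - m) / temperature) for v in values))

def match_matrix(a, b, match=2.0, mismatch=-1.0):
    """Toy similarity matrix (hypothetical scores) for two strings."""
    return [[match if x == y else mismatch for y in b] for x in a]

def smooth_smith_waterman(scores, gap=1.0, temperature=1.0):
    """Smoothed local-alignment score; a smooth function of `scores`."""
    n, m = len(scores), len(scores[0])
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    cells = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = logsumexp(
                [0.0,                                  # start a new alignment
                 H[i - 1][j - 1] + scores[i - 1][j - 1],  # align i with j
                 H[i - 1][j] - gap,                    # gap in sequence B
                 H[i][j - 1] - gap],                   # gap in sequence A
                temperature)
            cells.append(H[i][j])
    return logsumexp(cells, temperature)

def hard_smith_waterman(scores, gap=1.0):
    """Classic recursion with a true max, for comparison."""
    n, m = len(scores), len(scores[0])
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(0.0, H[i - 1][j - 1] + scores[i - 1][j - 1],
                          H[i - 1][j] - gap, H[i][j - 1] - gap)
            best = max(best, H[i][j])
    return best

scores = match_matrix("GATTACA", "TTAC")
print(hard_smith_waterman(scores))
print(smooth_smith_waterman(scores, temperature=1.0))
print(smooth_smith_waterman(scores, temperature=0.001))
```

The smooth score is always at least the hard score and approaches it as the temperature goes to zero; higher temperatures spread weight over many alternative alignments.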

The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. Here, we describe deep learning approaches for scaffolding such functional sites without needing to prespecify the fold or secondary structure of the scaffold. The first approach, "constrained hallucination," optimizes sequences such that their predicted structures contain the desired functional site.

The established approach to unsupervised protein contact prediction estimates coevolving positions using undirected graphical models. This approach trains a Potts model on a Multiple Sequence Alignment. Increasingly large Transformers are being pretrained on unlabeled, unaligned protein sequence databases and showing competitive performance on protein contact prediction.
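
As a rough illustration of how contacts are commonly read out of a Potts model once it has been fit, the sketch below scores each position pair by the Frobenius norm of its coupling matrix and applies the average product correction (APC). This is a generic post-processing recipe, not code from the paper; the toy couplings, the two-letter alphabet, and all function names are hypothetical, and the fitting step on a real alignment is omitted.

```python
import math

def coupling_norms(J):
    """Frobenius norm of each q x q coupling matrix J[i][j], for i != j."""
    L = len(J)
    S = [[0.0] * L for _ in range(L)]
    for i in range(L):
        for j in range(L):
            if i != j:
                S[i][j] = math.sqrt(sum(v * v for row in J[i][j] for v in row))
    return S

def apc(S):
    """Average product correction: S_ij - mean_i * mean_j / mean_all."""
    L = len(S)
    row_mean = [sum(S[i][j] for j in range(L) if j != i) / (L - 1)
                for i in range(L)]
    total_mean = sum(row_mean) / L
    return [[S[i][j] - row_mean[i] * row_mean[j] / total_mean if i != j else 0.0
             for j in range(L)] for i in range(L)]

def toy_couplings(L, strong_pair, strong=1.0, weak=0.1):
    """Hypothetical couplings: one strongly coupled pair, weak noise elsewhere."""
    J = [[None] * L for _ in range(L)]
    for i in range(L):
        for j in range(L):
            if i == j:
                continue
            w = strong if {i, j} == set(strong_pair) else weak
            J[i][j] = [[w, -w], [-w, w]]
    return J

J = toy_couplings(4, (0, 3))
corrected = apc(coupling_norms(J))

# Rank position pairs by corrected score; the top pair is the predicted contact.
ranked = sorted(((corrected[i][j], (i, j))
                 for i in range(4) for j in range(i + 1, 4)), reverse=True)
top_pair = ranked[0][1]
print(top_pair, ranked[0][0])
```

The correction subtracts the background expected from each position's average coupling strength, so a pair stands out only if it is coupled more strongly than its two positions are on average.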

The trRosetta structure prediction method employs deep learning to generate predicted residue-residue distance and orientation distributions from which 3D models are built. We sought to improve the method by incorporating as inputs (in addition to sequence information) both language model embeddings and template information weighted by sequence similarity to the target. We also developed a refinement pipeline, guided by the DeepAccNet accuracy predictor, that recombines models generated by template-free and template-utilizing versions of trRosetta.

DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid solution of challenging X-ray crystallography and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure.

We develop a deep learning framework (DeepAccNet) that estimates per-residue accuracy and residue-residue distance signed error in protein models and uses these predictions to guide Rosetta protein structure refinement. The network uses 3D convolutions to evaluate local atomic environments followed by 2D convolutions to provide their global contexts and outperforms other methods that similarly predict the accuracy of protein structure models. Overall accuracy predictions for X-ray and cryoEM structures in the PDB correlate with their resolution, and the network should be broadly useful for assessing the accuracy of both predicted structure models and experimentally determined structures and identifying specific regions likely to be in error.

Motile subpopulations in microbial communities are believed to be important for dispersal, quest for food, and material transport. Here, we show that motile cells in sessile colonies of peritrichously flagellated bacteria can self-organize into two adjacent, centimeter-scale motile rings surrounding the entire colony. The motile rings arise from spontaneous segregation of a homogeneous swimmer suspension that mimics a phase separation; the process is mediated by intercellular interactions and shear-induced depletion.
