Next-generation sequencing technology has created many new opportunities for clinical diagnostics, but it faces the challenge of functional annotation of identified mutations. Various algorithms have been developed to predict the impact of missense variants that influence oncogenic drivers. However, computational pipelines that handle biological data must integrate multiple software tools, which can add complexity and hinder non-specialist users from accessing the pipeline.
Supervised machine learning (ML) is used extensively in biology and deserves closer scrutiny. The Data Optimization Model Evaluation (DOME) recommendations aim to enhance the validation and reproducibility of ML research by establishing standards for key aspects such as data handling and processing, optimization, evaluation, and model interpretability. The recommendations help to ensure that key details are reported transparently by providing a structured set of questions.
Systems biology aims to understand living organisms through mathematically modeling their behaviors at different organizational levels, ranging from molecules to populations. Modeling involves several steps, from determining the model purpose to developing the mathematical model, implementing it computationally, simulating the model's behavior, evaluating, and refining the model. Importantly, model simulation results must be reproducible, ensuring that other researchers can obtain the same results after writing the code de novo and/or using different software tools.
Enzymes offer a more environmentally friendly and low-impact solution to conventional chemistry, but they often require additional engineering for their application in industrial settings, an endeavour that is challenging and laborious. To address this issue, the power of machine learning can be harnessed to produce predictive models that enable the in silico study and engineering of improved enzymatic properties. Such machine learning models, however, require the conversion of the complex biological information to a numerical input, also called protein representations.
Summary: Protein design requires information about how mutations affect protein stability. Many web-based predictors are available for this purpose, yet comparing them or using them en masse is difficult. Here, we present BenchStab, a console tool/Python package for easy and quick execution of 19 predictors and result collection on a list of mutants.
Every year, more than 19 million cancer cases are diagnosed, and this number continues to increase annually. Since standard treatment options have varying success rates for different types of cancer, understanding the biology of an individual's tumour becomes crucial, especially for cases that are difficult to treat. Personalised high-throughput profiling, using next-generation sequencing, allows for a comprehensive examination of biopsy specimens.
ACS Omega
June 2024
Glycosylation represents a major chemical challenge; while it is one of the most common reactions in Nature, conventional chemistry struggles with stereochemistry, regioselectivity, and solubility issues. In contrast, family 1 glycosyltransferase (GT1) enzymes can glycosylate virtually any given nucleophilic group with perfect control over stereochemistry and regioselectivity. However, the appropriate catalyst for a given reaction needs to be identified among the tens of thousands of available sequences.
Computational study of the effect of drug candidates on intrinsically disordered biomolecules is challenging due to their vast and complex conformational space. Here, we developed a comparative Markov state analysis (CoVAMPnet) framework to quantify changes in the conformational distribution and dynamics of a disordered biomolecule in the presence and absence of small organic drug candidate molecules. First, molecular dynamics trajectories are generated using enhanced sampling, in the presence and absence of small molecule drug candidates, and ensembles of soft Markov state models (MSMs) are learned for each system using unsupervised machine learning.
PredictONCO 1.0 is a unique web server that analyzes effects of mutations on proteins frequently altered in various cancer types. The server can assess the impact of mutations on the protein sequential and structural properties and apply a virtual screening to identify potential inhibitors that could be used as a highly individualized therapeutic approach, possibly based on drug repurposing.
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas.
Nowadays, the vastly increasing demand for novel biotechnological products is supported by the continuous development of biocatalytic applications that provide sustainable green alternatives to chemical processes. The success of a biocatalytic application is critically dependent on how quickly we can identify and characterize enzyme variants fitting the conditions of industrial processes. While miniaturization and parallelization have dramatically increased the throughput of next-generation sequencing systems, the subsequent characterization of the obtained candidates is still a limiting process in identifying the desired biocatalysts.
Comput Struct Biotechnol J
November 2022
Protein solubility is an attractive engineering target primarily due to its relation to yields in protein production and manufacturing. Moreover, better knowledge of the mutational effects on protein solubility could connect several serious human diseases with protein aggregation. However, we have limited understanding of the protein structural determinants of solubility, and the available data have mostly been scattered in the literature.
The importance of the quantitative description of protein unfolding and aggregation for the rational design of stability or understanding the molecular basis of protein misfolding diseases is well established. Protein thermostability is typically assessed by calorimetric or spectroscopic techniques that monitor different complementary signals during unfolding. The CalFitter webserver has already proved integral to deriving invaluable energy parameters by global data analysis.
Therapeutic enzymes are valuable biopharmaceuticals in various biomedical applications. They have been successfully applied for fibrinolysis, cancer treatment, enzyme replacement therapies, and the treatment of rare diseases. Still, there is a permanent demand to find new or better therapeutic enzymes, which would be sufficiently soluble, stable, and active to meet specific medical needs.
Substrate inhibition is the most common deviation from Michaelis-Menten kinetics, occurring in approximately 25% of known enzymes. It is generally attributed to the formation of an unproductive enzyme-substrate complex after the simultaneous binding of two or more substrate molecules to the active site. Here, we show that a single point mutation (L177W) in the haloalkane dehalogenase LinB causes strong substrate inhibition.
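For orientation, the deviation described above can be illustrated with the standard textbook rate law for substrate inhibition, v = Vmax[S] / (Km + [S] + [S]²/Ki), in which a second substrate molecule binds to form an unproductive complex. The sketch below is a minimal illustration under that assumption; it is not necessarily the kinetic model fitted in the LinB study, and the function names and parameter values are hypothetical.

```python
import math


def michaelis_menten(s, vmax, km):
    """Classical Michaelis-Menten rate: rises monotonically towards vmax."""
    return vmax * s / (km + s)


def substrate_inhibition(s, vmax, km, ki):
    """Textbook substrate-inhibition rate law (assumed model, not from the source).

    The extra s**2 / ki term accounts for a second substrate molecule
    binding to give an unproductive complex, so the rate falls at high s.
    """
    return vmax * s / (km + s + s * s / ki)


def optimal_substrate(km, ki):
    """Substrate concentration giving the maximum rate: sqrt(km * ki)."""
    return math.sqrt(km * ki)


if __name__ == "__main__":
    # Illustrative parameters only (arbitrary units), not measured values.
    vmax, km, ki = 1.0, 1.0, 4.0
    for s in (0.5, 1.0, 2.0, 4.0, 8.0, 16.0):
        v_mm = michaelis_menten(s, vmax, km)
        v_si = substrate_inhibition(s, vmax, km, ki)
        print(f"[S]={s:5.1f}  MM={v_mm:.3f}  inhibited={v_si:.3f}")
    print("rate peaks at [S] =", optimal_substrate(km, ki))
```

With these illustrative parameters the Michaelis-Menten curve keeps rising, while the inhibited curve peaks at [S] = sqrt(Km·Ki) and then declines, which is the signature of substrate inhibition in a rate-versus-concentration plot.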
The majority of naturally occurring proteins have evolved to function under mild conditions inside living organisms. One of the critical obstacles for the use of proteins in biotechnological applications is their insufficient stability at elevated temperatures or in the presence of salts. Since experimental screening for stabilizing mutations is typically laborious and expensive, in silico predictors are often used for narrowing down the mutational landscape.
The rapid accumulation of sequence data and powerful protein engineering techniques providing large mutant libraries have greatly heightened interest in efficient methods for biochemical characterization of proteins. Herein is reported a continuous assay for screening of enzymatic activity. The assay is developed and tested with the model enzymes haloalkane dehalogenases and relies upon a fluorescent change of a derivative of 8-hydroxypyrene-1,3,6-trisulphonic acid due to the pH drop associated with the dehalogenation reactions.
Analytical devices that combine a sensitive biological component with a physicochemical detector hold great potential for various applications, e.g., environmental monitoring, food analysis or medical diagnostics.
Despite significant advances in the understanding of protein structure-function relationships, revealing protein folding pathways still poses a challenge due to a limited number of relevant experimental tools. Widely used experimental techniques, such as calorimetry or spectroscopy, critically depend on proper data analysis. Currently, there are only separate data analysis tools available for each type of experiment with a limited model selection.
Fibroblast growth factors (FGFs) serve numerous regulatory functions in complex organisms, and their corresponding therapeutic potential is of growing interest to academics and industrial researchers alike. However, applications of these proteins are limited due to their low stability. Here we tackle this problem using a generalizable computer-assisted protein engineering strategy to create a unique modified FGF2 with nine mutations displaying unprecedented stability and uncompromised biological function.
Studies of protein unfolding mechanisms are critical for understanding protein functions inside cells, de novo protein design as well as defining the role of protein misfolding in neurodegenerative disorders. Calorimetry has proven indispensable in this regard for recording full energetic profiles of protein unfolding and permitting data fitting based on unfolding pathway models. While both kinetic and thermodynamic protein stability are analysed by varying scan rates and reheating, the latter is rarely used in curve-fitting, leading to a significant loss of information from experiments.