Recent thermodynamic and functional studies have been conducted to evaluate the impact of amino acid substitutions on Calmodulin (CaM). The Critical Assessment of Genome Interpretation (CAGI) data provider at University of Verona (Italy) measured the melting temperature (T) and the percentage of unfolding (%unfold) of a set of CaM variants (CaM challenge dataset). Thermodynamic measurements for the equilibrium unfolding of CaM were obtained by monitoring far-UV Circular Dichroism as a function of temperature.
View Article and Find Full Text PDFAntibodies are proteins produced by the immune system that can identify and neutralise a wide variety of antigens with high specificity and affinity, and constitute the most successful class of biotherapeutics. With the advent of next-generation sequencing, billions of antibody sequences have been collected in recent years, though their application in the design of better therapeutics has been constrained by the sheer volume and complexity of the data. To address this challenge, we present IgBert and IgT5, the best performing antibody-specific language models developed to date which can consistently handle both paired and unpaired variable region sequences as input.
View Article and Find Full Text PDFComput Struct Biotechnol J
December 2024
Current medical research has been demonstrating the roles of miRNAs in a variety of cellular mechanisms, lending credence to the association between miRNA dysregulation and multiple diseases. Understanding the mechanisms of miRNA is critical for developing effective diagnostic and therapeutic strategies. miRNA-mRNA interactions emerge as the most important mechanism to be understood despite their experimental validation constraints.
View Article and Find Full Text PDFWhile drug combination therapies are of great importance, particularly in cancer treatment, identifying novel synergistic drug combinations has been a challenging venture. Computational methods have emerged in this context as a promising tool for prioritizing drug combinations for further evaluation, though they have presented limited performance, utility, and interpretability. Here, we propose a novel predictive tool, piscesCSM, that leverages graph-based representations to model small molecule chemical structures to accurately predict drug combinations with favourable anticancer synergistic effects against one or multiple cancer cell lines.
View Article and Find Full Text PDFG protein-coupled receptors (GPCRs) are one of the most important families of targets for drug discovery. One of the limiting steps in the study of GPCRs has been their stability, with significant and time-consuming protein engineering often used to stabilize GPCRs for structural characterization and drug screening. Unfortunately, computational methods developed using globular soluble proteins have translated poorly to the rational engineering of GPCRs.
View Article and Find Full Text PDFStud Health Technol Inform
January 2024
The Learning Health Systems (LHS) framework demonstrates the potential for iterative interrogation of health data in real time and implementation of insights into practice. Yet, the lack of appropriately skilled workforce results in an inability to leverage existing data to design innovative solutions. We developed a tailored professional development program to foster a skilled workforce.
View Article and Find Full Text PDFThis article investigates the role of recent advances in Artificial Intelligence (AI) to revolutionise the study of G protein-coupled receptors (GPCRs). AI has been applied to many areas of GPCR research, including the application of machine learning (ML) in GPCR classification, prediction of GPCR activation levels, modelling GPCR 3D structures and interactions, understanding G-protein selectivity, aiding elucidation of GPCRs structures, and drug design. Despite progress, challenges in predicting GPCR structures and addressing the complex nature of GPCRs remain, providing avenues for future research and development.
View Article and Find Full Text PDFGlycoside hydrolases (GHs) are a diverse group of enzymes that catalyze the hydrolysis of glycosidic bonds. The Carbohydrate-Active enZymes (CAZy) classification organizes GHs into families based on sequence data and function, with fewer than 1% of the predicted proteins characterized biochemically. Consideration of genomic context can provide clues to infer possible enzyme activities for proteins of unknown function.
View Article and Find Full Text PDFBackground: Mutations within the Von Hippel-Lindau (VHL) tumor suppressor gene are known to cause VHL disease, which is characterized by the formation of cysts and tumors in multiple organs of the body, particularly clear cell renal cell carcinoma (ccRCC). A major challenge in clinical practice is determining tumor risk from a given mutation in the VHL gene. Previous efforts have been hindered by limited available clinical data and technological constraints.
View Article and Find Full Text PDFIntroduction: Adequate methods to promptly translate digital health innovations for improved patient care are essential. Advances in Artificial Intelligence (AI) and Machine Learning (ML) have been sources of digital innovation and hold the promise to revolutionize the way we treat, manage and diagnose patients. Understanding the benefits but also the potential adverse effects of digital health innovations, particularly when these are made available or applied on healthier segments of the population is essential.
View Article and Find Full Text PDFMotivation: With the development of sequencing techniques, the discovery of new proteins significantly exceeds the human capacity and resources for experimentally characterizing protein functions. Localization, EC numbers, and GO terms with the structure-based Cutoff Scanning Matrix (LEGO-CSM) is a comprehensive web-based resource that fills this gap by leveraging the well-established and robust graph-based signatures to supervised learning models using both protein sequence and structure information to accurately model protein function in terms of Subcellular Localization, Enzyme Commission (EC) numbers, and Gene Ontology (GO) terms.
Results: We show our models perform as well as or better than alternative approaches, achieving area under the receiver operating characteristic curve of up to 0.
Motivation: While antibodies have been ground-breaking therapeutic agents, the structural determinants for antibody binding specificity remain to be fully elucidated, which is compounded by the virtually unlimited repertoire of antigens they can recognize. Here, we have explored the structural landscapes of antibody-antigen interfaces to identify the structural determinants driving target recognition by assessing concavity and interatomic interactions.
Results: We found that complementarity-determining regions utilized deeper concavity with their longer H3 loops, especially H3 loops of nanobody showing the deepest use of concavity.
Understanding the effects of mutations on protein stability is crucial for variant interpretation and prioritisation, protein engineering, and biotechnology. Despite significant efforts, community assessments of predictive tools have highlighted ongoing limitations, including computational time, low predictive power, and biased predictions towards destabilising mutations. To fill this gap, we developed DDMut, a fast and accurate siamese network to predict changes in Gibbs Free Energy upon single and multiple point mutations, leveraging both forward and hypothetical reverse mutations to account for model anti-symmetry.
View Article and Find Full Text PDFThe ability to identify B-cell epitopes is an essential step in vaccine design, immunodiagnostic tests and antibody production. Several computational approaches have been proposed to identify, from an antigen protein or peptide sequence, which residues are more likely to be part of an epitope, but have limited performance on relatively homogeneous data sets and lack interpretability, limiting biological insights that could otherwise be obtained. To address these limitations, we have developed epitope1D, an explainable machine learning method capable of accurately identifying linear B-cell epitopes, leveraging two new descriptors: a graph-based signature representation of protein sequences, based on our well-established Cutoff Scanning Matrix algorithm and Organism Ontology information.
View Article and Find Full Text PDFTeratogenic drugs can lead to extreme fetal malformation and consequently critically influence the fetus's health, yet the teratogenic risks associated with most approved drugs are unknown. Here, we propose a novel predictive tool, embryoTox, which utilizes a graph-based signature representation of the chemical structure of a small molecule to predict and classify molecules likely to be safe during pregnancy. embryoTox was trained and validated using bioactivity data of over 700 small molecules with characterized teratogenicity effects.
View Article and Find Full Text PDFThe SARS-CoV-2 Africa dashboard is an interactive tool that enables visualization of SARS-CoV-2 genomic information in African countries. The customizable app allows users to visualize the number of sequences deposited in each country, and the variants circulating over time. Our dashboard enables near real-time exploration of public data that can inform policymakers, healthcare professionals and the public about the ongoing pandemic.
View Article and Find Full Text PDFThe detection of adverse drug reactions (ADRs) is critical to our understanding of the safety and risk-benefit profile of medications. With an incidence that has not changed over the last 30 years, ADRs are a significant source of patient morbidity, responsible for 5%-10% of acute care hospital admissions worldwide. Spontaneous reporting of ADRs has long been the standard method of reporting, however this approach is known to have high rates of under-reporting, a problem that limits pharmacovigilance efforts.
View Article and Find Full Text PDFAm J Hum Genet
December 2022
Heterozygous pathogenic variants in DNM1 cause developmental and epileptic encephalopathy (DEE) as a result of a dominant-negative mechanism impeding vesicular fission. Thus far, pathogenic variants in DNM1 have been studied with a canonical transcript that includes the alternatively spliced exon 10b. However, after performing RNA sequencing in 39 pediatric brain samples, we find the primary transcript expressed in the brain includes the downstream exon 10a instead.
View Article and Find Full Text PDFAntibodies are essential experimental and diagnostic tools and as biotherapeutics have significantly advanced our ability to treat a range of diseases. With recent innovations in computational tools to guide protein engineering, we can now rationally design better antibodies with improved efficacy, stability, and pharmacokinetics. Here, we describe the use of the mCSM web-based in silico suite, which uses graph-based signatures to rapidly identify the structural and functional consequences of mutations, to guide rational antibody engineering to improve stability, affinity, and specificity.
View Article and Find Full Text PDFNat Struct Mol Biol
November 2022
Protein phosphorylation acts as an essential on/off switch in many cellular signaling pathways. This has led to ongoing interest in targeting kinases for therapeutic intervention. Computer-aided drug discovery has been proven a useful and cost-effective approach for facilitating prioritization and enrichment of screening libraries, but limited effort has been devoted providing insights on what makes a potent kinase inhibitor.
View Article and Find Full Text PDFThe design of novel, safe, and effective drugs to treat human diseases is a challenging venture, with toxicity being one of the main sources of attrition at later stages of development. Failure due to toxicity incurs a significant increase in costs and time to market, with multiple drugs being withdrawn from the market due to their adverse effects. Cardiotoxicity, for instance, was responsible for the failure of drugs such as fenspiride, propoxyphene, and valdecoxib.
View Article and Find Full Text PDFBackground: Amyotrophic lateral sclerosis (ALS) is a progressively fatal, neurodegenerative disease associated with both motor and non-motor symptoms, including frontotemporal dementia. Approximately 10% of cases are genetically inherited (familial ALS), while the majority are sporadic. Mutations across a wide range of genes have been associated; however, the underlying molecular effects of these mutations and their relation to phenotypes remain poorly explored.
View Article and Find Full Text PDFPeptides are attractive alternatives for the development of new therapeutic strategies due to their versatility and low complexity of synthesis. Increasing interest in these molecules has led to the creation of large collections of experimentally characterized therapeutic peptides, which greatly contributes to development of data-driven computational approaches. Here we propose CSM-peptides, a novel machine learning method for rapid identification of eight different types of therapeutic peptides: anti-angiogenic, anti-bacterial, anti-cancer, anti-inflammatory, anti-viral, cell-penetrating, quorum sensing, and surface binding.
View Article and Find Full Text PDFDrug discovery is a lengthy, costly and high-risk endeavour that is further convoluted by high attrition rates in later development stages. Toxicity has been one of the main causes of failure during clinical trials, increasing drug development time and costs. To facilitate early identification and optimisation of toxicity profiles, several computational tools emerged aiming at improving success rates by timely pre-screening drug candidates.
View Article and Find Full Text PDF