Publications by authors named "Janani Durairaj"

BK polyomavirus (BKPyV) is a double-stranded DNA virus causing nephropathy, hemorrhagic cystitis, and urothelial cancer in transplant patients. The BKPyV-encoded capsid protein Vp1 and large T-antigen (LTag) are key targets of neutralizing antibodies and cytotoxic T-cells, respectively. Our single-center data suggested that variability in Vp1 and LTag may contribute to failing BKPyV-specific immune control and impact vaccine design.

Motivation: Language models are routinely used for text classification and generative tasks. Recently, the same architectures were applied to protein sequences, unlocking powerful new approaches in the bioinformatics field. Protein language models (pLMs) generate high-dimensional embeddings on a per-residue level and encode a "semantic meaning" of each individual amino acid in the context of the full protein sequence.
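
To make the per-residue idea concrete, here is a minimal sketch of the data shape: one vector per amino acid, which can then be averaged into a single per-protein vector. The numbers are made up for illustration and do not come from any real pLM; the helper name `mean_pool` and the choice of mean pooling are assumptions, not something the abstract describes.

```python
def mean_pool(residue_embeddings):
    """Average per-residue vectors (L x D) into one per-protein vector (D)."""
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    length = len(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [sum(vec[j] for vec in residue_embeddings) / length for j in range(dim)]


sequence = "MKV"  # toy three-residue sequence (hypothetical)

# Made-up embeddings: one 4-dimensional vector per residue.
# A real pLM would produce hundreds or thousands of dimensions per residue.
residue_embeddings = [
    [1.0, 0.0, 2.0, 4.0],  # M
    [3.0, 2.0, 0.0, 4.0],  # K
    [2.0, 4.0, 4.0, 4.0],  # V
]

# Per-residue level: exactly one vector for each amino acid.
assert len(residue_embeddings) == len(sequence)

protein_embedding = mean_pool(residue_embeddings)
print(protein_embedding)
```

Averaging discards positional detail, so it suits whole-protein tasks; residue-level tasks would use the unpooled rows directly.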

The prediction of protein-ligand complexes (PLC), using both experimental and predicted structures, is an active and important area of research, underscored by the inclusion of the Protein-Ligand Interaction category in the latest round of the Critical Assessment of Protein Structure Prediction experiment CASP15. The prediction task in CASP15 consisted of predicting both the three-dimensional structure of the receptor protein as well as the position and conformation of the ligand. This paper addresses the challenges and proposed solutions for devising automated benchmarking techniques for PLC prediction.

CASP15 introduced a new category, ligand prediction, where participants were provided with a protein or nucleic acid sequence, SMILES line notation, and stoichiometry for ligands and tasked with generating computational models for the three-dimensional structure of the corresponding protein-ligand complex. These models were subsequently compared with experimental structures determined by X-ray crystallography or cryo-EM. To assess these predictions, two novel scores were developed.

We are now entering a new era in protein sequence and structure annotation, with hundreds of millions of predicted protein structures made available through the AlphaFold database. These models cover nearly all proteins that are known, including those challenging to annotate for function or putative biological role using standard homology-based approaches. In this study, we examine the extent to which the AlphaFold database has structurally illuminated this 'dark matter' of the natural protein universe at high predicted accuracy.

Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature.

Article Synopsis
  • The analysis focuses on CASP15 targets, emphasizing their biological importance and functional roles within protein structures.
  • Authors assess key protein features and how well these were represented in the submitted predictions, noting successes and consistent challenges.
  • The text highlights the necessity for improved scoring strategies and the future need for integrating computational methods with experimental techniques in structural molecular biology.

Prediction categories in the Critical Assessment of Structure Prediction (CASP) experiments change with the need to address specific problems in structure modeling. In CASP15, four new prediction categories were introduced: RNA structure, ligand-protein complexes, accuracy of oligomeric structures and their interfaces, and ensembles of alternative conformations. This paper lists technical specifications for these categories and describes their integration in the CASP data management system.

Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input.

Article Synopsis
  • Most proteins fold into unique 3D shapes that dictate their functions within cells, and new computational methods, like AlphaFold2, have achieved high accuracy in predicting these structures, rivaling experimental results.
  • The study evaluates AlphaFold2's effectiveness in various applications, such as analyzing protein features, understanding how mutations affect function, and modeling interactions and experimental data.
  • It concludes that AlphaFold2 can model more structural details than traditional methods and performs well across different research applications, potentially transforming the field of structural biology and life sciences.

Strigolactones (SLs) are rhizosphere signalling molecules and phytohormones. The biosynthetic pathway of SLs in tomato has been partially elucidated, but the structural diversity in tomato SLs predicts that additional biosynthetic steps are required. Here, root RNA-seq data and co-expression analysis were used for SL biosynthetic gene discovery.

Sesquiterpene synthases (STSs) catalyze the formation of a large class of plant volatiles called sesquiterpenes. While thousands of putative STS sequences from diverse plant species are available, only a small number of them have been functionally characterized. Sequence identity-based screening for desired enzymes, often used in biotechnological applications, is difficult to apply here as STS sequence similarity is strongly affected by species.

Motivation: As the number of experimentally solved protein structures rises, it becomes increasingly appealing to use structural information for predictive tasks involving proteins. Due to the large variation in protein sizes, folds and topologies, an attractive approach is to embed protein structures into fixed-length vectors, which can be used in machine learning algorithms aimed at predicting and understanding functional and physical properties. Many existing embedding approaches are alignment based, which is both time-consuming and ineffective for distantly related proteins.
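
As a rough illustration of what an alignment-free, fixed-length structure embedding can look like, the sketch below turns a set of 3D coordinates of any size into a normalised histogram of pairwise distances. This is a deliberately simple stand-in chosen for illustration, not the method of the paper; the function name `distance_histogram`, the bin count, and the distance cutoff are all assumptions.

```python
import math
from itertools import combinations


def distance_histogram(coords, n_bins=8, max_dist=24.0):
    """Map (x, y, z) points to a fixed-length, normalised histogram
    of pairwise distances. No alignment between structures is needed."""
    counts = [0] * n_bins
    bin_width = max_dist / n_bins
    for a, b in combinations(coords, 2):
        d = math.dist(a, b)
        # Distances at or beyond max_dist are clamped into the last bin.
        idx = min(int(d / bin_width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


# Two toy "structures" of different sizes (hypothetical coordinates,
# points spaced 3.8 units apart along a line).
small = [(3.8 * i, 0.0, 0.0) for i in range(3)]
large = [(3.8 * i, 0.0, 0.0) for i in range(10)]

vec_small = distance_histogram(small)
vec_large = distance_histogram(large)

# Both vectors have the same length regardless of input size.
print(len(vec_small), len(vec_large))
```

Because both outputs have the same length, they can be fed directly to standard machine learning algorithms, which is the property the abstract highlights.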

Article Synopsis
  • Plant terpene synthases (TPSs) are enzymes that produce a wide variety of terpenes, influencing the unique chemical makeup of different plant species.
  • In this study, researchers examined two specific TPSs from the Camphor tree: CiCaMS, which produces myrcene (a monoterpene), and CiCaSSy, which produces α-santalene, β-santalene, and trans-α-bergamotene (sesquiterpenes).
  • Although the two genes share 97% DNA sequence similarity and the enzymes differ in only 22 amino acids, they produce different terpenes; further analysis identified key residues that determine whether a TPS will produce monoterpenes or a specific sesquiterpene product profile.

The vast number of protein structures currently available opens exciting opportunities for machine learning on proteins, aimed at predicting and understanding functional properties. In particular, in combination with homology modelling, it is now possible to not only use sequence features as input for machine learning, but also structure features. However, in order to do so, robust multiple structure alignments are imperative.

Plants exhibit a vast array of sesquiterpenes, C15 hydrocarbons which often function as herbivore-repellents or pollinator-attractants. These in turn are produced by a diverse range of sesquiterpene synthases. A comprehensive analysis of these enzymes in terms of product specificity has been hampered by the lack of a centralized resource of sufficient functionally annotated sequence data.
