Publications by authors named "Tuomo Kalliokoski"

Given the size of the relevant chemical space for drug discovery, working with fully enumerated compound libraries (especially in three-dimensional (3D)) is unfeasible. Nonenumerated virtual chemical spaces are a practical solution to this issue, where compounds are described as building blocks which are then connected by rules. One concrete example of such is the BioSolveIT chemical spaces file format (.

The emergence of ultra-large screening libraries, filled to the brim with billions of readily available compounds, poses a growing challenge for docking-based virtual screening. Machine learning (ML)-boosted strategies like the tool HASTEN combine rapid ML prediction with the brute-force docking of small fractions of such libraries to increase screening throughput and take on giga-scale libraries. In our case study of an anti-bacterial chaperone and an anti-viral kinase, we first generated a brute-force docking baseline for 1.

Methionine adenosyltransferase 2A (MAT2A) has been indicated as a drug target for oncology indications. Clinical trials with MAT2A inhibitors are currently ongoing. Here, a structure-based virtual screening campaign was performed on the commercially available chemical space, which yielded two novel MAT2A-inhibitor chemical series.

An intronic (G4C2)n expansion in C9orf72 causes amyotrophic lateral sclerosis and frontotemporal dementia primarily through gain-of-function mechanisms: the accumulation of sense and antisense repeat RNA foci and dipeptide repeat (DPR) proteins (poly-GA/GP/GR/PA/PR) translated from repeat RNA. To therapeutically block this pathway, we screen a library of 1,430 approved drugs and known bioactive compounds in patient-derived induced pluripotent stem cell-derived neurons (iPSC-Neurons) for inhibitors of DPR expression. The clinically used guanosine/cytidine analogs decitabine, entecavir, and nelarabine reduce poly-GA/GP expression, with decitabine being the most potent.

The software macHine leArning booSTEd dockiNg (HASTEN) was developed to accelerate structure-based virtual screening using machine learning models. It has been validated using datasets both from literature (12 datasets, each containing three million molecules docked with FRED) and in-house sources (one dataset of four million compounds docked with Glide). HASTEN showed reasonable performance by having the mean recall value of 0.

Human kynurenine aminotransferase 2 (KAT2) inhibitors could potentially be used to treat the cognitive deficits associated with bipolar disease and schizophrenia. Although several industrial and academic groups have actively worked on developing KAT2 inhibitors over the years, no such compound has progressed to the clinic. Here, we report two different chemical series of reversible KAT2 inhibitors with sub-micromolar activities.

Applications of ecosystem flux models on large geographical scales are often limited by model complexity and data availability. Here we calibrated and evaluated a semi-empirical ecosystem flux model, PREdict Light-use efficiency, Evapotranspiration and Soil water (PRELES), for various forest types and climate conditions, based on eddy covariance data from 55 sites. A Bayesian approach was adopted for model calibration and uncertainty quantification.

Metabolic scaling theory allows us to link plant hydraulic structure with metabolic rates in a quantitative framework. In this theoretical framework, we considered the hydraulic structure of current-year shoots in Pinus sylvestris and Picea abies, focusing on two properties unaccounted for by metabolic scaling theories: conifer needles are attached to the entire length of shoots, and the shoot as a terminal element does not display invariant properties. We measured shoot length and diameter as well as conduit diameter and density in two locations of 14 current-year non-leader shoots of pine and spruce saplings, and calculated conductivities of shoots from measured conduit properties.

A data fusion approach was investigated in the context of pKa prediction for 391 small molecules derived from a public data source as well as for 681 compounds from an internal corporate database. Four different pKa prediction methods (Simulations Plus ADMET-Predictor S+pKa, ACD/Labs Percepta Classic, ACD/Labs Percepta GALAS and Epik) were used to predict the most acidic or basic pKa for each of the compounds. By using data fusion, the median absolute error for the internal compounds was reduced from the best performing single model's value of 0.

The identification of high-quality starting points for drug discovery is an enduring challenge in medicinal chemistry. Yet, the chemical space explored in discovery programmes tends to be limited by the narrow toolkit of robust methods that are exploited in discovery workflows. The European Lead Factory (ELF) was established in 2013 to boost early-stage drug discovery within Europe.

The availability of high-quality screening compounds is of paramount importance for the discovery of innovative new medicines. Natural product (NP) frameworks can inspire the design of productive compound libraries. Here, we describe the design and synthesis of four compound libraries based on scaffolds that have broad NP-like features, but that are only distantly related to specific NPs.

The relationship between the growth rate of aboveground parts of trees and fine root development is largely unknown. We investigated the early root development of fast- and slow-growing Norway spruce (Picea abies (L.) H.

High-throughput screening (HTS) represents a major cornerstone of drug discovery. The availability of an innovative, relevant and high-quality compound collection to be screened often dictates the final fate of a drug discovery campaign. Given that the chemical space to be sampled in research programs is practically infinite and sparsely populated, significant efforts and resources need to be invested in the generation and maintenance of a competitive compound collection.

Combinatorial libraries are synthesized by combining smaller reagents (building blocks), the price of which is an important component of the total cost of the synthesis. A significant portion of commercially available reagents are too expensive for large-scale work. In this study, 13 commonly used reagent classes in combinatorial library synthesis (primary and secondary amines, carboxylic acids, alcohols, ketones, aldehydes, boronic acids, acyl halides, sulfonyl chlorides, isocyanates, isothiocyanates, azides and chloroformates) were analyzed with respect to cost, physicochemical properties (molecular weight and calculated logP), chemical diversity, and 3D-likeness using a large data set.

The application of [4+2] cycloadditions between alkenes and an N-benzoyl iminium species, generated in situ under acidic conditions, is described in the synthesis of diverse molecular scaffolds. The key reaction led to the formation of cyclic imidates in good yield and with high regioselectivity. It was demonstrated that the cyclic imidates may be readily converted into 1,3-amino alcohols.

The design, synthesis and decoration of six small molecule libraries are described. Each library was inspired by structures embedded in the framework of specific alkaloid natural products. The development of optimised syntheses of the required molecular scaffolds is described, in which reactions including Pd-catalysed aminoarylation and dipolar cycloadditions have been exploited as key steps.

The key concept in chemogenomics is the similarity principle that states that similar ligands should bind similar targets. Chemogenomic analysis requires large amounts of data and both powerful computational algorithms and computers. Data used for chemogenomics analysis can either be compiled from open sources, or they can be produced in-house as is often done in the pharmaceutical industry.

In the Nordic countries, growth of Norway spruce (Picea abies (L.) Karst.) is generally limited by low availability of nutrients, especially nitrogen.

The biochemical half maximal inhibitory concentration (IC50) is the most commonly used metric for on-target activity in lead optimization. It is used to guide lead optimization and to build large-scale chemogenomics analyses as well as off-target activity and toxicity models based on public data. However, the use of public biochemical IC50 data is problematic, because they are assay specific and comparable only under certain conditions.

Although two binding sites might be dissimilar overall, they might still bind the same fragments if they share suitable subpockets. Information about shared subpockets can therefore be used in fragment-based drug design to suggest new fragments or to replace existing fragments within an already known compound. A novel computational method called SubCav is described, which allows the similarity searching and alignment of subpockets from a PDB-wide database against a user-defined query.

3D ligand-based virtual screening was employed to identify novel scaffolds for cannabinoid receptor ligand development. A total of 112 compounds with diverse structures were purchased from commercial vendors. 12 CB1 receptor antagonists/inverse agonists and 10 CB2 receptor agonists were identified in vitro.

Novel computational methods for understanding relationships between ligands and all possible biological targets have emerged in recent years. Proteins are connected to each other based on the similarity of their ligands or based on the similarity of their binding sites. The assumption is that compounds sharing chemical similarity should share targets and that targets with a similar binding site should also share ligands.

Background: Detailed and systematic understanding of the biological effects of millions of available compounds on living cells is a significant challenge. As most compounds impact multiple targets and pathways, traditional methods for analyzing structure-function relationships are not comprehensive enough. Therefore, more advanced integrative models are needed for predicting biological effects elicited by specific chemical features.

The maximum achievable accuracy of in silico models depends on the quality of the experimental data. Consequently, experimental uncertainty defines a natural upper limit to the predictive performance possible. Models that yield errors smaller than the experimental uncertainty are necessarily overtrained.
