The ChemicalToolbox: reproducible, user-friendly cheminformatics analysis on the Galaxy platform.

Simon A Bray, Xavier Lucas, Anup Kumar, Björn A Grüning

J Cheminform

Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, Freiburg, Germany.

Published: June 2020

Here, we introduce the ChemicalToolbox, a publicly available web server for performing cheminformatics analysis. The ChemicalToolbox provides an intuitive, graphical interface for common tools for downloading, filtering, visualizing and simulating small molecules and proteins. The ChemicalToolbox is based on Galaxy, an open-source web-based platform which enables accessible and reproducible data analysis. There is already an active Galaxy cheminformatics community using and developing tools. Based on their work, we provide four example workflows which illustrate the capabilities of the ChemicalToolbox, covering assembly of a compound library, hole filling, protein-ligand docking, and construction of a quantitative structure-activity relationship (QSAR) model. These workflows may be modified and combined flexibly, together with the many other tools available, to fit the needs of a particular project. The ChemicalToolbox is hosted on the European Galaxy server and may be accessed via https://cheminformatics.usegalaxy.eu.

Download full-text PDF	Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7268608	PMC
http://dx.doi.org/10.1186/s13321-020-00442-7	DOI Listing

Similar Publications

Improving Molecular Design with Direct Inverse Analysis of QSAR/QSPR Model.

Mol Inform

January 2025

Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan.

Yuto Shino, Hiromasa Kaneko

Recent advances in machine learning have significantly impacted molecular design, notably the molecular generation method combining the chemical variational autoencoder (VAE) with Gaussian mixture regression (GMR). In this method, a mathematical model is constructed with X as the latent variable of the molecule and Y as the target properties and activities. Through direct inverse analysis of this model, it is possible to generate molecules with the desired target properties.

The Venus score for the assessment of the quality and trustworthiness of biomedical datasets.

BioData Min

January 2025

Fondazione Bruno Kessler, Trento, Italy.

Davide Chicco, Alessandro Fabris, Giuseppe Jurman

Biomedical datasets are the mainstays of computational biology and health informatics projects, and can be found on multiple data platforms online or obtained from wet-lab biologists and physicians. The quality and the trustworthiness of these datasets, however, can sometimes be poor, producing bad results in turn, which can harm patients and data subjects. To address this problem, policy-makers, researchers, and consortia have proposed diverse regulations, guidelines, and scores to assess the quality and increase the reliability of datasets.

A Topology-Enhanced Multi-Viewed Contrastive Approach for Molecular Graph Representation Learning and Classification.

Mol Inform

January 2025

Faculty of Information Technology, HUTECH University, 700000, Ho Chi Minh City, Vietnam.

Phu Pham

Graph representation learning has recently become an active research topic that has attracted considerable attention from researchers. Graph embeddings have diverse applications across fields such as information and social network analysis, bioinformatics and cheminformatics, natural language processing (NLP), and recommendation systems. Among the advanced deep learning (DL)-based architectures used in graph representation learning, graph neural networks (GNNs) have emerged as the dominant and highly effective framework.

Chemical and Biological Investigations of Antiviral Agents Against Plant Viruses Conducted in China in the 21st Century.

Genes (Basel)

December 2024

State Key Laboratory of Green Pesticides, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Guizhou University, Guiyang 550025, China.

Yuanyou Yang, Lei Hu, Tongtong Chen, Libo Zhang, Delu Wang

Research into the biology of plant viruses, their mechanisms of pathogenicity, and the induction of host resistance has laid a solid foundation for the discovery of antiviral agents and their targets and the development of effective control technologies. Additionally, recent advancements in fields such as chemical biology, cheminformatics, bioinformatics, and synthetic biology have provided valuable methods and tools for the design of antiviral drugs, the synthesis of drug molecules, assessment of their activity, and investigation of their modes of action. Compared with drug development for human viral diseases, the control of plant viral diseases presents greater challenges, including the cost-benefit of agents, simplification of control technologies, and the effectiveness of treatments.

Machine learning assisted classification RASAR modeling for the nephrotoxicity potential of a curated set of orally active drugs.

Sci Rep

January 2025

Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.

Arkaprava Banerjee, Kunal Roy

We have adopted the classification Read-Across Structure-Activity Relationship (c-RASAR) approach in the present study for machine-learning (ML)-based model development from a recently reported curated dataset of nephrotoxicity potential of orally active drugs. We initially developed ML models using nine different algorithms separately on topological descriptors (referred to as simply "descriptors" in the subsequent sections of the manuscript) and MACCS fingerprints (referred to as "fingerprints" in the subsequent sections of the manuscript), thus generating 18 different ML QSAR models. Using the chemical spaces defined by the modeling descriptors and fingerprints, the similarity and error-based RASAR descriptors were computed, and the most discriminating RASAR descriptors were used to develop another set of 18 different ML c-RASAR models.
