Extracting molecular descriptors from chemical compounds is an essential preprocessing step for developing accurate classification models. Supervised machine learning algorithms can detect hidden patterns in large datasets of compounds that are represented by their molecular descriptors. On the assumption that structurally similar molecules tend to share similar physicochemical properties, large chemical libraries can be screened with similarity searching techniques to detect compounds that are potentially bioactive against a molecular target. However, generating these compound features is time-consuming. Our proposed methodology not only employs cloud computing to accelerate the extraction of molecular descriptors but also introduces an optimized approach that uses the available computational resources efficiently.
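
The general idea of spreading descriptor extraction over several workers can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not describe their cloud setup or optimization strategy. The names `toy_descriptors`, `make_chunks`, `process_chunk`, and `extract_descriptors` are hypothetical, the descriptor function is a string-counting placeholder rather than a real cheminformatics calculation, and a local thread pool stands in for cloud workers.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_descriptors(smiles):
    """Placeholder descriptor function (hypothetical, not chemically rigorous).

    It only counts characters in the SMILES string, so it would miscount
    atoms such as Cl. A real pipeline would call a cheminformatics toolkit.
    """
    return {
        "smiles": smiles,
        "n_chars": len(smiles),
        "n_C": smiles.count("C") + smiles.count("c"),
        "n_O": smiles.count("O") + smiles.count("o"),
    }


def make_chunks(items, chunk_size):
    """Split the compound list into consecutive chunks of at most chunk_size."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Work done by one worker: compute descriptors for every compound in a chunk."""
    return [toy_descriptors(smiles) for smiles in chunk]


def extract_descriptors(smiles_list, n_workers=4, chunk_size=2):
    """Distribute chunks over a pool of workers and flatten the results.

    Assumption: a thread pool stands in for cloud instances here. For
    CPU-bound descriptor code, a process pool or remote workers would be
    the more realistic choice.
    """
    chunks = make_chunks(smiles_list, chunk_size)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in the same order as the input chunks
        chunk_results = list(pool.map(process_chunk, chunks))
    return [row for rows in chunk_results for row in rows]


if __name__ == "__main__":
    library = ["CCO", "c1ccccc1", "CC(=O)O", "O=C=O", "CCN"]
    for row in extract_descriptors(library, n_workers=2, chunk_size=2):
        print(row)
```

Chunk size is the main tuning knob in a scheme like this: larger chunks reduce scheduling overhead, while smaller chunks balance the load better when some compounds take much longer to process than others.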

Source
http://dx.doi.org/10.1007/978-3-031-31982-2_28

Similar Publications

Driven by the growing demands for plant-based protein in Europe and attempts of soybean breeding programs to improve the productivity of created varieties, this study aimed to enhance genetic resource utilization efficiency by providing information relevant to well-focused breeding targets. A set of 90 accessions was subjected to a comprehensive assessment of genetic diversity in a soybean working collection using three marker types: morphological descriptors, agronomic traits, and SSRs. Genotype grouping patterns varied among the markers, displaying the best congruence with pedigree data and maturity for SSRs and agronomic traits, respectively.

A Simple Machine Learning-Based Quantitative Structure-Activity Relationship Model for Predicting pIC50 Inhibition Values of FLT3 Tyrosine Kinase.

Pharmaceuticals (Basel)

January 2025

Centro de Química Médica, Facultad de Medicina Clínica Alemana, Universidad del Desarrollo, Santiago 7780272, Chile.

Acute myeloid leukemia (AML) presents significant therapeutic challenges, particularly in cases driven by mutations in the FLT3 tyrosine kinase. This study aimed to develop a robust and user-friendly machine learning-based quantitative structure-activity relationship (QSAR) model to predict the inhibitory potency (pIC50 values) of FLT3 inhibitors, addressing the limitations of previous models in dataset size, diversity, and predictive accuracy. Using a dataset 14 times larger than those employed in prior studies (1350 compounds with 1269 molecular descriptors), we trained a random forest regressor, chosen for its superior predictive performance and resistance to overfitting.

Background/objectives: Developing antifungal drugs with lower potential for interactions with food may help to optimize treatment and reduce the risk of antimicrobial resistance. Chemometrics uses statistical and mathematical methods to analyze multivariate chemical data, enabling the identification of key correlations and simplifying data interpretation. We used the partial least squares (PLS) approach to explore the correlations between various characteristics of oral antifungal drugs (including antifungal antibiotics) and dietary interventions, aiming to identify patterns that could inform the optimization of antifungal therapy.

Unlocking the Potential of RNA Sequencing in COVID-19: Toward Accurate Diagnosis and Personalized Medicine.

Diagnostics (Basel)

January 2025

Division of Microbiology, Immunology and Biotechnology, Department of Natural Products and Alternative Medicine, Faculty of Pharmacy, University of Tabuk, Tabuk 71491, Saudi Arabia.

COVID-19 has caused widespread morbidity and mortality, with its effects extending to multiple organ systems. Despite known risk factors for severe disease, including advanced age and underlying comorbidities, patient outcomes can vary significantly. This variability complicates efforts to predict disease progression and tailor treatment strategies.

Memory is a dynamic process of encoding, storing, and retrieving information. It includes sensory, short-term, and long-term memory, each with unique characteristics. Nitric oxide (NO) is a biological messenger synthesized on demand by neuronal nitric oxide synthase (nNOS) through a biochemical process initiated by glutamate binding to NMDA receptors, causing membrane depolarization and calcium influx.
