Motivation: Quantitative structure-activity relationship (QSAR) studies aim to identify novel drug-like molecules that can be proposed as lead compounds. This article discusses two steps toward that aim: first, identifying appropriate molecular descriptors with a feature-selection algorithm; and second, predicting the biological activities of designed compounds. Recent studies have shown increased interest in predicting the activities of very large numbers of molecules, a Big Data problem, using deep learning models. However, despite these efforts to address critical challenges in QSAR modeling such as over-fitting, heavy computational cost remains a major shortcoming of deep learning models. Hence, finding the most effective molecular descriptors in the shortest possible time is an ongoing task. One successful method for speeding up the extraction of the best features from big datasets is the least absolute shrinkage and selection operator (LASSO). LASSO is a regression method that selects a subset of molecular descriptors, improving prediction accuracy and interpretability by removing inappropriate and irrelevant features.
Results: To implement and test the proposed model, a random forest was built to predict the molecular activities of compounds from a Kaggle competition. The prediction results and computation time of the proposed model were then compared with those of other well-known algorithms, namely Boruta-random forest, deep random forest and a deep belief network model. The results show that improving output correlation through LASSO-random forest appreciably reduces implementation time and model complexity while maintaining prediction accuracy.
Supplementary Information: Supplementary data are available at Bioinformatics online.
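The descriptor-selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the penalty `lam = 0.1` and the helper names (`standardize`, `lasso_coordinate_descent`) are assumptions made for illustration, and the downstream random forest is omitted. The sketch fits LASSO by cyclic coordinate descent and keeps only the descriptors whose coefficients are non-zero.

```python
import random

def standardize(columns):
    """Centre each column to mean 0 and scale it to unit variance."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
        out.append([(v - mean) / sd for v in col])
    return out

def soft_threshold(rho, lam):
    """Soft-thresholding operator: shrinks rho toward zero by lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(cols, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 (cols = columns of X)."""
    n, p = len(y), len(cols)
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(v * v for v in col) / n for col in cols]
    for _ in range(n_iter):
        for j in range(p):
            xj = cols[j]
            # Correlation of column j with the residual that ignores w[j].
            rho = sum(xj[i] * (residual[i] + xj[i] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= xj[i] * delta
                w[j] = new_wj
    return w

# Synthetic stand-in for a descriptor matrix: 200 molecules, 10 descriptors.
# Only descriptors 0 and 3 influence the (made-up) activity.
rng = random.Random(0)
n_samples, n_descriptors = 200, 10
raw_cols = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_descriptors)]
cols = standardize(raw_cols)
activity = [3 * cols[0][i] - 2 * cols[3][i] + rng.gauss(0, 0.1) for i in range(n_samples)]
mean_activity = sum(activity) / n_samples
y = [a - mean_activity for a in activity]  # centred target, so no intercept is needed

weights = lasso_coordinate_descent(cols, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected descriptor indices:", selected)
```

In a full pipeline, only the columns listed in `selected` would be passed to the random forest, which is where the reduction in training time and model complexity would come from.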
Source: http://dx.doi.org/10.1093/bioinformatics/btab659
Sci Rep
January 2025
Department of Pharmacology and Toxicology, College of Pharmacy, King Saud University, 11461, Riyadh, Saudi Arabia.
Quantitative structure-property relationship (QSPR) modeling has emerged as a pivotal tool in the field of medicinal chemistry and drug design, offering a predictive framework for understanding the correlation between chemical structure and physicochemical properties. Topological indices are mathematical descriptors derived from the molecular graphs that capture structural features and connectivity, playing a crucial role in QSPR analysis by quantitatively relating chemical structures to their physicochemical properties and biological activities. Lung cancer is characterized by its aggressive nature and late-stage diagnosis, often limiting treatment options and significantly impacting patient survival rates.
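As a small illustration of what a topological index is, the sketch below computes the Wiener index, the sum of shortest-path distances over all pairs of atoms in the hydrogen-suppressed molecular graph. The Wiener index is chosen here only as a familiar example; the abstract does not state which indices the study uses, and the adjacency-list encoding is an assumption of this sketch.

```python
from collections import deque

def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end

# Carbon skeletons (hydrogens suppressed) of two C4H10 isomers.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}      # C-C-C-C chain
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}     # central C with three neighbours

w_butane = wiener_index(n_butane)
w_isobutane = wiener_index(isobutane)
print(w_butane, w_isobutane)  # prints: 10 9
```

The two isomers share a molecular formula but yield different index values, which is the sense in which such descriptors capture connectivity.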
AJNR Am J Neuroradiol
January 2025
Department of Neurology, Cliniques Universitaires Saint-Luc, Université catholique de Louvain, Brussels, Belgium.
Purpose: Posterior fossa ring-enhancing lesions (PFREL) in adult immunocompetent hosts pose a diagnostic challenge. We aimed to evaluate the spectrum of PFREL etiologies and propose a diagnostic algorithm.
Methods: This study involved a retrospective analysis of PFREL cases from our institution (January 2023 to April 2024) and a systematic literature review conducted using Embase and PubMed databases following the PRISMA 2020 guidelines.
Mar Pollut Bull
January 2025
School of Resources and Environment, Nanchang University, Key Laboratory of Poyang Lake Environment and Resource Utilization, Ministry of Education, Nanchang 330031, Jiangxi, China.
The widespread use of perfluoro/polyfluoroalkyl compounds (PFACs) makes their release into the environment inevitable, and the octanol-water partition coefficient (logKow) is a key indicator for evaluating the environmental behavior of trace pollutants and their impact on the environment. However, determining logKow experimentally is often time-consuming and laborious, or even unattainable. Therefore, the logKow values of 20 per/polyfluoroalkyl compounds obtained from the PubChem database were selected as the object of study, and the 41 chemical descriptors required for modeling were obtained by density-functional theory calculations. Only two molecular descriptors (A, V) were found to be significantly correlated with logKow, with descriptor A showing the strongest correlation.
J Mol Graph Model
January 2025
Unit of Excellence in Computational Molecular Science and Catalysis, and Division of Chemistry, School of Science, University of Phayao, Phayao, 56000, Thailand. Electronic address:
Monomer insertion, leading to the formation of an activated monomer complex, is a critical step in cationic ring-opening polymerization (CROP) of cyclic monomers, such as ε-caprolactone (CL). In this study, Density Functional Theory (DFT) calculations were employed to investigate the structural and electronic properties of four activated complexes at two Zr:B ratios (1:2 and 1:1), where Zr is the cationic zirconocene catalyst, Cp₂ZrMe⁺, and B is the borate cocatalyst, [MeB(C₆F₅)₃]⁻ or [B(C₆F₅)₄]⁻. Steric hindrance at the reactive site was analyzed using topographic steric maps, while inter- and intramolecular interactions of the complex systems were examined through the Quantum Theory of Atoms in Molecules (QTAIM) and non-covalent interaction (NCI) analyses.
SAR QSAR Environ Res
January 2025
Interdisciplinary Nanotoxicity Center, Department of Chemistry, Physics and Atmospheric Sciences, Jackson State University, Jackson, MS, USA.
A scheme for constructing models of the 'structure-glass transition temperature of a polymer' is proposed. It involves the representation of the molecular structure of a polymer through the architecture of monomer units represented through a simplified molecular input-line entry system (SMILES) and the fragments of local symmetry (FLS). The statistical quality of such models is quite good: the determination coefficient values for active training set, passive training set, calibration set, and validation set are 0.