Induction of decision trees via evolutionary programming.

J Chem Inf Comput Sci

Department of Molecular Modeling, Pharmacopeia, P.O. Box 5350, Princeton, New Jersey 08543-5350, USA.

Published: March 2005

Decision trees have been used extensively in cheminformatics for modeling various biochemical endpoints, including receptor-ligand binding, ADME properties, environmental impact, and toxicity. The traditional approach to inducing a decision tree from a given training set is recursive partitioning, which selects partitioning variables and their values in a greedy manner to optimize a given measure of purity. This methodology has numerous benefits, including classifier interpretability and the capability of modeling nonlinear relationships. The greedy nature of induction, however, may fail to elucidate underlying relationships between the data and endpoints. Using evolutionary programming, decision trees are induced which are significantly more accurate than trees induced by recursive partitioning. Furthermore, when assessed in a 10-fold cross-validated manner, evolutionary programming induced trees exhibit significantly higher accuracy on previously unseen data. This methodology is compared to single-tree and multiple-tree recursive partitioning in two domains (aerobic biodegradability and hepatotoxicity) and shown to produce less complex classifiers with average increases in predictive accuracy of 5-10% over the traditional method.
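The contrast between greedy recursive partitioning and an evolutionary search over whole trees can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tuple-based tree encoding, the subtree-replacement mutation, the simple truncation selection, and every name and parameter value (`evolve`, `pop_size`, `max_depth`, and so on) are assumptions chosen for brevity. The toy XOR data is used because no single split on either feature improves purity on it, which is the kind of case where a greedy criterion gives no guidance.

```python
import random

# Hypothetical encoding: a tree is ("leaf", label) or
# ("split", feature_index, threshold, left_subtree, right_subtree).

def predict(tree, x):
    """Walk from the root to a leaf and return its label."""
    while tree[0] == "split":
        _, feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree[1]

def accuracy(tree, X, y):
    """Fraction of samples the tree classifies correctly."""
    return sum(predict(tree, x) == label for x, label in zip(X, y)) / len(y)

def random_tree(X, labels, depth, rng):
    """Grow a random tree of height at most `depth`."""
    if depth <= 0 or rng.random() < 0.3:
        return ("leaf", rng.choice(labels))
    feature = rng.randrange(len(X[0]))
    threshold = rng.choice(X)[feature]  # threshold taken from an observed value
    return ("split", feature, threshold,
            random_tree(X, labels, depth - 1, rng),
            random_tree(X, labels, depth - 1, rng))

def mutate(tree, X, labels, depth, rng):
    """Return a copy in which one randomly chosen subtree is regrown."""
    if tree[0] == "leaf" or rng.random() < 0.3:
        return random_tree(X, labels, depth, rng)
    _, feature, threshold, left, right = tree
    if rng.random() < 0.5:
        return ("split", feature, threshold,
                mutate(left, X, labels, depth - 1, rng), right)
    return ("split", feature, threshold,
            left, mutate(right, X, labels, depth - 1, rng))

def evolve(X, y, pop_size=30, generations=40, max_depth=3, seed=0):
    """Mutation-only search: each parent yields one offspring, and the
    best `pop_size` trees among parents and offspring survive."""
    rng = random.Random(seed)
    labels = sorted(set(y))
    population = [random_tree(X, labels, max_depth, rng) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(t, X, labels, max_depth, rng) for t in population]
        ranked = sorted(population + offspring,
                        key=lambda t: accuracy(t, X, y), reverse=True)
        population = ranked[:pop_size]  # elitist truncation (an assumption)
    return population[0]

# Toy XOR data: the label is 1 exactly when the two features differ.
X = [(0, 0), (0, 1), (1, 0), (1, 1)] * 5
y = [a ^ b for a, b in X]

best = evolve(X, y)
print("training accuracy of best tree:", accuracy(best, X, y))
```

Fitness in this sketch is training accuracy only. Reproducing the evaluation described in the abstract would additionally require scoring each induced tree on held-out folds, for example in a 10-fold cross-validation loop.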

Source
http://dx.doi.org/10.1021/ci034188s

