A novel hierarchical selective ensemble classifier with bioinformatics application.

Artif Intell Med

School of Medicine, Western Sydney University, Sydney, Australia.

Published: November 2017

Selective ensemble learning is a technique that selects a subset of diverse and accurate basic models in order to achieve stronger generalization ability. In this paper, we propose a novel learning algorithm based on parallel optimization and hierarchical selection (PTHS). We also propose a novel feature selection method, based on maximizing the sum of relevance and distance (MSRD), to address the problem of high dimensionality. Specifically, PTHS employs parallel optimization, candidate model pruning based on k-means, and a hierarchical selection framework. We combine the prediction results of the basic models by majority voting, which employs a divide-and-conquer strategy to save computing time. In addition, the PT algorithm is capable of transforming a multi-class problem into a binary classification problem, thereby allowing our ensemble model to address multi-class problems. An empirical study shows that MSRD is efficient in solving the high-dimensionality problem and that PTHS exhibits better performance than other existing classification algorithms. Most importantly, our classifier achieved high-level performance on several bioinformatics problems (e.g., tRNA identification and protein-protein interaction prediction), demonstrating its efficiency and robustness.
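The abstract does not give the MSRD scoring formula or the PTHS selection procedure, so the following is only a minimal sketch of the two general ideas it names: greedy feature selection that rewards relevance plus distance, and combining model outputs by majority voting. The function names, the use of absolute Pearson correlation as the relevance measure, and the use of one minus absolute correlation as the distance measure are all assumptions made for illustration, not the authors' definitions.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(columns, labels, k):
    """Greedily pick k feature indices, each maximizing relevance + distance.

    Assumed scoring (hypothetical, not taken from the paper):
      relevance = |corr(feature, labels)|
      distance  = mean of (1 - |corr(feature, s)|) over already selected s
    """
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:

        def score(j):
            relevance = abs(pearson(columns[j], labels))
            if not selected:
                return relevance
            distance = sum(
                1 - abs(pearson(columns[j], columns[s])) for s in selected
            ) / len(selected)
            return relevance + distance

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote.

    Ties are not handled specially; an odd number of binary models avoids them.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy data: feature 1 duplicates feature 0, feature 2 is less relevant but distinct.
columns = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 0, 1, 1]]
labels = [0, 0, 1, 1]
chosen = select_features(columns, labels, 2)

# Three hypothetical basic models voting on three samples.
combined = majority_vote([[1, 0, 0], [1, 1, 0], [0, 1, 0]])
```

On this toy data the greedy step skips the duplicated feature in favor of the less relevant but more distinct one, which is the kind of trade-off a relevance-plus-distance criterion is meant to capture.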

Source
http://dx.doi.org/10.1016/j.artmed.2017.02.005

Publication Analysis

Top Keywords

selective ensemble: 8
parallel optimization: 8
hierarchical selection: 8
high dimensionality: 8
novel hierarchical: 4
hierarchical selective: 4
ensemble classifier: 4
classifier bioinformatics: 4
bioinformatics application: 4
application selective: 4

Similar Publications

In the present digital landscape, the proliferation of Internet of Things (IoT) devices generates massive volumes of high-dimensional data, presenting significant data privacy and security challenges. As IoT networks expand, ensuring the privacy of sensitive data while still enabling data analytics is vital. In the era of big data, statistical learning has advanced rapidly in both methodological innovation and practical applications.

Drug-induced autoimmunity (DIA) is a non-IgE immune-related adverse drug reaction that poses substantial challenges in predictive toxicology due to its idiosyncratic nature, complex pathogenesis, and diverse clinical manifestations. To address these challenges, we developed InterDIA, an interpretable machine learning framework for predicting DIA toxicity based on molecular physicochemical properties. Multi-strategy feature selection and advanced ensemble resampling approaches were integrated to enhance prediction accuracy and overcome data imbalance.

Electrochemical oxidation of small molecules shows great promise as a substitute for the oxygen evolution reaction (OER) or hydrogen oxidation reaction (HOR), enhancing reaction kinetics and reducing energy consumption while producing high-value chemicals or serving as a fuel source. For these oxidation reactions, high-valence metal sites generated at oxidative potentials are typically considered the active sites that trigger the oxidation of small molecules. Isolated atom site catalysts (IASCs) have been developed as an ideal system for precisely regulating the oxidation state and coordination environment of single-metal centers, and thus optimizing their catalytic properties.

Diabetes is a growing health concern in developing countries, causing considerable mortality. While machine learning (ML) approaches have been widely used to improve early detection and treatment, several studies have reported low classification accuracy due to overfitting, underfitting, and data noise. This research employs parallel and sequential ensemble ML approaches paired with feature selection techniques to boost classification accuracy.

This study introduces a novel ensemble learning technique, the Multi-Armed Bandit Ensemble (MAB-Ensemble), designed for lane detection in road images for autonomous vehicles. The proposed MAB-Ensemble technique is inspired by multi-armed bandit optimization and uses it to facilitate efficient model selection for lane segmentation. The TuSimple benchmark dataset is used for training, validating, and testing the proposed and existing lane detection techniques.
