Extant sequential wrapper-based feature subset selection (FSS) algorithms are not scalable and perform poorly when applied to big datasets. To circumvent these challenges, we propose parallel and distributed wrappers based on hybrid evolutionary algorithms (EAs) under Apache Spark. We propose two hybrid EAs based on Binary Differential Evolution (BDE) and Binary Threshold Accepting (BTA), namely, (i) Parallel Binary Differential Evolution and Threshold Accepting (PB-DETA), where BDE and BTA work in tandem in every iteration, and (ii) its ablation variant, Parallel Binary Threshold Accepting and Differential Evolution (PB-TADE). Here, BTA is invoked to enhance the search capability of BDE and to avoid its premature convergence. For comparison purposes, we also parallelized two state-of-the-art algorithms, adaptive DE (ADE) and permutation-based DE (DE-FS), and named them PB-ADE and P-DE-FS, respectively. Throughout, logistic regression (LR) is employed to compute the fitness function, namely, the area under the receiver operating characteristic curve (AUC). The effectiveness of the proposed algorithms is tested on five big datasets of varying dimensions. It is noteworthy that PB-TADE turned out to be statistically significantly better than the rest. All the algorithms exhibited repeatability. The proposed parallel model attained a speedup of 2.2-2.9. We also report the feature subsets with high AUC and the least cardinality.
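The abstract does not spell out the BDE and BTA operators, so the following is only a minimal sequential sketch of the general idea: a binary DE step followed by a threshold-accepting refinement in every iteration. The mutation rule, the parameter values, the function names (`bde_trial`, `bta_refine`, `hybrid_search`) and the `toy_fitness` stand-in are all assumptions made for illustration. In the paper the fitness is the AUC of a logistic regression model and the evaluation is distributed under Apache Spark, neither of which is reproduced here.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]  # 1 = feature kept, 0 = feature dropped
Fitness = Callable[[Mask], float]


def bde_trial(pop: List[Mask], i: int, f: float, cr: float,
              rng: random.Random) -> Mask:
    """Build one trial mask for individual i (illustrative binary DE rule)."""
    a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
    n = len(pop[i])
    forced = rng.randrange(n)  # at least one bit is taken from the mutant
    trial = []
    for d in range(n):
        # Assumed mutation: where b and c disagree, flip a's bit with prob. f.
        mutant = 1 - a[d] if (b[d] != c[d] and rng.random() < f) else a[d]
        # Binomial crossover between the mutant and the current individual.
        trial.append(mutant if (d == forced or rng.random() < cr) else pop[i][d])
    return trial


def bta_refine(mask: Mask, score: float, fitness: Fitness, threshold: float,
               steps: int, rng: random.Random) -> Tuple[Mask, float]:
    """Threshold accepting: take a one-bit flip unless it worsens the
    current score by more than the (shrinking) threshold."""
    current, cur_score = mask[:], score
    best, best_score = mask[:], score
    for _ in range(steps):
        cand = current[:]
        d = rng.randrange(len(cand))
        cand[d] = 1 - cand[d]
        cand_score = fitness(cand)
        if cand_score >= cur_score - threshold:
            current, cur_score = cand, cand_score
            if cur_score > best_score:
                best, best_score = current[:], cur_score
        threshold *= 0.9  # assumed geometric threshold schedule
    return best, best_score


def hybrid_search(fitness: Fitness, n_features: int, pop_size: int = 10,
                  iters: int = 20, seed: int = 0) -> Tuple[Mask, float]:
    """Run the DE step and the TA refinement in tandem in every iteration."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    scores = [fitness(p) for p in pop]
    for _ in range(iters):
        for i in range(pop_size):
            trial = bde_trial(pop, i, 0.8, 0.9, rng)
            t_score = fitness(trial)
            if t_score >= scores[i]:  # greedy DE selection
                pop[i], scores[i] = trial, t_score
            pop[i], scores[i] = bta_refine(pop[i], scores[i], fitness,
                                           0.05, 5, rng)
    k = max(range(pop_size), key=scores.__getitem__)
    return pop[k], scores[k]


def toy_fitness(mask: Mask) -> float:
    """Hypothetical stand-in for LR-based AUC: rewards three 'relevant'
    features and penalises subset size."""
    relevant = (0, 3, 5)
    return sum(mask[d] for d in relevant) - 0.1 * sum(mask)


if __name__ == "__main__":
    best_mask, best_score = hybrid_search(toy_fitness, n_features=8, seed=1)
    print(best_mask, best_score)
```

In a wrapper setting, `toy_fitness` would be replaced by a function that trains a classifier on the selected columns and returns its validation AUC. That evaluation is the expensive part, which is why the paper parallelizes it.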

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9463682
DOI: http://dx.doi.org/10.1007/s10586-022-03725-w

Publication Analysis

Top Keywords

feature subset: 12
differential evolution: 12
threshold accepting: 12
subset selection: 8
hybrid evolutionary: 8
apache spark: 8
big datasets: 8
eas based: 8
binary differential: 8
binary threshold: 8

Similar Publications

Visual diagnosis is one of the key features of squamous cell carcinoma of the oral cavity (OSCC) and oropharynx (OPSCC), both subsets of head and neck squamous cell carcinoma (HNSCC) with a heterogeneous clinical appearance. Advances in artificial intelligence have recently led to image recognition being introduced into large language models (LLMs) such as ChatGPT 4.0.

Type 1 diabetes (T1D) is a chronic autoimmune disease characterized by the loss of insulin-producing cells in the pancreatic islets. Patients with T1D have autoreactive CD4 and CD8 T cells that show specific features, indicating previous exposure to self-antigens. Although memory T cells are vital components of the adaptive immune system, providing enduring protection against pathogens, individuals with T1D have a higher proportion of memory T cells than healthy individuals with naïve phenotypes.

AiGPro: a multi-tasks model for profiling of GPCRs for agonist and antagonist.

J Cheminform

January 2025

School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea.

G protein-coupled receptors (GPCRs) play vital roles in various physiological processes, making them attractive drug discovery targets. Meanwhile, deep learning techniques have revolutionized drug discovery by facilitating efficient tools for expediting the identification and optimization of ligands. However, existing models for the GPCRs often focus on single-target or a small subset of GPCRs or employ binary classification, constraining their applicability for high throughput virtual screening.

Background: Hepatitis B is a liver infection caused by HBV. Infected individuals who fail to control the viral infection develop chronic hepatitis B and are at risk of developing life-threatening liver diseases, such as cirrhosis or liver cancer. Dendritic cells (DCs) play important roles in the immune response against HBV but are functionally impaired in patients with chronic hepatitis B.

Background: High-grade serous ovarian cancer (HGSOC) remains one of the most challenging gynecological malignancies, with over 70% of ovarian cancer patients ultimately experiencing disease progression. The current prognostic tools for progression-free survival (PFS) in HGSOC patients have limitations. This study aims to develop an explainable machine learning (ML) model for predicting PFS in HGSOC patients.
