A novel feature selection algorithm for identifying hub genes in lung cancer.

Sci Rep

Department of Information Systems, College of Computer and Information Sciences, Jouf University, 72388, Sakaka, Saudi Arabia.

Published: December 2023

AI Article Synopsis

  • Lung cancer significantly impacts global mortality, necessitating precise biomarker identification for effective diagnosis and treatment.
  • The study presents the Voting-Based Enhanced Binary Ebola Optimization Search Algorithm (VBEOSA), which integrates binary and Ebola optimization techniques to enhance feature selection in lung cancer research.
  • Through the analysis of gene expression datasets, the research identifies ten key hub genes and highlights important biological pathways, contributing to a deeper understanding of lung cancer’s molecular mechanisms and potential improvements in diagnostic methods.

Article Abstract

Lung cancer, a life-threatening disease primarily affecting lung tissue, remains a significant contributor to mortality in both developed and developing nations. Accurate biomarker identification is imperative for effective cancer diagnosis and therapeutic strategies. This study introduces the Voting-Based Enhanced Binary Ebola Optimization Search Algorithm (VBEOSA), an innovative ensemble-based approach combining binary optimization and the Ebola optimization search algorithm. VBEOSA harnesses the collective power of state-of-the-art classification models through soft voting. Moreover, our research applies VBEOSA to an extensive lung cancer gene expression dataset obtained from TCGA, following essential preprocessing steps including outlier detection and removal, data normalization, and filtration. VBEOSA aids in feature selection, leading to the discovery of key hub genes closely associated with lung cancer, validated through comprehensive protein-protein interaction analysis. Notably, our investigation reveals ten significant hub genes (ADRB2, ACTB, ARRB2, GNGT2, ADRB1, ACTG1, ACACA, ATP5A1, ADCY9, and ADRA1B), each demonstrating substantial involvement in the domain of lung cancer. Furthermore, our pathway analysis sheds light on the prominence of strategic pathways such as salivary secretion and the calcium signaling pathway, providing invaluable insights into the intricate molecular mechanisms underpinning lung cancer. We also utilize the weighted gene co-expression network analysis (WGCNA) method to identify gene modules exhibiting strong correlations with clinical attributes associated with lung cancer. Our findings underscore the efficacy of VBEOSA in feature selection and offer profound insights into the multifaceted molecular landscape of lung cancer. Finally, we are confident that this research has the potential to improve diagnostic capabilities and further enrich our understanding of the disease, thus setting the stage for future advancements in the clinical management of lung cancer. The VBEOSA source code is publicly available: https://github.com/TEHNAN/VBEOSA-A-Novel-Feature-Selection-Algorithm-for-Identifying-hub-Genes-in-Lung-Cancer
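
The abstract names two ingredients without detailing them: a binary encoding of the selected features and soft voting across classifiers. The sketch below illustrates only those two generic ideas, in plain Python. It is not the authors' implementation: the function names `apply_mask` and `soft_vote`, the sample values, and the per-model probabilities are all hypothetical, and the Ebola optimization search step that would propose the masks is omitted entirely.

```python
from typing import List, Sequence


def apply_mask(sample: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Keep only the features whose mask bit is 1 (binary feature selection)."""
    return [value for value, bit in zip(sample, mask) if bit == 1]


def soft_vote(probabilities: Sequence[Sequence[float]]) -> int:
    """Average per-class probabilities across models; return the top class index."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [
        sum(model[c] for model in probabilities) / n_models
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: mean[c])


# Hypothetical expression values for four genes and a candidate binary mask.
sample = [2.1, 0.3, 5.7, 1.2]
mask = [1, 0, 1, 0]
selected = apply_mask(sample, mask)  # keeps the first and third features

# Hypothetical class probabilities from three classifiers for one sample.
# Two of the three models lean towards class 1, but the first model is
# far more confident in class 0, so the averaged probabilities favour class 0.
model_probs = [[0.9, 0.1], [0.4, 0.6], [0.4, 0.6]]
predicted = soft_vote(model_probs)
print(selected, predicted)
```

In this made-up example, a majority (hard) vote would pick class 1, while soft voting picks class 0 because it weighs each model's confidence. In a wrapper-style feature selector, a score derived from such ensemble predictions could serve as the fitness of each candidate mask; whether VBEOSA scores masks this way is an assumption here, not something the abstract states.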


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10709567
DOI: http://dx.doi.org/10.1038/s41598-023-48953-1

Publication Analysis

Top Keywords (keyword: count)

lung cancer: 36
feature selection: 12
lung: 10
cancer: 10
hub genes: 8
ebola optimization: 8
optimization search: 8
search algorithm: 8
algorithm vbeosa: 8
associated lung: 8

