With the emergence of single-cell RNA sequencing (scRNA-seq) technology, scientists can examine gene expression at single-cell resolution. Analyzing scRNA-seq data poses its own challenges, which stem from its high dimensionality. Machine learning offers a way to perform gene (feature) selection on high-dimensional scRNA-seq data. Although multiple machine learning methods, such as penalized regression, appear suitable for feature selection, there has been no rigorous comparison of their performance across data sets, each of which poses its own challenges. Therefore, in this paper, we analyzed and compared multiple penalized regression methods for scRNA-seq data. For the scRNA-seq data sets we analyzed, the results show that sparse group lasso (SGL) outperforms the other six methods (ridge, lasso, elastic net, drop lasso, group lasso, and big lasso) in terms of area under the receiver operating characteristic curve (AUC) and computation time. Building on these findings, we proposed a new algorithm for feature selection using penalized regression methods. The proposed algorithm works by first selecting a small subset of genes and then applying SGL to select the differentially expressed genes in scRNA-seq data. By using hierarchical clustering to group genes, the proposed method bypasses the need for domain-specific knowledge for gene grouping information. In addition, the proposed algorithm provided consistently better AUC for the data sets used.
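The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: group genes by hierarchical clustering, then fit a sparse group lasso on those groups and score the result by AUC. Everything specific here is an assumption made for illustration, not the authors' method: the function names (`group_genes`, `sgl_prox`, `fit_sgl_logistic`, `rank_auc`), the correlation distance with average linkage, the logistic loss, the proximal-gradient solver, and the values of `lam` and `alpha`. The initial step of selecting a small subset of genes is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.special import expit
from scipy.stats import rankdata


def group_genes(X, n_groups):
    """Cluster genes (columns of X) into at most n_groups groups.

    Assumption: correlation distance and average linkage; no gene may be
    constant across cells, or the correlation distance is undefined.
    """
    Z = linkage(X.T, method="average", metric="correlation")
    return fcluster(Z, t=n_groups, criterion="maxclust")  # labels start at 1


def sgl_prox(beta, groups, step, lam, alpha):
    """Proximal operator of the sparse group lasso penalty
    lam * (alpha * ||b||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||b_g||_2)."""
    # Lasso part: element-wise soft-thresholding.
    b = np.sign(beta) * np.maximum(np.abs(beta) - step * lam * alpha, 0.0)
    out = np.zeros_like(b)
    # Group part: shrink each group's norm, zeroing weak groups entirely.
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(b[idx])
        thresh = step * lam * (1.0 - alpha) * np.sqrt(idx.sum())
        if norm > thresh:
            out[idx] = b[idx] * (1.0 - thresh / norm)
    return out


def fit_sgl_logistic(X, y, groups, lam=0.05, alpha=0.5, n_iter=300):
    """Logistic regression with an SGL penalty, fitted by proximal gradient."""
    n, p = X.shape
    # Step size 1/L, where L bounds the curvature of the mean logistic loss
    # (computed with the intercept column included).
    Xa = np.column_stack([X, np.ones(n)])
    step = 4.0 * n / np.linalg.norm(Xa, 2) ** 2
    beta, intercept = np.zeros(p), 0.0
    for _ in range(n_iter):
        resid = expit(X @ beta + intercept) - y
        intercept -= step * resid.mean()  # the intercept is not penalized
        beta = sgl_prox(beta - step * (X.T @ resid) / n, groups, step, lam, alpha)
    return beta, intercept


def rank_auc(y, scores):
    """AUC computed from ranks (Mann-Whitney U statistic)."""
    y = np.asarray(y).astype(bool)
    ranks = rankdata(scores)
    n_pos, n_neg = y.sum(), (~y).sum()
    return (ranks[y].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix (cells x genes).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 40))
    y = (X[:, :3].sum(axis=1) + 0.5 * rng.normal(size=120) > 0).astype(float)
    groups = group_genes(X, n_groups=8)
    beta, intercept = fit_sgl_logistic(X, y, groups)
    print("selected genes:", np.flatnonzero(beta))
    print("training AUC:", rank_auc(y, X @ beta + intercept))
```

Genes with nonzero coefficients are the selected features. In this sketch `alpha` trades off gene-level sparsity (lasso) against group-level sparsity (group lasso), and the AUC shown is computed on the training data only, so a held-out split would be needed for any real comparison.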

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9598401
DOI: http://dx.doi.org/10.3390/biology11101495

Similar Publications

Introduction: Macrophages exhibit marked phenotypic heterogeneity within and across disease states, with lipid metabolic reprogramming contributing to macrophage activation and heterogeneity. Chronic inflammation has been observed in human benign prostatic hyperplasia (BPH) tissues; however, macrophage activation states and their contributions to this hyperplastic disease have not been defined. We postulated that a shift in macrophage phenotypes with increasing prostate size could involve metabolic alterations resulting in prostatic epithelial or stromal hyperplasia.

Attention-deficit/hyperactivity disorder (ADHD) is a highly heritable neurodevelopmental disorder, but its genetic architecture remains incompletely characterized. Rare coding variants, which can profoundly impact gene function, represent an underexplored dimension of ADHD risk. In this study, we analyzed large-scale DNA sequencing datasets from ancestrally diverse cohorts and observed significant enrichment of rare protein-truncating and deleterious missense variants in highly evolutionarily constrained genes.

Deciphering cell states and the cellular ecosystem to improve risk stratification in acute myeloid leukemia.

Brief Bioinform

November 2024

State Key Laboratory of Cellular Stress Biology, Xiang'an Hospital, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China.

Acute myeloid leukemia (AML) demonstrates significant cellular heterogeneity in both leukemic and immune cells, providing valuable insights into clinical outcomes. Here, we constructed an AML single-cell transcriptome atlas and proposed the sciNMF workflow to systematically dissect the underlying cellular heterogeneity. Notably, sciNMF identified 26 leukemic and immune cell states that were linked to clinical variables, mutations, and prognosis.

Deciphering the senescence-based tumoral heterogeneity and characteristics in pancreatic cancer: Results from parallel bulk and single-cell transcriptome data.

IUBMB Life

January 2025

Department of Hepatopancreatobiliary Surgery, Ningbo Medical Center Lihuili Hospital (The Affiliated Lihuili Hospital, Ningbo University), Ningbo, Zhejiang, People's Republic of China.

The prevalent intra- and intertumoral heterogeneity results in undesirable prognosis and therapy failure of pancreatic cancer, potentially resulting from cellular senescence. Herein, integrated analysis of bulk and single-cell RNA-seq profiling was conducted to characterize senescence-based heterogeneity in pancreatic cancer. Publicly available bulk and single-cell RNA sequencing data from pancreatic cancer patients were gathered from the TCGA-PAAD, PACA-AU, PACA-CA, and GSE154778 datasets.

Background: Ovarian cancers (OC) and cervical cancers (CC) have poor survival rates. Tumor-infiltrating lymphocytes (TILs) play a pivotal role in prognosis, but shared immune mechanisms remain elusive.

Methods: We integrated single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST) to explore immune regulation in OC and CC, focusing on the PI3K/AKT pathway and FLT3 as key modulators.
