scLEGA: an attention-based deep clustering method with a tendency for low expression of genes on single-cell RNA-seq data.

Brief Bioinform

College of Computer and Control Engineering, Northeast Forestry University 150040, 26 Hexing Road, Xiangfang District, Harbin, China.

Published: July 2024

AI Article Synopsis

  • Single-cell RNA sequencing (scRNA-seq) allows researchers to investigate the diversity of cell types in tissues, but most existing methods cluster on highly variable genes (HVGs) with higher expression levels and overlook HVGs with lower expression levels.
  • To improve cell type inference, a new method called scLEGA has been developed. It uses a zero-inflated negative binomial (ZINB) loss function designed to account for low-expression genes and combines two clustering strategies through a multi-head attention mechanism.
  • scLEGA has shown better clustering accuracy, scalability, and stability than 12 leading methods on 15 scRNA-seq datasets, making it a promising tool for analyzing scRNA-seq data.

Article Abstract

Single-cell RNA sequencing (scRNA-seq) enables the exploration of biological heterogeneity among different cell types within tissues at single-cell resolution. Inferring cell types within tissues is foundational for downstream research. Most existing methods for cell type inference based on scRNA-seq data primarily use highly variable genes (HVGs) with higher expression levels as clustering features, overlooking the contribution of HVGs with lower expression levels. To address this, we have designed a novel cell type inference method for scRNA-seq data, termed scLEGA. scLEGA employs a novel zero-inflated negative binomial (ZINB) loss function that fully considers the contribution of genes with lower expression levels, and it combines two distinct scRNA-seq clustering strategies through a multi-head attention mechanism. The first is a low-expression optimized denoising autoencoder, based on the novel ZINB model, that extracts low-dimensional features and handles dropout events. The second is a GCN-based graph autoencoder (GAE) that leverages neighbor information to guide dimensionality reduction. The iterative fusion of the denoising and topological embeddings in scLEGA yields cluster-friendly cell representations in the hidden embedding, where similar cells are brought closer together. Compared with 12 state-of-the-art cell type inference methods on 15 scRNA-seq datasets, scLEGA demonstrates superior performance in clustering accuracy, scalability, and stability. The scLEGA model code is freely available at https://github.com/Masonze/scLEGA-main.
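For readers unfamiliar with the ZINB model mentioned in the abstract, the following is a minimal sketch of the standard (textbook) ZINB negative log-likelihood for a single count. It is not the modified low-expression loss that scLEGA proposes, whose details are not given here. The function names `nb_log_pmf` and `zinb_nll`, and the parameter names `mu` (mean), `theta` (dispersion), and `pi` (dropout probability), are illustrative assumptions.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    parameterized by mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a standard ZINB model.

    pi in [0, 1) is the probability of an extra (dropout) zero.
    Assumption: this is the textbook ZINB form, not scLEGA's variant.
    """
    if x == 0:
        # A zero can come from dropout or from the NB component itself.
        log_p = math.log(pi + (1.0 - pi) * math.exp(nb_log_pmf(0, mu, theta)))
    else:
        # A non-zero count can only come from the NB component.
        log_p = math.log(1.0 - pi) + nb_log_pmf(x, mu, theta)
    return -log_p


# Example: loss for a zero count and a small non-zero count.
loss_zero = zinb_nll(0, mu=2.0, theta=1.5, pi=0.3)
loss_three = zinb_nll(3, mu=2.0, theta=1.5, pi=0.3)
```

In an autoencoder setting, `mu`, `theta`, and `pi` would typically be predicted per gene and per cell by the decoder, and the loss summed over all entries of the count matrix.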

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11281828
DOI: http://dx.doi.org/10.1093/bib/bbae371
