scAnno: a deconvolution strategy-based automatic cell type annotation tool for single-cell RNA-sequencing data sets.

Brief Bioinform

State Key Laboratory of Digital Medical Engineering, School of Biological Science & Medical Engineering, Southeast University, Nanjing, 210096, China.

Published: May 2023

AI Article Synopsis

  • Single-cell RNA sequencing (scRNA-seq) has greatly advanced research by revealing details about diverse and rare cell populations, making accurate data assessment essential for cell type annotation.
  • The scAnno tool was developed for automated annotation of scRNA-seq data sets, using a joint deconvolution strategy and logistic regression, with reference profiles constructed for human and mouse cell types.
  • scAnno identified cell type-specific genes with high accuracy and outperformed other annotation tools (SingleR, scPred, CHETAH and scmap-cluster), making it a promising tool for scRNA-seq analysis.

Article Abstract

Single-cell RNA sequencing (scRNA-seq) has changed the research landscape by providing insights into heterogeneous, complex and rare cell populations. As more such data sets become available, accurate cell type annotation with compatible and robust models is a prerequisite for their analysis. We therefore developed scAnno (scRNA-seq data annotation), an automated annotation tool for scRNA-seq data sets that operates primarily at the level of single-cell clusters and uses a joint deconvolution strategy and logistic regression. To support this methodology, we constructed a reference profile for human (30 cell types and 50 human tissues) and a reference profile for mouse (26 cell types and 50 mouse tissues). scAnno obtains cell type-specific genes (marker genes), that is, genes with high expression and specificity in a given cell type, by combining co-expression genes with seed genes as a core. Importantly, scAnno can accurately identify cell type-specific genes from cell type reference expression profiles without any prior information. In the peripheral blood mononuclear cell data set, the marker genes identified by scAnno showed cell type-specific expression, and the majority matched exactly with those included in the CellMarker database. Besides validating the flexibility and interpretability of scAnno in identifying marker genes, we also demonstrated its superiority in cell type annotation over other tools (SingleR, scPred, CHETAH and scmap-cluster) through internal validation of data sets (average annotation accuracy: 99.05%) and on cross-platform data sets (average annotation accuracy: 95.56%). Taken together, we established the first methodology that uses a deconvolution strategy for automated cell typing, which has potential for broad application in scRNA-seq analysis.
scAnno is available at https://github.com/liuhong-jia/scAnno.
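
The idea of annotating a cluster by deconvolution can be illustrated with a minimal sketch. It is not the scAnno implementation (scAnno is distributed at the repository above and also uses logistic regression, which is omitted here). The sketch assumes that a cluster's mean expression vector is modelled as a non-negative mixture of reference cell type profiles, solves that mixture by non-negative least squares (a swapped-in, generic deconvolution technique), and labels the cluster with the dominant component. The function names, the gene panel and all expression values are hypothetical toy choices.

```python
from typing import Dict, List, Tuple


def nnls_projected_gradient(A: List[List[float]], b: List[float],
                            iters: int = 2000) -> List[float]:
    """Minimise 0.5 * ||A x - b||^2 subject to x >= 0 by projected gradient descent."""
    n_rows, n_cols = len(A), len(A[0])
    # trace(A^T A) bounds the largest eigenvalue of A^T A, so 1/trace is a safe step size.
    trace = sum(A[i][j] ** 2 for i in range(n_rows) for j in range(n_cols))
    step = 1.0 / trace
    x = [0.0] * n_cols
    for _ in range(iters):
        residual = [sum(A[i][j] * x[j] for j in range(n_cols)) - b[i]
                    for i in range(n_rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n_cols)]
    return x


def annotate_cluster(cluster_mean: List[float],
                     reference: Dict[str, List[float]]) -> Tuple[str, Dict[str, float]]:
    """Return the dominant reference cell type and the normalised mixture fractions."""
    cell_types = list(reference)
    n_genes = len(cluster_mean)
    # Design matrix: rows are genes, columns are reference cell types.
    A = [[reference[ct][g] for ct in cell_types] for g in range(n_genes)]
    weights = nnls_projected_gradient(A, cluster_mean)
    total = sum(weights)
    if total == 0.0:
        return "unassigned", {ct: 0.0 for ct in cell_types}
    fractions = {ct: w / total for ct, w in zip(cell_types, weights)}
    return max(fractions, key=fractions.get), fractions


# Hypothetical toy reference over the gene panel [CD3E, CD19, CD14, NKG7].
TOY_REFERENCE = {
    "T cell":   [9.0, 0.0, 0.0, 1.0],
    "B cell":   [0.0, 8.0, 0.0, 0.0],
    "Monocyte": [0.0, 0.0, 10.0, 0.5],
}
# Hypothetical mean expression of one cluster over the same gene panel.
TOY_CLUSTER_MEAN = [8.5, 0.3, 0.4, 1.0]

if __name__ == "__main__":
    label, fractions = annotate_cluster(TOY_CLUSTER_MEAN, TOY_REFERENCE)
    print(label, {ct: round(f, 3) for ct, f in fractions.items()})
```

Working on cluster means rather than individual cells keeps the system small (one solve per cluster) and reduces the effect of dropout noise, which is one plausible reason to annotate at the cluster level as the abstract describes.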

Source
http://dx.doi.org/10.1093/bib/bbad179
