scNCL: transferring labels from scRNA-seq to scATAC-seq data with neighborhood contrastive regularization.

Bioinformatics

School of Computer Science and Engineering, Central South University, Changsha 410083, China.

Published: August 2023

Motivation: scATAC-seq has enabled chromatin accessibility landscape profiling at the single-cell level, providing opportunities for determining cell-type-specific regulation codes. However, the high dimensionality, extreme sparsity, and large scale of scATAC-seq data pose great challenges to cell-type identification. Thus, there has been growing interest in leveraging well-annotated scRNA-seq data to help annotate scATAC-seq data. Nevertheless, substantial computational obstacles remain in transferring information from scRNA-seq to scATAC-seq, especially because the two modalities have heterogeneous features.

Results: We propose a new transfer learning method, scNCL, which utilizes prior knowledge and contrastive learning to tackle the problem of heterogeneous features. Briefly, scNCL transforms scATAC-seq features into a gene activity matrix based on prior knowledge. Since feature transformation can cause information loss, scNCL introduces neighborhood contrastive learning to preserve the neighborhood structure of scATAC-seq cells in the raw feature space. To learn transferable latent features, scNCL uses a feature projection loss and an alignment loss to harmonize embeddings between scRNA-seq and scATAC-seq. Experiments on various datasets demonstrated that scNCL not only achieves accurate and robust label transfer for common types, but also reliably detects novel types. scNCL is also computationally efficient and scalable to million-scale datasets. Moreover, we show that scNCL can help refine cell-type annotations in existing scATAC-seq atlases.
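The neighborhood-preservation idea described above can be illustrated with a small sketch. This is not the scNCL implementation: the abstract does not give the loss formula, so the code below assumes a generic InfoNCE-style objective in which each cell's k nearest neighbors in the raw feature space are treated as positives in the embedding space. The function names (`knn_indices`, `neighborhood_contrastive_loss`), the choice of k, the temperature, and the toy data are all hypothetical.

```python
import numpy as np


def knn_indices(x, k):
    """Return, for each row of x, the indices of its k nearest rows
    (Euclidean distance), excluding the row itself."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.fill_diagonal(d2, np.inf)  # a cell is never its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def neighborhood_contrastive_loss(z, nbrs, temperature=0.1):
    """Assumed InfoNCE-style loss: raw-space neighbors are positives,
    all other cells are negatives. Lower means the embedding z keeps
    raw-space neighbors close together."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)  # drop self-similarity from the softmax
    sim_max = sim.max(axis=1, keepdims=True)
    log_norm = np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))
    log_prob = sim - sim_max - log_norm  # row-wise log-softmax
    pos = np.take_along_axis(log_prob, nbrs, axis=1)
    return -pos.mean()


# Toy data: two well-separated groups of "cells" in a 50-dimensional raw space.
rng = np.random.default_rng(0)
raw = np.vstack([rng.normal(0, 1, (20, 50)), rng.normal(8, 1, (20, 50))])
nbrs = knn_indices(raw, k=5)

# An embedding that keeps the two groups apart, and a row-shuffled copy
# that destroys the correspondence with the raw-space neighborhoods.
good = np.vstack([rng.normal(0, 0.1, (20, 8)) + 1.0,
                  rng.normal(0, 0.1, (20, 8)) - 1.0])
bad = rng.permutation(good)

loss_good = neighborhood_contrastive_loss(good, nbrs)
loss_bad = neighborhood_contrastive_loss(bad, nbrs)
print(loss_good < loss_bad)  # the structure-preserving embedding scores lower
```

In a trained model a term of this kind would be minimized together with the other objectives (for example the feature projection and alignment losses mentioned above), so that the encoder operating on the gene activity matrix still respects the neighborhoods found in the original peak space.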

Availability and Implementation: The source code and data used in this paper are available at https://github.com/CSUBioGroup/scNCL-release.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10457667
DOI: http://dx.doi.org/10.1093/bioinformatics/btad505

Publication Analysis

Top Keywords

scrna-seq scatac-seq: 12
scatac-seq data: 12
scatac-seq: 9
scncl: 8
neighborhood contrastive: 8
prior knowledge: 8
contrastive learning: 8
features scncl: 8
data: 5
scncl transferring: 4

Similar Publications

Total proctocolectomy with ileal pouch anal anastomosis is the standard of care for patients with severe ulcerative colitis. We generated a cell-type-resolved transcriptional and epigenetic atlas of ileal pouches using scRNA-seq and scATAC-seq data from paired biopsy samples of the ileal pouch and the ileal segment above the pouch (pre-pouch) from patients (male=4, female=2), and paired biopsies of the terminal ileum and ascending colon from healthy individuals (male=3, female=3) serving as reference. Our study finds an additional population of absorptive and secretory epithelial cells within the pouch but not the pre-pouch.

The rapid advance of large-scale, atlas-level single-cell RNA sequencing and single-cell chromatin accessibility data provides extraordinary avenues to broad and deep insight into complex biological mechanisms. Leveraging these datasets and transferring labels from scRNA-seq to scATAC-seq will empower the exploration of single-cell omics data. However, current label transfer methods have limited performance, largely due to their limited ability to preserve fine-grained cell populations and to the intrinsic or extrinsic heterogeneity between datasets.

Modular organization of enhancer network provides transcriptional robustness in mammalian development.

Nucleic Acids Res

January 2025

State Key Laboratory of Cellular Stress Biology, Xiang'an Hospital, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China.

Enhancer clusters, pivotal in mammalian development and diseases, can organize as enhancer networks to control cell identity and disease genes; however, the underlying mechanism remains largely unexplored. Here, we introduce eNet 2.0, a comprehensive tool for enhancer network analysis during development and disease based on single-cell chromatin accessibility data.

Single-cell multi-omics techniques, which enable the simultaneous measurement of multiple modalities such as RNA gene expression and Assay for Transposase-Accessible Chromatin (ATAC) within individual cells, have become a powerful tool for deciphering the intricate complexity of cellular systems. Most current methods rely on motif databases to establish cross-modality relationships between genes from RNA-seq data and peaks from ATAC-seq data. However, these approaches are constrained by incomplete database coverage, particularly for novel or poorly characterized relationships.

Integrating Single-Cell RNA-Seq and ATAC-Seq Analysis Reveals Uterine Cell Heterogeneity and Regulatory Networks Linked to Pimpled Eggs in Chickens.

Int J Mol Sci

December 2024

Shandong Provincial Key Laboratory for Livestock Germplasm Innovation & Utilization, College of Animal Science and Technology, Shandong Agricultural University, 61 Daizong Street, Taian 271018, China.

Pimpled eggs have defective shells, which severely impacts hatching rates and transportation safety. In this study, we constructed single-cell resolution transcriptomic and chromatin accessibility maps from uterine tissues of chickens using single-cell RNA sequencing (scRNA-seq) and single-cell ATAC sequencing (scATAC-seq). We identified 11 major cell types and characterized their marker genes, along with specific transcription factors (TFs) that determine cell fate.
