Single-cell ATAC-seq (scATAC-seq) enables the mapping of regulatory elements in fine-grained cell types. Despite this advance, analysis of the resulting data is challenging, and large-scale scATAC-seq data are difficult and expensive to generate. This motivates a method that leverages information from previously generated large-scale scATAC-seq or scRNA-seq data to guide the analysis of new scATAC-seq datasets. We analyze scATAC-seq data using latent Dirichlet allocation (LDA), a Bayesian algorithm that was developed to model text corpora and that summarizes documents as mixtures of topics, where topics are defined by the words that distinguish the documents. When applied to scATAC-seq, LDA treats cells as documents and their accessible sites as words, identifying "topics" based on the cell type-specific accessible sites in those cells. Previous work used uniform symmetric priors in LDA, but we hypothesized that nonuniform matrix priors generated from LDA models trained on existing datasets may improve the detection of cell types in new datasets, especially those with relatively few cells. In this work, we test this hypothesis in scATAC-seq data from whole C. elegans nematodes and SHARE-seq data from mouse skin cells. We show that nonsymmetric matrix priors for LDA improve our ability to capture cell type information from small scATAC-seq datasets.
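To make the idea of a matrix prior concrete, the following is a minimal sketch and not the authors' implementation. It assumes a collapsed Gibbs sampler (the abstract does not say which inference method was used), and the names `prior_from_reference`, `gibbs_lda`, `strength`, and `floor`, along with the toy data, are hypothetical. Cells are lists of accessible-site indices, and the topic-by-site prior `eta` is a full matrix rather than a single symmetric constant.

```python
import random


def prior_from_reference(ref_topic_site, strength=50.0, floor=0.01):
    """Turn a reference model's topic-by-site weights into a matrix prior.

    Each row is normalised, scaled by `strength` (a hypothetical pseudo-count
    budget), and floored so that every entry stays strictly positive.
    """
    eta = []
    for row in ref_topic_site:
        total = float(sum(row))
        eta.append([floor + strength * v / total for v in row])
    return eta


def gibbs_lda(cells, n_topics, n_sites, alpha, eta, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA with a topic-by-site prior matrix.

    cells: list of lists of accessible-site indices (one list per cell).
    eta:   n_topics x n_sites matrix; a symmetric prior is the special case
           in which every entry is the same constant.
    """
    rng = random.Random(seed)
    cell_topic = [[0] * n_topics for _ in cells]
    topic_site = [[0] * n_sites for _ in range(n_topics)]
    topic_total = [0] * n_topics
    eta_total = [sum(row) for row in eta]

    # Random initial topic assignment for every (cell, site) token.
    assign = []
    for c, sites in enumerate(cells):
        z_c = []
        for s in sites:
            k = rng.randrange(n_topics)
            z_c.append(k)
            cell_topic[c][k] += 1
            topic_site[k][s] += 1
            topic_total[k] += 1
        assign.append(z_c)

    for _ in range(n_iter):
        for c, sites in enumerate(cells):
            for i, s in enumerate(sites):
                # Remove the token's current assignment from the counts.
                k = assign[c][i]
                cell_topic[c][k] -= 1
                topic_site[k][s] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic; eta[t][s] is where the
                # matrix prior enters instead of a single scalar.
                weights = [
                    (cell_topic[c][t] + alpha)
                    * (topic_site[t][s] + eta[t][s])
                    / (topic_total[t] + eta_total[t])
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[c][i] = k
                cell_topic[c][k] += 1
                topic_site[k][s] += 1
                topic_total[k] += 1
    return cell_topic, topic_site


# Toy example: a made-up "reference" model with two topics over six sites.
ref = [[9, 9, 9, 1, 1, 1], [1, 1, 1, 9, 9, 9]]
eta = prior_from_reference(ref)
cells = [[0, 1, 2, 0], [1, 2, 0], [3, 4, 5, 3], [4, 5, 3]]
cell_topic, topic_site = gibbs_lda(cells, 2, 6, alpha=0.1, eta=eta)
```

Setting every entry of `eta` to the same value recovers the uniform symmetric prior, so the two settings can be compared on the same small dataset by changing only that argument.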
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10191269 | PMC |
http://dx.doi.org/10.1371/journal.pcbi.1011049 | DOI Listing |
Sci Rep
January 2025
MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 9DS, UK.
Bulk ATAC-seq assays have been used to map and profile the chromatin accessibility of regulatory elements such as enhancers, promoters, and insulators. This has provided great insight into the regulation of gene expression in many cell types in a variety of organisms. To date, ATAC-seq has most often been used to provide an average evaluation of chromatin accessibility in populations of cells.
Sci Data
January 2025
BGI Research, Shenzhen, 518083, China.
The mammalian nervous system controls complex functions through highly specialized and interacting structures. Single-cell sequencing can provide information on cell-type-specific chromatin structure and regulatory elements, revealing differences in chromatin organization between cell types and the potential roles of these differences in brain function. Here, we generated a chromatin accessibility dataset through single-cell ATAC-seq of 174,593 high-quality nuclei from 16 adult rat brain regions.
Nat Commun
January 2025
University of Chicago, Department of Medicine, Chicago, IL, USA.
Total proctocolectomy with ileal pouch anal anastomosis is the standard of care for patients with severe ulcerative colitis. We generated a cell-type-resolved transcriptional and epigenetic atlas of ileal pouches using scRNA-seq and scATAC-seq data from paired biopsy samples of the ileal pouch and the ileal segment above the pouch (pre-pouch) from patients (male=4, female=2), and paired biopsies of the terminal ileum and ascending colon from healthy individuals (male=3, female=3) serving as reference. Our study finds an additional population of absorptive and secretory epithelial cells within the pouch but not the pre-pouch.
PLoS Comput Biol
January 2025
School of Mathematics/Harbin Institute of Technology, Harbin, China.
The rapid advance of large-scale, atlas-level single-cell RNA sequencing and single-cell chromatin accessibility data provides extraordinary avenues to broad and deep insight into complex biological mechanisms. Leveraging these datasets and transferring labels from scRNA-seq to scATAC-seq will empower the exploration of single-cell omics data. However, current label transfer methods have limited performance, largely due to their limited ability to preserve fine-grained cell populations and to intrinsic or extrinsic heterogeneity between datasets.
Nucleic Acids Res
January 2025
State Key Laboratory of Cellular Stress Biology, Xiang'an Hospital, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, No. 4221, Xiang'an South Road, Xiamen, Fujian 361102, China.
Enhancer clusters, pivotal in mammalian development and diseases, can organize as enhancer networks to control cell identity and disease genes; however, the underlying mechanism remains largely unexplored. Here, we introduce eNet 2.0, a comprehensive tool for enhancer network analysis during development and diseases based on single-cell chromatin accessibility data.