Single-cell analysis across multiple samples and conditions requires quantitative modeling of the interplay between the continuum of cell states and the technical and biological sources of sample-to-sample variability. We introduce GEDI, a generative model that identifies latent space variations in multi-sample, multi-condition single-cell datasets and attributes them to sample-level covariates. GEDI enables cross-sample cell state mapping on par with state-of-the-art integration methods, cluster-free differential gene expression analysis along the continuum of cell states, and machine learning-based prediction of sample characteristics from single-cell data. GEDI can also incorporate gene-level prior knowledge to infer pathway and regulatory network activities in single cells. Finally, GEDI extends all these concepts to previously unexplored modalities that require joint consideration of dual measurements, such as the joint analysis of exon inclusion/exclusion reads to model alternative cassette exon splicing, or spliced/unspliced reads to model the mRNA stability landscapes of single cells.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11298001 | PMC |
http://dx.doi.org/10.1038/s41467-024-50963-0 | DOI Listing |
BMC Bioinformatics
October 2024
Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
In the past two decades, genomics has advanced significantly, with single-cell RNA-sequencing (scRNA-seq) marking a pivotal milestone. ScRNA-seq provides unparalleled insights into cellular diversity and has spurred diverse studies across multiple conditions and samples, resulting in an influx of complex multidimensional genomics data. This highlights the need for robust methodologies capable of handling the complexity and multidimensionality of such genomics data.
Nat Commun
August 2024
Department of Human Genetics, McGill University, Montreal, QC, H3A 0C7, Canada.
Single-cell analysis across multiple samples and conditions requires quantitative modeling of the interplay between the continuum of cell states and the technical and biological sources of sample-to-sample variability. We introduce GEDI, a generative model that identifies latent space variations in multi-sample, multi-condition single-cell datasets and attributes them to sample-level covariates. GEDI enables cross-sample cell state mapping on par with state-of-the-art integration methods, cluster-free differential gene expression analysis along the continuum of cell states, and machine learning-based prediction of sample characteristics from single-cell data.
bioRxiv
December 2023
Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, Georgia 30322, United States of America.
Research on cell-cell communication (CCC) is crucial for understanding biology and diseases. Many existing CCC inference tools neglect potential confounders, such as batch and demographic variables, when analyzing multi-sample, multi-condition scRNA-seq datasets. To address this significant gap, we introduce STACCato, a Supervised Tensor Analysis tool for studying Cell-cell Communication, that identifies CCC events and estimates the effects of biological conditions (e.
Nat Commun
July 2023
Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia.
The recent emergence of multi-sample multi-condition single-cell multi-cohort studies allows researchers to investigate different cell states. The effective integration of multiple large-cohort studies promises biological insights into cells under different conditions that individual studies cannot provide. Here, we present scMerge2, a scalable algorithm that allows data integration of atlas-scale multi-sample multi-condition single-cell studies.
BMC Genomics
May 2023
Research Data Sciences, Translational Biology, Biogen Inc., Cambridge, MA, 02142, USA.
Background: Single-cell RNA sequencing is a state-of-the-art technology to understand gene expression in complex tissues. With the growing amount of data being generated, the standardization and automation of data analysis are critical to generating hypotheses and discovering biological insights.
Results: Here, we present scRNASequest, a semi-automated single-cell RNA-seq (scRNA-seq) data analysis workflow which allows (1) preprocessing from raw UMI count data, (2) harmonization by one or multiple methods, (3) reference-dataset-based cell type label transfer and embedding projection, (4) multi-sample, multi-condition single-cell level differential gene expression analysis, and (5) seamless integration with cellxgene VIP for visualization and with CellDepot for data hosting and sharing by generating compatible h5ad files.