Motivation: Single-cell RNA sequencing (scRNA-seq) data are important for studying the laws of life at the single-cell level. However, obtaining enough high-quality scRNA-seq data remains challenging. To mitigate this limited availability, generative models have been proposed to computationally generate synthetic scRNA-seq data. Nevertheless, the data generated by current models are not yet sufficiently realistic, especially when generation must be controlled by specified conditions. Meanwhile, diffusion models have shown their power in generating high-fidelity data, providing a new opportunity for scRNA-seq generation.
Results: In this study, we developed scDiffusion, a generative model that combines a diffusion model with a foundation model to generate high-quality scRNA-seq data under controlled conditions. We designed multiple classifiers to guide the diffusion process simultaneously, enabling scDiffusion to generate data under combinations of multiple conditions. We also proposed a new control strategy called Gradient Interpolation, which allows the model to generate continuous trajectories of cell development from a given cell state. Experiments showed that scDiffusion could generate single-cell gene expression data closely resembling real scRNA-seq data. scDiffusion could also conditionally generate data for specific cell types, including rare ones. Furthermore, using the multiple-condition generation of scDiffusion, we could generate cell types that were absent from the training data. Leveraging the Gradient Interpolation strategy, we generated a continuous developmental trajectory of mouse embryonic cells. These experiments demonstrate that scDiffusion is a powerful tool for augmenting real scRNA-seq data and can provide insights into cell fate research.
Availability and implementation: scDiffusion is openly available at the GitHub repository https://github.com/EperLuo/scDiffusion or on Zenodo at https://zenodo.org/doi/10.5281/zenodo.13268742.
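The two control strategies named in the Results can be illustrated with a minimal sketch. It is not taken from the scDiffusion repository: it assumes the standard classifier-guidance rule, in which the denoising mean is shifted by the scaled gradient of each classifier's log-probability, and it assumes that Gradient Interpolation can be read as a linear blend of two such gradients. The function names `combine_guidance` and `interpolate_gradients`, the per-classifier scales, and the blend weight `lam` are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def combine_guidance(
    mean: Sequence[float],
    variance: float,
    grads: Sequence[Sequence[float]],
    scales: Sequence[float],
) -> Vector:
    """Shift a denoising mean by a weighted sum of classifier gradients.

    Assumption: each entry of `grads` is the gradient of one classifier's
    log-probability for its target condition, evaluated at the current
    noisy sample. Summing them steers the sample toward all conditions.
    """
    if len(grads) != len(scales):
        raise ValueError("need exactly one scale per classifier gradient")
    shifted = list(mean)
    for grad, scale in zip(grads, scales):
        for i, g in enumerate(grad):
            shifted[i] += scale * variance * g
    return shifted


def interpolate_gradients(
    grad_a: Sequence[float], grad_b: Sequence[float], lam: float
) -> Vector:
    """Blend the gradients of two conditions (illustrative reading only).

    lam = 0 reproduces the gradient of condition A, lam = 1 that of
    condition B; intermediate values guide toward intermediate states.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [(1.0 - lam) * a + lam * b for a, b in zip(grad_a, grad_b)]


if __name__ == "__main__":
    # Two toy classifiers (e.g. cell type and organ) guiding a 2-gene sample.
    mean = [0.0, 1.0]
    grads = [[1.0, 0.0], [0.0, 2.0]]
    print(combine_guidance(mean, 0.5, grads, scales=[2.0, 1.0]))
    # Sweep lam to move guidance gradually from condition A to condition B.
    for lam in (0.0, 0.25, 1.0):
        print(lam, interpolate_gradients([1.0, 0.0], [0.0, 1.0], lam))
```

In a real sampler these functions would be called once per reverse-diffusion step with gradients obtained by automatic differentiation; sweeping `lam` across separate sampling runs is one plausible way to obtain a sequence of intermediate states.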
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368386 | PMC |
http://dx.doi.org/10.1093/bioinformatics/btae518 | DOI Listing |
Nature
January 2025
Plant Biology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA.
Plants lack specialized and mobile immune cells. Consequently, any cell type that encounters pathogens must mount immune responses and communicate with surrounding cells for successful defence. However, the diversity, spatial organization and function of cellular immune states in pathogen-infected plants are poorly understood.
Nucleic Acids Res
January 2025
BioEngineering Program, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia.
Cross-species single-cell RNA-seq data hold immense potential for unraveling cell type evolution and transferring knowledge between well-explored and less-studied species. However, challenges arise from interspecific genetic variation, batch effects stemming from experimental discrepancies and inherent individual biological differences. Here, we benchmarked nine data-integration methods across 20 species, encompassing 4.
Front Genet
December 2024
Department of Nephrology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China.
Background: IgA nephropathy (IgAN) is a leading cause of renal failure, but its pathogenesis remains unclear, complicating diagnosis and treatment. The invasive nature of renal biopsy highlights the need for non-invasive diagnostic biomarkers. Bulk RNA sequencing (RNA-seq) of urine offers a promising approach for identifying molecular changes relevant to IgAN.
Gigascience
January 2025
School of Life, Health & Chemical Sciences, The Open University, Milton Keynes, Buckinghamshire, MK7 6AA, UK.
Background: Bioinformatics is fundamental to biomedical sciences, but its mastery presents a steep learning curve for bench biologists and clinicians. Learning to code while analyzing data is difficult. The curve may be flattened by separating these two aspects and providing intermediate steps for budding bioinformaticians.
JAMA Psychiatry
January 2025
Max Planck Institute of Psychiatry, Munich, Germany.
Importance: As an accessible part of the central nervous system, the retina provides a unique window to study pathophysiological mechanisms of brain disorders in humans. Imaging and electrophysiological studies have revealed retinal alterations across several neuropsychiatric and neurological disorders, but it remains largely unclear which specific cell types and biological mechanisms are involved.
Objective: To determine whether specific retinal cell types are affected by genomic risk for neuropsychiatric and neurological disorders and to explore the mechanisms through which genomic risk converges in these cell types.