scBoolSeq: Linking scRNA-seq statistics and Boolean dynamics.

PLoS Comput Biol

Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France.

Published: July 2024

Boolean networks are widely used to model the qualitative dynamics of cell fate processes by describing how the binary activation states of genes and transcription factors change over time. Being able to bridge such qualitative states with quantitative measurements of gene expression in cells, such as scRNA-seq, is a cornerstone of data-driven model construction and validation. On the one hand, scRNA-seq binarisation is a key step for inferring and validating Boolean models. On the other hand, the generation of synthetic scRNA-seq data from baseline Boolean models provides an important asset for benchmarking inference methods. However, linking characteristics of scRNA-seq datasets, including dropout events, with Boolean states is a challenging task. We present scBoolSeq, a method for the bidirectional linking of scRNA-seq data and Boolean activation states of genes. Given a reference scRNA-seq dataset, scBoolSeq computes statistical criteria to classify the empirical gene pseudocount distributions as unimodal, bimodal, or zero-inflated, and fits a probabilistic model of dropouts with gene-dependent parameters. From these learnt distributions, scBoolSeq can both binarise scRNA-seq datasets and generate synthetic scRNA-seq datasets from Boolean traces, such as those produced by Boolean networks, using biased sampling and dropout simulation. We present a case study demonstrating the application of scBoolSeq's binarisation scheme in data-driven model inference. Furthermore, we compare synthetic scRNA-seq data generated by scBoolSeq with data generated by BoolODE for the same Boolean network model. The comparison shows that our method better reproduces the statistics of real scRNA-seq datasets, such as the mean-variance and mean-dropout relationships, while exhibiting clearly defined trajectories in two-dimensional projections of the data.
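The two directions described in the abstract can be illustrated with a deliberately simplified sketch. It is not scBoolSeq's API and does not reproduce its statistical criteria: the function names, the mean-based threshold, the Gaussian sampling, and the exponential dropout model are all assumptions chosen for illustration only.

```python
import math
import random
import statistics


def binarize(values, margin=0.25):
    """Map one gene's expression values to 1, 0, or None (undetermined).

    Toy rule (an assumption, not scBoolSeq's criteria): values within
    `margin` standard deviations of the gene's mean are left undetermined.
    """
    centre = statistics.mean(values)
    band = margin * statistics.pstdev(values)
    states = []
    for v in values:
        if v > centre + band:
            states.append(1)
        elif v < centre - band:
            states.append(0)
        else:
            states.append(None)
    return states


def dropout_probability(expression, rate=1.0):
    """Hypothetical dropout model: lower expression drops out more often."""
    return math.exp(-rate * expression)


def sample_expression(state, low_mean=1.0, high_mean=6.0, sd=0.5,
                      rate=1.0, rng=random):
    """Draw a synthetic expression value for one Boolean state.

    Biased sampling: an active gene is drawn around `high_mean`, an
    inactive one around `low_mean`. The value is then zeroed with the
    probability given by the hypothetical dropout model above.
    """
    mean = high_mean if state else low_mean
    value = max(0.0, rng.gauss(mean, sd))
    if rng.random() < dropout_probability(value, rate):
        return 0.0
    return value


# Direction 1: expression values -> Boolean states.
print(binarize([0.0, 0.1, 2.6, 5.0, 5.2]))  # [0, 0, None, 1, 1]

# Direction 2: a Boolean trace of one gene -> synthetic expression values.
rng = random.Random(0)
trace = [False, False, True, True]
print([round(sample_expression(s, rng=rng), 2) for s in trace])
```

In this sketch the undetermined band avoids forcing a Boolean call on values close to the threshold, and the dropout step makes zeros more frequent for weakly expressed genes, which is the qualitative mean-dropout relationship the abstract refers to. The parameters would need to be learnt per gene from a reference dataset, as the abstract describes.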


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11257695
DOI: http://dx.doi.org/10.1371/journal.pcbi.1011620


