MultiSC: a deep learning pipeline for analyzing multiomics single-cell data.

Brief Bioinform

Department of Quantitative Health Sciences, Mayo Clinic, 13400 E Shea Blvd, Scottsdale, AZ 85259, United States.

Published: September 2024

AI Article Synopsis

  • Single-cell technologies allow for detailed examination of individual cell functions and behaviors through advanced multi-omics sequencing techniques.
  • NEAT-seq, for example, provides gene expression, chromatin accessibility, and transcription factor protein expression data for each cell, but effective tools for integrating these data types are lacking.
  • The proposed MultiSC pipeline aims to bridge this gap by using a multimodal constraint autoencoder for clustering and matrix factorization and regression models for predicting gene regulation, enabling comprehensive analysis of multi-omics single-cell data and enhancing insights into cellular processes.

Article Abstract

Single-cell technologies enable researchers to investigate cell functions at the level of individual cells and to study cellular processes at higher resolution. Several multi-omics single-cell sequencing techniques have been developed to explore various aspects of cellular behavior. NEAT-seq, for example, simultaneously obtains three kinds of omics data for each cell: gene expression, chromatin accessibility, and protein expression of transcription factors (TFs). Consequently, NEAT-seq offers a more comprehensive, multimodal view of cellular activities. However, tools for effectively integrating the three types of omics data are lacking. To address this gap, we propose a novel pipeline called MultiSC for the analysis of MULTIomic Single-Cell data. Our pipeline uses a multimodal constraint autoencoder (single-cell hierarchical constraint autoencoder) to integrate the multi-omics data during clustering and a matrix factorization-based model (scMF) to predict target genes regulated by a TF. Moreover, we use multivariate linear regression models to predict gene regulatory networks from the multi-omics data. Additional functionalities, including differential expression, mediation analysis, and causal inference, are also incorporated into the MultiSC pipeline. Extensive experiments were conducted to evaluate the performance of MultiSC. The results demonstrate that our pipeline enables researchers to gain a comprehensive view of cell activities and gene regulatory networks by fully leveraging the potential of multi-omics single-cell data. By employing MultiSC, researchers can effectively integrate and analyze diverse omics data types, enhancing their understanding of cellular processes.
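
To make the regression step concrete, the following is a minimal sketch of a multivariate linear regression that relates per-cell predictors from the three modalities to the expression of several target genes. It is not the MultiSC implementation: the function name `fit_regulation_model`, the choice of predictors (TF RNA, TF protein, and one accessibility score), and the simulated data are all assumptions made for illustration, and the abstract does not specify the model's exact form.

```python
import numpy as np


def fit_regulation_model(tf_rna, tf_protein, accessibility, target_expr):
    """Least-squares fit of target-gene expression on three per-cell predictors.

    Hypothetical helper, not part of MultiSC. `target_expr` has shape
    (n_cells, n_genes); the returned coefficient matrix has shape (4, n_genes),
    with rows: intercept, TF RNA, TF protein, chromatin accessibility.
    """
    n_cells = tf_rna.shape[0]
    design = np.column_stack([np.ones(n_cells), tf_rna, tf_protein, accessibility])
    coef, *_ = np.linalg.lstsq(design, target_expr, rcond=None)
    return coef


# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(0)
n_cells = 500
tf_rna = rng.normal(size=n_cells)
tf_protein = 0.8 * tf_rna + rng.normal(scale=0.5, size=n_cells)
accessibility = rng.normal(size=n_cells)

# Ground-truth coefficients for two hypothetical target genes.
true_coef = np.array([
    [1.0, 0.0],   # intercept
    [0.0, 0.0],   # TF RNA has no direct effect in this simulation
    [0.5, -0.4],  # TF protein activates gene 0 and represses gene 1
    [0.3, 0.2],   # accessibility of a linked peak
])
design = np.column_stack([np.ones(n_cells), tf_rna, tf_protein, accessibility])
target_expr = design @ true_coef + rng.normal(scale=0.1, size=(n_cells, 2))

coef = fit_regulation_model(tf_rna, tf_protein, accessibility, target_expr)
print(np.round(coef, 2))
```

In this toy setting the fitted coefficients should land close to `true_coef`. Because TF protein abundance is one of the measured modalities, a model of this shape can separate the effect of the protein from that of its transcript, which is one motivation for using all three data types together.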

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458747 (PMC)
http://dx.doi.org/10.1093/bib/bbae492 (DOI)

Publication Analysis

Top Keywords

single-cell data: 12
omics data: 12
multiomics single-cell: 8
data: 8
cellular processes: 8
understanding cellular: 8
constraint autoencoder: 8
multi-omics data: 8
gene regulatory: 8
regulatory networks: 8
