SAIC: an iterative clustering approach for analysis of single cell RNA-seq data.

BMC Genomics

Integrative Genomics Core, Beckman Research Institute, City of Hope, Duarte, CA, 91010, USA.

Published: October 2017

Background: Interest in single-cell analysis has grown considerably in basic, translational, and clinical research, as advances in whole-transcriptome amplification techniques now allow scientists to obtain accurate sequencing results at the single-cell level. An important step in single-cell transcriptome analysis is to identify distinct cell groups that have different gene expression patterns. Currently, few bioinformatics approaches are available for single-cell RNA-seq analysis. Many studies rely on principal component analysis (PCA) with arbitrary parameters to identify the genes that will be used to cluster the single cells.

Results: We have developed a novel algorithm, SAIC (Single cell Analysis via Iterative Clustering), that identifies the optimal set of signature genes to separate single cells into distinct groups. Our method uses an iterative clustering approach to perform an exhaustive search for the best parameters within the search space, which is defined by the number of initial centers and the P values. The end point is the identification of a signature gene set that gives the best separation of the cell clusters. Using a simulated data set, we showed that SAIC can successfully identify the predefined signature gene sets that correctly separate the cells into predefined clusters. We applied SAIC to two published single-cell RNA-seq datasets. For both datasets, SAIC identified a subset of signature genes that cluster the single cells into groups consistent with the published results. The signature genes identified by SAIC yielded better cell clusters as measured by the DB index score, and many of these genes also showed tissue-specific expression.
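The DB index comparison mentioned above can be illustrated with a small sketch. It assumes that "DB index" refers to the Davies-Bouldin index, for which lower scores indicate tighter, better-separated clusters. The toy expression matrix, the cluster labels, and the helper names `davies_bouldin` and `select_genes` are hypothetical and are not taken from the SAIC software; the sketch only scores two candidate gene sets under a fixed clustering and does not reproduce the iterative search itself.

```python
from math import dist  # Euclidean distance, Python 3.8+


def davies_bouldin(points, labels):
    """Davies-Bouldin index for a labelled set of points.

    For each cluster, take the worst (largest) ratio of combined
    within-cluster scatter to between-centroid distance against any
    other cluster, then average over clusters. Lower is better.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Centroid of each cluster: per-dimension mean of its members.
    centroids = {
        lab: [sum(col) / len(members) for col in zip(*members)]
        for lab, members in clusters.items()
    }
    # Scatter of each cluster: mean distance of members to the centroid.
    scatter = {
        lab: sum(dist(p, centroids[lab]) for p in members) / len(members)
        for lab, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


def select_genes(cells, gene_idx):
    """Restrict each cell's expression vector to the chosen gene columns."""
    return [[cell[g] for g in gene_idx] for cell in cells]


# Hypothetical toy matrix: 6 cells x 4 genes.
# Genes 0 and 1 differ between the two groups; genes 2 and 3 are noise.
cells = [
    [9.0, 8.0, 5.0, 1.0],
    [8.0, 9.0, 1.0, 6.0],
    [9.0, 9.0, 3.0, 3.0],
    [1.0, 2.0, 6.0, 2.0],
    [2.0, 1.0, 2.0, 5.0],
    [1.0, 1.0, 4.0, 4.0],
]
labels = [0, 0, 0, 1, 1, 1]  # assumed cluster assignment

db_all = davies_bouldin(cells, labels)
db_signature = davies_bouldin(select_genes(cells, [0, 1]), labels)
print(f"all genes: {db_all:.3f}, signature genes: {db_signature:.3f}")
```

In this toy case, restricting the matrix to the two informative genes gives a lower score than using all four, which is the sense in which a signature gene set can yield "better" clusters. The function does not guard against two clusters sharing the same centroid, which would cause a division by zero.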

Conclusions: We have developed an efficient algorithm to identify the optimal subset of genes that separates single cells into distinct clusters based on their expression patterns. Using published single-cell RNA-seq datasets, we have shown that it performs better than the PCA method.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629617
DOI: http://dx.doi.org/10.1186/s12864-017-4019-5

Similar Publications

BMT: A Cross-Validated ThinPrep Pap Cervical Cytology Dataset for Machine Learning Model Training and Validation.

Sci Data

December 2024

Department of Pathology and Laboratory Medicine, Alpert Medical School, Brown University, Providence, RI, 02912, USA.

In the past several years, a few cervical Pap smear datasets have been published for use in clinical training. However, most publicly available datasets consist of pre-segmented single cell images, contain on-image annotations that must be manually edited out, or are prepared using the conventional Pap smear method. Multicellular liquid Pap image datasets are a more accurate reflection of current cervical screening techniques.

scRNA + BCR-seq identifies proportions and characteristics of dual BCR B cells in the peritoneal cavity of mice and peripheral blood of healthy human donors across different ages.

Immun Ageing

December 2024

Department of Immunology, Center of Immuno-molecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China.

The increased incidence of inflammatory diseases, infectious diseases, autoimmune disorders, and tumors in elderly individuals is closely associated with several well-established features of immunosenescence, including reduced B cell genesis and dampened immune responses. Recent studies have highlighted the critical role of dual receptor lymphocytes in tumors and autoimmune diseases. This study utilized shared data generated through scRNA-seq + scBCR-seq technology to investigate the presence of dual receptor-expressing B cells in the peritoneum of mouse and peripheral blood of healthy volunteers, and whether there are age-related differences in dual receptor B cell populations.

Cancer-associated fibroblasts (CAFs) exert multiple tumor-promoting functions and are key contributors to drug resistance. The mechanisms by which specific subsets of CAFs facilitate oxaliplatin resistance in colorectal cancer (CRC) have not been fully explored. This study found that THBS2 is positively associated with CAF activation, epithelial-mesenchymal transition (EMT), and chemoresistance at the pan-cancer level.

Towards a histological diagnosis of childhood small vessel CNS vasculitis.

Pediatr Rheumatol Online J

December 2024

Section of Rheumatology, Department of Pediatrics, Alberta Children's Hospital, University of Calgary, Calgary, Canada.

Background: Primary small vessel CNS vasculitis (sv-cPACNS) is a challenging inflammatory brain disease in children. Brain biopsy is mandatory to confirm the diagnosis. This study aims to develop and validate a histological scoring tool for diagnosing small vessel CNS vasculitis.

Background: There are no studies on NOTCH2 gene polymorphisms in relation to reproductive and productive traits in Holstein cattle. The objective of the present study was to investigate the effect of NOTCH2 gene polymorphisms on the productive and reproductive performance of fertile and anestrus cattle.

Methods: The cattle were classified into anestrus for 3-12 months postpartum (n = 115, 37.
