Analysis of gene expression data by clustering and visualization plays a central role in obtaining biological knowledge. Here, we used Pearson's correlation coefficient of multiple-cumulative probabilities (PCC-MCP) of genes to define the similarity of gene expression behaviors. To address the challenge posed by high-dimensional MCPs, we grouped genes using PCC-MCP with icc-cluster, a clustering algorithm that obtains solutions by iterating clustering centers. We then used t-statistic stochastic neighbor embedding (t-SNE) of KC-data to generate optimal maps for clusters of MCP (t-SNE-MCP-O maps). In analyses of several transcriptome data sets, icc-cluster with PCC-MCP showed clear advantages over commonly used clustering methods. t-SNE-MCP-O maps also gave clearly projected boundaries for clusters of PCC-MCP, which made the relationships between clusters easy to visualize and understand.
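The idea of grouping genes by correlation while iterating cluster centers can be illustrated with a minimal sketch. The abstract does not specify how MCP vectors are built or how icc-cluster works internally, so everything below is an assumption: the `profiles` are made-up stand-ins for per-gene MCP vectors, `pearson` and `cluster_by_centers` are hypothetical helper names, and the loop is a generic k-means-style center iteration that uses Pearson's correlation as the similarity. It is not the authors' icc-cluster.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant vector: correlation is undefined, treat as 0
    return cov / (sx * sy)


def cluster_by_centers(profiles, init_centers, max_iter=100):
    """Generic center-iterating clustering (assumed, k-means-like).

    Each profile is assigned to the center it correlates with most strongly;
    each center is then replaced by the mean of its members. The loop stops
    when assignments no longer change or after max_iter rounds.
    """
    centers = [list(c) for c in init_centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centers)), key=lambda k: pearson(p, centers[k]))
            for p in profiles
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical stand-ins for per-gene MCP vectors: two rising, two falling.
profiles = [
    [0.1, 0.4, 0.9],
    [0.2, 0.5, 1.0],
    [0.9, 0.5, 0.1],
    [1.0, 0.6, 0.2],
]
labels, centers = cluster_by_centers(profiles, [profiles[0], profiles[2]])
print(labels)
```

Because the similarity is a correlation, profiles with the same shape but different offsets fall into the same group, which is the behavior the abstract attributes to comparing expression "behaviors" rather than raw levels.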

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5715267
DOI: http://dx.doi.org/10.1002/2211-5463.12327
