Clustering and visualizing gene expression data plays a central role in obtaining biological knowledge. Here, we used Pearson's correlation coefficient of multiple-cumulative probabilities (PCC-MCP) of genes to define the similarity of gene expression behaviors. To address the challenge posed by the high-dimensional MCPs, we used icc-cluster, a clustering algorithm that obtains solutions by iterating clustering centers, with PCC-MCP to group genes. We then used t-statistic stochastic neighbor embedding (t-SNE) of KC-data to generate optimal maps for clusters of MCP (t-SNE-MCP-O maps). From the analysis of several transcriptome data sets, we demonstrated clear advantages of using icc-cluster with PCC-MCP over commonly used clustering methods. t-SNE-MCP-O was also shown to give clearly projected boundaries for clusters of PCC-MCP, which made the relationships between clusters easy to visualize and understand.
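The general idea of grouping genes by iterating cluster centers under a Pearson-correlation similarity can be illustrated with a minimal sketch. This is not the icc-cluster implementation: the abstract does not say how MCPs are computed or how centers are updated, so the sketch assumes each row of `X` is an already-computed per-gene feature vector, that no row is constant, and that a center is updated as the mean of its members. The function names and the `init` parameter are hypothetical.

```python
import numpy as np


def pearson_to_centers(X, C):
    """Pearson correlation between every row of X and every row of C."""
    Xc = X - X.mean(axis=1, keepdims=True)
    Cc = C - C.mean(axis=1, keepdims=True)
    # Assumption: no row is constant, so the norms below are non-zero.
    Xn = Xc / np.linalg.norm(Xc, axis=1, keepdims=True)
    Cn = Cc / np.linalg.norm(Cc, axis=1, keepdims=True)
    return Xn @ Cn.T


def cluster_by_iterating_centers(X, k, n_iter=100, init=None, seed=0):
    """Assign rows to the most correlated center, then recompute centers.

    Illustrative only: the mean-of-members update is an assumption,
    not the rule described by the authors.
    """
    X = np.asarray(X, dtype=float)
    if init is None:
        rng = np.random.default_rng(seed)
        init = rng.choice(len(X), size=k, replace=False)
    centers = X[np.asarray(init)].copy()
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        new_labels = pearson_to_centers(X, centers).argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers


# Toy profiles (made up for illustration): three rising, three falling.
toy = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [0, 1, 2, 3.5],
                [4, 3, 2, 1], [8, 6, 4, 2], [3, 2, 1, 0.5]])
toy_labels, toy_centers = cluster_by_iterating_centers(toy, k=2, init=[0, 3])
```

Because Pearson correlation ignores scale and offset, rows such as `[1, 2, 3, 4]` and `[2, 4, 6, 8]` are treated as identical in shape, which is why a correlation-based similarity suits comparisons of expression behavior rather than expression magnitude.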
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5715267 | PMC |
http://dx.doi.org/10.1002/2211-5463.12327 | DOI Listing |