There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to the process to which they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and on the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. A large amount of generated output is available over the web.
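
The evaluation loop described in the abstract (model each process as mean plus independent noise, generate points, cluster them, count the points assigned to the wrong process) can be illustrated with a small sketch. This is not the toolbox's code: the function names, the isotropic Gaussian noise, the plain Lloyd-style K-means, and the choice to score error under the best matching of clusters to classes are all assumptions made for illustration, and the example means are invented.

```python
import itertools
import random


def generate_samples(means, sigma, n_per_class, rng):
    """Draw points as class mean plus independent Gaussian noise (assumed noise model)."""
    points, labels = [], []
    for k, mean in enumerate(means):
        for _ in range(n_per_class):
            points.append([m + rng.gauss(0.0, sigma) for m in mean])
            labels.append(k)
    return points, labels


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd-style K-means; returns one cluster index per point."""
    centers = rng.sample(points, k)  # initialize from k distinct sample points
    assign = [0] * len(points)
    for _ in range(n_iter):
        assign = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return assign


def clustering_error(true_labels, assign, k):
    """Number of misclustered points under the best cluster-to-class matching.

    Cluster indices are arbitrary, so every relabeling is tried; this is
    only practical for a small number of classes.
    """
    best = len(true_labels)
    for perm in itertools.permutations(range(k)):
        errors = sum(1 for t, a in zip(true_labels, assign) if perm[a] != t)
        best = min(best, errors)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical expression-profile means, chosen only for illustration.
    means = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
    for sigma in (0.5, 1.0, 2.0):
        pts, labels = generate_samples(means, sigma, 50, rng)
        err = clustering_error(labels, kmeans(pts, len(means), rng), len(means))
        print(f"sigma={sigma}: {err} of {len(pts)} points misclustered")
```

Repeating the run over several noise levels, as in the usage above, gives a rough picture of how error depends on process variance; averaging over many random seeds would be needed before drawing any conclusion.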


Source: http://dx.doi.org/10.1089/10665270252833217

