A hierarchical unsupervised growing neural network for clustering gene expression patterns.

Bioinformatics

Bioinformatics, CNIO, Ctra. Majadahonda-Pozuelo, Km 2, Majadahonda, 28220 Madrid; Protein Design Group, CNB-CSIC, 28049 Madrid, Spain.

Published: February 2001

Motivation: We describe a new approach to the analysis of gene expression data from DNA array experiments, using an unsupervised neural network. DNA array technologies allow thousands of genes to be monitored rapidly and efficiently. One aim of these studies is to find correlated gene expression patterns, which is usually achieved by clustering them. The Self-Organising Tree Algorithm (SOTA) (Dopazo,J. and Carazo,J.M. (1997) J. Mol. Evol., 44, 226-233) is a neural network that grows by adopting the topology of a binary tree. The result of the algorithm is a hierarchical clustering obtained with the accuracy and robustness of a neural network.
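To make the idea of a network that grows as a binary tree more concrete, here is a minimal Python sketch. It is a simplified illustration and not the authors' implementation: the function names (`assign`, `grow_tree`), the Euclidean distance, the learning rate, the number of epochs and the `max_leaves` stopping rule are all assumptions, and only the winning leaf is updated at each step. Each leaf holds a prototype vector; after training, the leaf whose assigned profiles are most spread out is replaced by two daughter leaves.

```python
import numpy as np

def assign(profiles, leaves):
    """Index of the nearest leaf prototype for every profile (Euclidean, assumed)."""
    diffs = profiles[:, None, :] - np.asarray(leaves)[None, :, :]
    return np.linalg.norm(diffs, axis=2).argmin(axis=1)

def grow_tree(profiles, max_leaves=4, epochs=20, rate=0.1, seed=0):
    """Grow leaf prototypes top-down by repeatedly splitting the most spread-out leaf."""
    rng = np.random.default_rng(seed)
    mean = profiles.mean(axis=0)
    # Start from two leaves placed (almost) at the global mean profile.
    leaves = [mean + 1e-3 * rng.standard_normal(mean.shape) for _ in range(2)]
    while True:
        # Training cycle: present profiles in random order and move the
        # winning (nearest) leaf a small step towards each profile.
        for _ in range(epochs):
            for x in profiles[rng.permutation(len(profiles))]:
                w = int(np.argmin([np.linalg.norm(x - c) for c in leaves]))
                leaves[w] = leaves[w] + rate * (x - leaves[w])
        labels = assign(profiles, leaves)
        if len(leaves) >= max_leaves:
            return np.array(leaves), labels
        # Spread of a leaf: mean distance of its profiles to its prototype.
        spread = [
            np.linalg.norm(profiles[labels == k] - leaves[k], axis=1).mean()
            if np.any(labels == k) else 0.0
            for k in range(len(leaves))
        ]
        # Split the most heterogeneous leaf into two near-identical daughters.
        parent = leaves.pop(int(np.argmax(spread)))
        leaves += [parent + 1e-3 * rng.standard_normal(parent.shape) for _ in range(2)]
```

Calling `grow_tree(data, max_leaves=2)` on a matrix with one expression profile per row returns two prototypes and a cluster label per profile. Because the coarse split is made first and finer splits only afterwards, growth can be halted at any number of leaves.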

Results: SOTA clustering confers several advantages over classical hierarchical clustering methods. SOTA is a divisive method: the clustering process is performed from top to bottom, i.e. the highest hierarchical levels are resolved before going to the details of the lowest levels. Growth can be stopped at the desired hierarchical level. Moreover, a criterion for stopping the growth of the tree is provided, based on the approximate probability distribution obtained by randomisation of the original data set. By means of this criterion, statistical support for the definition of clusters is proposed. In addition, obtaining average gene expression patterns is a built-in feature of the algorithm: the neurons defining the different hierarchical levels represent the averages of the gene expression patterns contained in the corresponding clusters. Since SOTA runtimes are approximately linear in the number of items to be classified, it is especially suitable for dealing with very large amounts of data. The proposed method is very general and applies to any data, provided that they can be coded as a series of numbers and that a computable measure of similarity between data items can be used.
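The randomisation-based stopping criterion can be illustrated with a short Python sketch. This is one plausible reading of the abstract and not the published procedure: the function name `random_distance_threshold`, the within-profile shuffling scheme, the Euclidean distance, the number of sampled pairs and the `alpha` level are all assumptions. The idea is to estimate how close two profiles tend to be by chance alone, and to use a low quantile of that distribution as a distance threshold.

```python
import numpy as np

def random_distance_threshold(profiles, alpha=0.05, n_pairs=2000, seed=0):
    """Distance below which only a fraction `alpha` of randomised profile pairs fall."""
    rng = np.random.default_rng(seed)
    # Shuffle the values inside each profile independently. Every gene keeps
    # its own set of values, but any shared pattern across conditions is lost.
    shuffled = np.array([rng.permutation(row) for row in profiles])
    # Sample random pairs of distinct shuffled profiles.
    i = rng.integers(0, len(profiles), size=n_pairs)
    j = rng.integers(0, len(profiles), size=n_pairs)
    keep = i != j
    dists = np.linalg.norm(shuffled[i[keep]] - shuffled[j[keep]], axis=1)
    # A low quantile of the chance distribution serves as the threshold.
    return float(np.quantile(dists, alpha))
```

Under these assumptions, a cluster whose profiles lie closer to one another than the returned threshold is unlikely to have arisen by chance at level `alpha`, so tree growth along that branch could reasonably stop there.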

Availability: A server running the program can be found at: http://bioinfo.cnio.es/sotarray.

Source
http://dx.doi.org/10.1093/bioinformatics/17.2.126
