Tumor classification using phylogenetic methods on expression data.

J Theor Biol

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bldg. 38A, Room 8N805, 8600 Rockville Pike, Bethesda, MD 20894, USA.

Published: June 2004

Tumor classification is a well-studied problem in the field of bioinformatics. Developments in the field of DNA chip design have now made it possible to measure the expression levels of thousands of genes in sample tissue from healthy cell lines or tumors. A number of studies have examined the two problems of tumor classification: class discovery, the problem of defining a number of classes of tumors using the data from a DNA chip, and class prediction, the problem of accurately classifying an unknown tumor, given expression data from the unknown tumor and from a learning set. The current work applies phylogenetic methods to both problems. To solve the class discovery problem, we impose a metric on a set of tumors as a function of their gene expression levels, and impose a tree structure on this metric, using standard tree fitting methods borrowed from the field of phylogenetics. Phylogenetic methods provide a simple way of imposing a clear hierarchical relationship on the data, with branch lengths in the classification tree representing the observed degree of separation. We tested our method for class discovery on two data sets: a data set of 87 tissues, composed mostly of small, round, blue-cell tumors (SRBCTs), and a data set of 22 breast tumors. We fit the 87 samples of the first set to a classification tree, which neatly separated into four major clusters corresponding exactly to the four groups of tumors, namely neuroblastomas, rhabdomyosarcomas, Burkitt's lymphomas, and the Ewing's family of tumors. The classification tree built using the breast cancer data separated tumors with BRCA1 mutations from those with BRCA2 mutations, with sporadic tumors separated from both groups and from each other. We also demonstrate the flexibility of the class discovery method with regard to standard resampling methodology such as jackknifing and noise perturbation.
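The class discovery step described above (a metric on expression profiles, then a tree fitted to that metric) can be illustrated with a minimal sketch. The abstract does not say which metric or which tree fitting method was used, so both choices here are assumptions: the distance is one minus the Pearson correlation, and the tree is built with neighbor joining, one standard distance-based method from phylogenetics. The function names and the `nodeN` labels for internal nodes are hypothetical.

```python
import math
from itertools import combinations


def correlation_distance(x, y):
    """Assumed metric: 1 - Pearson correlation of two expression profiles.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def neighbor_joining(names, profiles):
    """Fit an unrooted tree to the pairwise distances by neighbor joining.

    Returns a list of edges (node, child, branch_length). Internal nodes
    are labeled "node0", "node1", ... so sample names must not use them.
    Branch lengths may be slightly negative on noisy data.
    """
    d = {}
    for i, j in combinations(range(len(names)), 2):
        d[frozenset((names[i], names[j]))] = correlation_distance(
            profiles[i], profiles[j]
        )

    def dist(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums of the current distance matrix.
        r = {a: sum(dist(a, b) for b in nodes) for a in nodes}
        # Join the pair that minimizes the neighbor-joining Q criterion.
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * dist(p[0], p[1]) - r[p[0]] - r[p[1]],
        )
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the new internal node to the joined pair.
        la = 0.5 * dist(a, b) + (r[a] - r[b]) / (2 * (n - 2))
        lb = dist(a, b) - la
        edges.append((u, a, la))
        edges.append((u, b, lb))
        # Distances from the new node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((u, c))] = 0.5 * (
                    dist(a, c) + dist(b, c) - dist(a, b)
                )
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, dist(a, b)))
    return edges
```

Calling `neighbor_joining(names, profiles)` with one expression profile per tumor yields an edge list whose branch lengths play the role of the "degree of separation" mentioned above. A jackknife check could, under the same assumptions, repeat the call on random subsets of genes and compare the resulting clusters.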
To solve the class prediction problem, we built a classification tree on the learning set, and then sought the optimal placement of each test sample within the classification tree. We tested this method on the SRBCT data set, and classified each tumor successfully.
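The placement step for class prediction can be sketched in the same spirit. The abstract does not state the optimality criterion, so this example assumes a least-squares one: the test sample is attached at each candidate point along each edge of the learning-set tree, and the point whose implied tree distances best match the observed distances to the training samples is kept. The grid search over attachment points, the `place_sample` name, and the rule of reading the class from the nearest training leaf are all illustrative assumptions, and branch lengths are assumed to be positive.

```python
from collections import defaultdict


def leaf_distances(edges, leaves):
    """Path length from each leaf to every node of the tree."""
    adj = defaultdict(list)
    for u, v, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))
    result = {}
    for leaf in leaves:
        seen = {leaf: 0.0}
        stack = [leaf]
        while stack:
            node = stack.pop()
            for nxt, length in adj[node]:
                if nxt not in seen:
                    seen[nxt] = seen[node] + length
                    stack.append(nxt)
        result[leaf] = seen
    return result


def place_sample(edges, observed, labels, steps=10):
    """Assumed least-squares placement of one test sample on the tree.

    edges:    list of (u, v, length) with positive lengths
    observed: maps each training leaf to its distance from the test sample
    labels:   maps each training leaf to its tumor class
    """
    leaves = list(observed)
    to_node = leaf_distances(edges, leaves)
    best = None
    for u, v, length in edges:
        for k in range(steps + 1):
            t = length * k / steps  # offset of the attachment point from u
            # Tree distance from the attachment point to each leaf.
            along = {
                x: min(to_node[x][u] + t, to_node[x][v] + length - t)
                for x in leaves
            }
            # Best non-negative pendant branch length for this point.
            mean_gap = sum(observed[x] - along[x] for x in leaves) / len(leaves)
            pendant = max(0.0, mean_gap)
            sse = sum((observed[x] - along[x] - pendant) ** 2 for x in leaves)
            if best is None or sse < best["sse"]:
                nearest = min(leaves, key=along.get)
                best = {
                    "sse": sse,
                    "edge": (u, v),
                    "offset": t,
                    "pendant": pendant,
                    "predicted_class": labels[nearest],
                }
    return best
```

The returned dictionary reports the chosen edge, the offset along it, the pendant branch length, and the class of the nearest training leaf. A finer grid (larger `steps`) or a closed-form optimum per edge would be natural refinements of this sketch.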

Source
http://dx.doi.org/10.1016/j.jtbi.2004.02.021

