Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes.

BMC Bioinformatics

Department of Theoretical Bioinformatics, German Cancer Research Center, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany.

Published: November 2005

Background: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever-increasing amount of microarray data from cancer studies. Although these studies address similar questions for the same type of cancer, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods.

Results: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis.
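The two transformations named above can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the details here are assumptions: the function names `quantile_discretize` and `median_rank_scores`, the number of bins, the choice of one data set as the reference, and the toy values are all hypothetical. Both functions use only the within-sample ordering of genes, which is what makes values from different platforms numerically comparable. Classifier training is not shown.

```python
from statistics import median


def quantile_discretize(sample, n_bins=8):
    """Map each expression value in one sample to a bin index 0..n_bins-1.

    Bins hold (nearly) equal numbers of genes, so only the ordering of
    genes within the sample matters, not the platform's measurement scale.
    Ties are broken by gene position (an assumption of this sketch).
    """
    n_genes = len(sample)
    order = sorted(range(n_genes), key=lambda i: sample[i])
    bins = [0] * n_genes
    for rank, gene_idx in enumerate(order):
        bins[gene_idx] = rank * n_bins // n_genes
    return bins


def median_rank_scores(reference, sample):
    """Replace each value in `sample` by a reference value of equal rank.

    `reference` is a list of samples from the data set chosen as reference
    (assumed here); every sample is a list over the same genes in the same
    order. Per-gene medians of the reference are sorted, and the gene with
    rank r in `sample` receives the r-th smallest reference median.
    """
    n_genes = len(sample)
    gene_medians = sorted(
        median(ref[g] for ref in reference) for g in range(n_genes)
    )
    order = sorted(range(n_genes), key=lambda i: sample[i])
    scores = [0.0] * n_genes
    for rank, gene_idx in enumerate(order):
        scores[gene_idx] = gene_medians[rank]
    return scores


# Toy values (invented): log ratios from a cDNA array and intensities
# from an oligonucleotide array, with the same gene ordering.
cdna_samples = [[-1.2, 0.3, 2.0, -0.5], [-1.0, 0.5, 1.8, -0.7]]
oligo_sample = [50.0, 400.0, 9000.0, 120.0]

cdna_bins = quantile_discretize(cdna_samples[0], n_bins=2)
oligo_bins = quantile_discretize(oligo_sample, n_bins=2)
oligo_on_cdna_scale = median_rank_scores(cdna_samples, oligo_sample)
```

In this toy case the two samples rank their genes identically, so they receive identical bin vectors even though the raw values live on very different scales. The transformed vectors could then be passed to any supervised classifier, such as the support vector machines used in the study.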

Conclusion: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and microarray technologies. Predictive models generated by this approach are better validated than those generated on a single data set, while showing high predictive power and improved generalization performance.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1312314
DOI: http://dx.doi.org/10.1186/1471-2105-6-265
