Subtype prediction in pediatric acute myeloid leukemia: classification using differential network rank conservation revisited.

Askar Obulkasim Maarten Fornerod Michel C Zwaan Dirk Reinhardt Marry M van den Heuvel-Eibrink

BMC Bioinformatics

Department of Pediatric Oncology/Hematology, Erasmus-MC Sophia Childrens Hospital, Rotterdam, The Netherlands.

Published: September 2015

Background: One of the most important applications of transcriptomic data is cancer phenotype classification. Many characteristics of transcriptomic data, such as redundant features and technical artifacts, make over-fitting commonplace. Promising classification results often fail to generalize across datasets from different sources, platforms, or preprocessing pipelines. Recently, a novel differential network rank conservation (DIRAC) algorithm was proposed to characterize cancer phenotypes using transcriptomic data. DIRAC is a member of a family of algorithms that have proven useful for disease classification based on the relative expression of genes. Combining the robustness of this family's simple decision rules with known biological relationships, this systems approach identifies interpretable yet highly discriminative networks. While DIRAC was briefly applied to several classification problems in the original paper, its potential in cancer phenotype classification, and especially its robustness against artifacts in transcriptomic data, has not yet been fully characterized.
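
The abstract does not spell out how DIRAC works, so the following is only a minimal sketch of the relative-expression idea it refers to, assuming the commonly described formulation: for one gene network, each phenotype gets a template of pairwise gene orderings (the majority ordering across its training samples), and a new sample is assigned to the phenotype whose template it matches on more gene pairs. The function names, the labels "A" and "B", and the toy numbers are all hypothetical and are not taken from the paper.

```python
import numpy as np

def pairwise_orderings(expr):
    """expr has shape (n_genes, n_samples).
    Returns a boolean array (n_pairs, n_samples) that is True
    where gene i is expressed below gene j, for every pair i < j."""
    i, j = np.triu_indices(expr.shape[0], k=1)
    return expr[i] < expr[j]

def rank_template(expr):
    """Majority pairwise ordering across the samples of one phenotype."""
    return pairwise_orderings(expr).mean(axis=0 + 1) > 0.5

def rank_matching_score(expr, template):
    """Per-sample fraction of gene pairs whose ordering agrees with the template."""
    return (pairwise_orderings(expr) == template[:, None]).mean(axis=0)

def classify(expr, template_a, template_b):
    """Assign each sample to the phenotype whose template it matches better."""
    diff = rank_matching_score(expr, template_a) - rank_matching_score(expr, template_b)
    return np.where(diff > 0, "A", "B")

# Toy network of 4 genes; columns are samples (illustrative values only).
train_a = np.array([[1.0, 1.0, 2.0],
                    [2.0, 3.0, 1.0],
                    [3.0, 2.0, 3.0],
                    [4.0, 4.0, 4.0]])
train_b = np.array([[4.0, 4.0, 3.0],
                    [3.0, 2.0, 4.0],
                    [2.0, 3.0, 2.0],
                    [1.0, 1.0, 1.0]])
template_a = rank_template(train_a)
template_b = rank_template(train_b)

# Test samples on a different numeric scale: only the orderings matter.
test = np.array([[10.0, 9.0],
                 [20.0, 7.0],
                 [15.0, 8.0],
                 [40.0, 1.0]])
predictions = classify(test, template_a, template_b)
```

Because the decision depends only on which gene of a pair is higher, the first test sample is matched to phenotype "A" even though its values are roughly ten times larger than the training values. A classifier over several networks, as the study proposes, would need some way of combining such per-network scores; the abstract does not say how, so none is sketched here.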

Results: In this study we thoroughly investigate the potential of DIRAC by applying it to multiple datasets, and examine the variation in classification performance when datasets are (i) treated or untreated for batch effects and (ii) preprocessed with different techniques. We also propose the first DIRAC-based classifier to integrate multiple networks. We show that the DIRAC-based classifier is very robust in the examined scenarios. To our surprise, the trained DIRAC-based classifier even translated well to a dataset with different biological characteristics in the presence of substantial batch effects that, as shown here, plagued the standard expression value based classifier. In addition, because it integrates biological information, the DIRAC-based classifier also suggests pathways to target in specific subtypes, which may aid the establishment of personalized therapy in diseases such as pediatric AML. To better understand the predictive power of the DIRAC-based classifier in general, we also performed classifications using publicly available datasets from breast and lung cancer. Furthermore, multiple well-known classification algorithms were used to create a test bed for comparing the DIRAC-based classifier with the standard gene expression value based classifier. We observed that the DIRAC-based classifier greatly outperforms its rival.
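
One plausible reason for the robustness reported above is that pairwise gene orderings within a sample do not change under any strictly increasing, sample-wide distortion of the measurements, whereas the raw values that an expression value based classifier consumes do. The snippet below illustrates only this simplest case with made-up numbers; the scale and offset are hypothetical, real batch effects are often gene-specific and need not preserve orderings, and the paper's own evidence is empirical.

```python
import numpy as np

def pair_orderings(sample):
    """Boolean vector: True where gene i is below gene j, for all pairs i < j."""
    i, j = np.triu_indices(sample.size, k=1)
    return sample[i] < sample[j]

# Toy log-expression values for four genes in one sample (illustrative only).
sample = np.array([5.1, 7.3, 6.0, 9.8])

# Hypothetical sample-wide distortion: a positive scale plus an offset.
distorted = 1.7 * sample + 2.5

orderings_unchanged = bool(np.array_equal(pair_orderings(sample),
                                          pair_orderings(distorted)))
values_unchanged = bool(np.allclose(sample, distorted))
```

Here `orderings_unchanged` is true while `values_unchanged` is false: the rank-based features are identical before and after the distortion, but the raw feature vector is not.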

Conclusions: Based on our experiments with multiple datasets, we propose that DIRAC is a promising solution to the lack of generalizability in classification efforts that use transcriptomic data. We believe that the superior performance presented in this study may motivate others to initiate a new line of research exploring the untapped power of DIRAC in a broad range of cancer types.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4580220	PMC
http://dx.doi.org/10.1186/s12859-015-0737-3	DOI Listing

Similar Publications

The Fermi-Dirac distribution provides a calibrated probabilistic output for binary classifiers.

Proc Natl Acad Sci U S A

August 2021

IBM Research, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598;

Sung-Cheol Kim Adith S Arun Mehmet Eren Ahsen Robert Vogel Gustavo Stolovitzky

Binary classification is one of the central problems in machine-learning research and, as such, investigations of its general statistical properties are of interest. We studied the ranking statistics of items in binary classification problems and observed that there is a formal and surprising relationship between the probability of a sample belonging to one of the two classes and the Fermi-Dirac distribution determining the probability that a fermion occupies a given single-particle quantum state in a physical system of noninteracting fermions. Using this equivalence, it is possible to compute a calibrated probabilistic output for binary classifiers.
