Introduction: The identification of metabolomic biomarkers that predict cancer patient response to therapy and disease stage has been pursued as a "holy grail" of modern oncology, motivated by the metabolic dysfunction that characterizes cancer progression. Although many candidate biomarkers have been evaluated, a consistent set with practical clinical utility has proven elusive.
Objective: In this study, we systematically examine the combined effect of data pre-treatment and imputation methods on the performance of multivariate data analysis methods and on the potential biomarkers they identify.