Selective of informative metabolites using random forests based on model population analysis.

Talanta

Research Center of Modernization of Traditional Chinese Medicines, Central South University, Changsha 410083, PR China.

Published: December 2013

One of the main goals of metabolomics studies is to discover informative metabolites, or biomarkers, which may be used to diagnose diseases and to elucidate their pathology. Sophisticated feature selection approaches are required to extract the information hidden in such complex 'omics' data. In this study, a new and robust selection method is proposed that combines random forests (RF) with model population analysis (MPA) to select informative metabolites, and it is applied to three metabolomic datasets. According to their contribution to the classification accuracy, the metabolites were classified into three kinds: informative, non-informative, and interfering. With the proposed method, informative metabolites were selected for each of the three datasets; further comparison of these metabolites between healthy and diseased groups by t-test showed that the P values for all selected metabolites were lower than 0.05. Moreover, the informative metabolites identified by the method were shown to be correlated with the clinical outcome under investigation. The MPA-RF source code in Matlab can be freely downloaded from http://code.google.com/p/my-research-list/downloads/list.
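
The idea of sorting metabolites by their contribution to classification accuracy can be illustrated with a minimal sketch. This is not the authors' Matlab implementation, and the abstract does not give the algorithmic details, so everything here is an assumption: a population of sub-models is built on random variable subsets and random train/test splits, and for each variable the mean accuracy of sub-models that include it is compared with the mean accuracy of those that exclude it. To stay dependency-free, a nearest-centroid classifier stands in for the random forest; the function names, the inclusion probability of 0.5, and the threshold `delta` are all hypothetical, and a statistical test on the two accuracy distributions would be a natural replacement for the fixed threshold.

```python
import random
from statistics import mean


def centroid_accuracy(train, test, feats):
    """Fit a nearest-centroid classifier on `feats`; return test accuracy.

    Stand-in for a random forest (assumption made to avoid dependencies).
    """
    centroids = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = [mean(r[f] for r in rows) for f in feats]
    hits = 0
    for x, y in test:
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(feats, centroids[c])),
        )
        hits += pred == y
    return hits / len(test)


def mpa_select(data, n_models=300, delta=0.05, seed=0):
    """Label each variable by comparing sub-models with and without it.

    `delta` is a hypothetical accuracy-gain threshold, not a value from the paper.
    """
    rng = random.Random(seed)
    p = len(data[0][0])
    acc_with = [[] for _ in range(p)]
    acc_without = [[] for _ in range(p)]
    for _ in range(n_models):
        # Each variable enters the sub-model independently with probability 0.5.
        feats = [f for f in range(p) if rng.random() < 0.5]
        if not feats:
            continue
        rows = data[:]
        rng.shuffle(rows)
        cut = int(0.7 * len(rows))  # 70/30 train/test split (assumption)
        acc = centroid_accuracy(rows[:cut], rows[cut:], feats)
        for f in range(p):
            (acc_with if f in feats else acc_without)[f].append(acc)
    labels = []
    for f in range(p):
        gain = mean(acc_with[f]) - mean(acc_without[f])
        if gain > delta:
            labels.append("informative")
        elif gain < -delta:
            labels.append("interfering")
        else:
            labels.append("non-informative")
    return labels


def make_demo_data(n_per_class=40, seed=1):
    """Synthetic data: variable 0 separates the classes, variables 1-5 are noise."""
    rng = random.Random(seed)
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(3.0 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(5)]
            data.append((x, label))
    return data


if __name__ == "__main__":
    print(mpa_select(make_demo_data()))
```

On the synthetic data, variable 0 is the only one that carries class information, so it should be labeled informative; the noise variables fall wherever their small random gains place them. With real metabolomic data, the stand-in classifier would be replaced by a random forest and the variables would be metabolite intensities.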

Source
http://dx.doi.org/10.1016/j.talanta.2013.07.070

