Decision tree-based method for integrating gene expression, demographic, and clinical data to determine disease endotypes.

BMC Syst Biol

National Health and Environmental Effects Research Laboratory - Integrated Systems Toxicology Division, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC 27711, USA.

Published: November 2013

Background: Complex diseases are often difficult to diagnose, treat and study due to the multi-factorial nature of the underlying etiology. Large data sets are now widely available that can be used to define novel, mechanistically distinct disease subtypes (endotypes) in a completely data-driven manner. However, significant challenges exist with regard to how to segregate individuals into suitable subtypes of the disease and understand the distinct biological mechanisms of each when the goal is to maximize the discovery potential of these data sets.

Results: A multi-step decision tree-based method is described for defining endotypes based on gene expression, clinical covariates, and disease indicators, using childhood asthma as a case study. We also tried alternative approaches: the Student's t-test, single data domain clustering, and the Modk-prototypes algorithm, which incorporates multiple data domains into a single analysis. None performed as well as the novel multi-step decision tree method. The new method gave the best segregation of asthmatics and non-asthmatics, and it provides easy access to all genes and clinical covariates that distinguish the groups.
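The abstract does not spell out the individual steps of the method, so the following is only a minimal sketch of the basic building block of any decision tree: searching gene expression and clinical covariates together for the single split that best separates asthmatics from non-asthmatics. It is not the authors' multi-step procedure. The feature names (`gene_A_expr`, `age`, `bmi`), the toy values, and the helper functions are all hypothetical, and Gini impurity is assumed as the split criterion.

```python
from typing import Dict, List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows: List[Dict[str, float]],
               labels: List[int]) -> Tuple[str, float, float]:
    """Return (feature, threshold, weighted impurity) of the best single split."""
    best = ("", 0.0, float("inf"))
    n = len(rows)
    for feature in rows[0]:
        values = sorted({r[feature] for r in rows})
        # Candidate thresholds: midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[feature] <= t]
            right = [y for r, y in zip(rows, labels) if r[feature] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (feature, t, score)
    return best


# Hypothetical toy subjects: one expression value plus two clinical covariates.
subjects = [
    {"gene_A_expr": 2.1, "age": 9, "bmi": 17.0},
    {"gene_A_expr": 2.4, "age": 12, "bmi": 21.5},
    {"gene_A_expr": 2.0, "age": 11, "bmi": 19.0},
    {"gene_A_expr": 5.8, "age": 10, "bmi": 18.2},
    {"gene_A_expr": 6.3, "age": 13, "bmi": 22.0},
    {"gene_A_expr": 5.5, "age": 9, "bmi": 20.1},
]
asthma = [0, 0, 0, 1, 1, 1]  # hypothetical disease indicator (1 = asthmatic)

feature, threshold, impurity = best_split(subjects, asthma)
print(feature, threshold, impurity)
```

In this toy data only the expression feature separates the two groups cleanly, so the search selects it. A full tree would repeat the same search recursively within each resulting subgroup, which is also how the distinguishing genes and covariates can be read directly off the split variables.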

Conclusions: The multi-step decision tree method described here will lead to better understanding of complex disease in general by allowing purely data-driven disease endotypes to facilitate the discovery of new mechanisms underlying these diseases. This application should be considered a complement to ongoing efforts to better define and diagnose known endotypes. When coupled with existing methods developed to determine the genetics of gene expression, these methods provide a mechanism for linking genetics and exposomics data and thereby accounting for both major determinants of disease.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4228284
DOI: http://dx.doi.org/10.1186/1752-0509-7-119

Publication Analysis

Top Keywords (keyword: count)

gene expression: 12
multi-step decision: 12
decision tree-based: 8
tree-based method: 8
disease endotypes: 8
method described: 8
clinical covariates: 8
decision tree: 8
tree method: 8
disease: 7

Similar Publications

The Ataxia-telangiectasia mutated (ATM) is the most important gene for repairing the DNA in Myelodysplastic Neoplasm.

DNA Repair (Amst)

January 2025

Cancer Cytogenomic Laboratory, Center for Research and Drug Development (NPDM), Federal University of Ceara, Fortaleza, Ceara, Brazil; Post-Graduate Program in Medical Science, Federal University of Ceara, Fortaleza, Ceara, Brazil; Post-Graduate Program of Pathology, Federal University of Ceara, Fortaleza, Ceara, Brazil; Post-Graduate Program of Translational Medicine, Federal University of Ceara, Fortaleza, Ceara, Brazil.

Myelodysplastic Neoplasm (MDS) is a cancer associated with aging, often leading to acute myeloid leukemia (AML). One of its hallmarks is hypermethylation, particularly in genes responsible for DNA repair. This study aimed to evaluate the methylation and mutation status of DNA repair genes (single-strand: XPA, XPC, XPG, CSA, CSB; double-strand: ATM, BRCA1, BRCA2, LIG4, RAD51) in MDS across three patient cohorts (Cohort A-56, Cohort B-100, Cohort C-76), using methods such as pyrosequencing, real-time PCR, immunohistochemistry, and mutation screening.

The St. Lawrence Estuary (SLE) beluga (Delphinapterus leucas) population in Canada is Endangered, and endocrine disrupting contaminants, such as polychlorinated biphenyls (PCBs), polybrominated diphenyl ethers (PBDEs), and other halogenated flame retardants, have been identified as a threat to the recovery of this population. Here, potential impacts of these contaminants on SLE beluga were evaluated by comparing skin transcriptome profiles and biological pathways between this population and a less contaminant-exposed reference population (Eastern Beaufort Sea).

The global prevalence of heart failure (HF) is still growing, which imposes a heavy economic burden. The role of microRNA-146b (miR-146b) in HF remains largely unknown. This study aims to explore the role and mechanism of miR-146b in HF.

Proteomic Characterization of NEDD4 Unveils Its Potential Novel Downstream Effectors in Gastric Cancer.

J Proteome Res

January 2025

Graduate School of Analytical Science and Technology (GRAST), Chungnam National University, Daejeon 34134, Republic of Korea.

The E3 ubiquitin ligase neural precursor cell-expressed developmentally down-regulated 4 (NEDD4) is involved in various cancer signaling pathways, including PTEN/AKT. However, its role in promoting gastric cancer (GC) progression is unclear. This study was conducted to elucidate the role of NEDD4 in GC progression.

Objective: To test Cordia myxa extract on a colon cancer cell line and assess its effect on caspase-3 gene and COX-2 protein expression.

Materials and Methods: This study used Cordia myxa ethanolic extract at various dosages on SW480 cells. Cell proliferation was measured using the MTT assay, and the effect of Cordia myxa extract on caspase-3 gene expression was examined using quantitative real-time polymerase chain reaction.
