A two-stage approach of gene network analysis for high-dimensional heterogeneous data.

Biostatistics

Quantitative Biomedical Research Center, Department of Clinical Sciences, Department of Bioinformatics, Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, 6000 Harry Hines Blvd, Dallas, TX 75390, USA.

Published: April 2018

Gaussian graphical models have been widely used to construct gene regulatory networks from gene expression data. Most existing methods for Gaussian graphical models are designed for homogeneous data and assume a single Gaussian distribution. In practice, however, data may come from gene expression studies with unknown confounding factors, such as study cohorts, microarray platforms, and experimental batches. These factors produce heterogeneous data and hence lead to false positive edges or low detection power in the resulting network. To overcome this problem and improve the performance of gene network construction, we propose a two-stage approach for constructing a gene network from heterogeneous data. The first stage performs a clustering analysis that assigns samples to a few clusters such that the samples in each cluster are approximately homogeneous, and the second stage conducts an integrative analysis of the networks from each cluster. In particular, we first apply a model-based clustering method using the singular value decomposition for high-dimensional data, and then integrate the networks from each cluster using the integrative $\psi$-learning method. The proposed method is based on an equivalent measure of partial correlation coefficients in Gaussian graphical models, which is computed with a reduced conditional set and is therefore useful for high-dimensional data. We compare the proposed two-stage learning approach with some existing methods in various simulation settings and demonstrate the robustness of the proposed method. Finally, we apply it to integrate multiple gene expression studies of lung adenocarcinoma to identify potential therapeutic targets and treatment biomarkers.
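To make the motivation concrete, the following minimal Python sketch illustrates how pooling heterogeneous samples can create a false positive edge, and how analysing each cluster separately and then combining the evidence avoids it. It is an illustration under stated assumptions, not the method of the paper: the cluster labels are assumed known (standing in for the model-based clustering stage), plain Pearson correlation with a Fisher z-transformation stands in for the $\psi$-scores, and the per-cluster scores are combined with an equally weighted Stouffer-type sum. All function names, the sample size, and the batch shift are hypothetical choices made for this example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r, n):
    """Fisher-transformed correlation, approximately N(0, 1) when the true correlation is 0."""
    return math.sqrt(n - 3) * math.atanh(r)


def simulate_cluster(n, shift, rng):
    """Two genes that are independent within a cluster but share a batch shift."""
    x = [rng.gauss(shift, 1.0) for _ in range(n)]
    y = [rng.gauss(shift, 1.0) for _ in range(n)]
    return x, y


rng = random.Random(0)  # fixed seed so the illustration is reproducible
n = 200                 # hypothetical number of samples per cluster

# Assumption: two clusters (e.g. two batches) whose labels are known.
clusters = [simulate_cluster(n, 0.0, rng), simulate_cluster(n, 3.0, rng)]

# Naive analysis: pool all samples and ignore the heterogeneity.
pooled_x = [v for x, _ in clusters for v in x]
pooled_y = [v for _, y in clusters for v in y]
pooled_r = pearson(pooled_x, pooled_y)

# Two-stage idea: score the edge within each cluster, then integrate the scores.
within_r = [pearson(x, y) for x, y in clusters]
z_scores = [fisher_z(r, len(x)) for r, (x, _) in zip(within_r, clusters)]
combined_z = sum(z_scores) / math.sqrt(len(z_scores))

print(f"pooled correlation:      {pooled_r:.3f}")
print("within-cluster correlations:", [round(r, 3) for r in within_r])
print(f"combined z-score:        {combined_z:.3f}")
```

In this toy setting the pooled correlation is driven entirely by the shared batch shift, whereas the within-cluster correlations and the combined z-score reflect the absence of a true dependence between the two genes. A full implementation would replace the marginal correlations with partial correlations computed on a reduced conditional set, as the abstract describes.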


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5862270
DOI: http://dx.doi.org/10.1093/biostatistics/kxx033


