When quantitative longitudinal traits are risk factors for disease progression and subject to random biological variation, joint model analysis of time-to-event and longitudinal traits can effectively identify direct and/or indirect genetic association of single nucleotide polymorphisms (SNPs) with time-to-event. We present a joint model that integrates: (1) a multivariate linear mixed model describing trajectories of multiple longitudinal traits as a function of time, SNP effects, and subject-specific random effects and (2) a frailty Cox survival model that depends on SNPs, longitudinal trajectory effects, and subject-specific frailty accounting for dependence among multiple time-to-event traits. Motivated by complex genetic architecture of type 1 diabetes complications (T1DC) observed in the Diabetes Control and Complications Trial (DCCT), we implement a 2-stage approach to inference with bootstrap joint covariance estimation and develop a hypothesis testing procedure to classify direct and/or indirect SNP association with each time-to-event trait.
Post-GWAS analysis, in many cases, focuses on fine-mapping targeted genetic regions discovered at GWAS-stage; that is, the aim is to pinpoint potential causal variants and susceptibility genes for complex traits and disease outcomes using next-generation sequencing (NGS) technologies. Large-scale GWAS cohorts are necessary to identify target regions given the typically modest genetic effect sizes. In this context, two-phase sampling design and analysis is a cost-reduction technique that utilizes data collected during phase 1 GWAS to select an informative subsample for phase 2 sequencing.
The X-chromosome is often excluded from genome-wide association studies because of analytical challenges. Some of the problems, such as the random, skewed, or no X-inactivation model uncertainty, have been investigated. Other considerations have received little to no attention, such as the value in considering nonadditive and gene-sex interaction effects, and the inferential consequence of choosing different baseline alleles (i.
The X-chromosome is often excluded from so-called "whole-genome" association studies because of the differences it exhibits between males and females. One particular analytical challenge is the unknown status of X-inactivation, where one of the two X-chromosome variants in females may be randomly selected to be silenced. In the absence of biological evidence in favor of one specific model, we consider a Bayesian model averaging framework that offers a principled way to account for the inherent model uncertainty, providing model averaging-based posterior density intervals and Bayes factors.
We evaluate two-phase designs to follow up findings from genome-wide association studies (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation-maximization-based inference under a semiparametric maximum likelihood formulation tailored for post-GWAS inference. A GWAS-SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT).
Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes. The models studied here can incorporate both continuous and binary responses, and can account for serial and cluster correlations. We consider Bayesian estimation for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample from the posterior distribution.
Pleiotropy, which occurs when a single genetic factor influences multiple phenotypes, is present in many genetic studies of complex human traits. Longitudinal family data, such as the Genetic Analysis Workshop 18 data, combine the features of longitudinal studies in individuals and cross-sectional studies in families, thus providing richer information about the genetic and environmental factors associated with the trait of interest. We recently proposed a Bayesian latent variable methodology for the study of pleiotropy, in the presence of longitudinal and family correlation.
In focused studies designed to follow up associations detected in a genome-wide association study (GWAS), investigators can proceed to fine-map a genomic region by targeted sequencing or dense genotyping of all variants in the region, aiming to identify a functional sequence variant. For the analysis of a quantitative trait, we consider a Bayesian approach to fine-mapping study design that incorporates stratification according to a promising GWAS tag SNP in the same region. Improved cost-efficiency can be achieved when the fine-mapping phase incorporates a two-stage design, with identification of a smaller set of more promising variants in a subsample taken in stage 1, followed by their evaluation in an independent stage 2 subsample.
By systematic examination of common tag single-nucleotide polymorphisms (SNPs) across the genome, the genome-wide association study (GWAS) has proven to be a successful approach to identify genetic variants that are associated with complex diseases and traits. Although the per base pair cost of sequencing has dropped dramatically with the advent of the next-generation technologies, it may still only be feasible to obtain DNA sequence data for a portion of available study subjects due to financial constraints. Two-phase sampling designs have been used frequently in large-scale surveys and epidemiological studies where certain variables are too costly to be measured on all subjects.
The study of dependence between random variables is a mainstay in statistics. In many cases, the strength of dependence between two or more random variables varies according to the values of a measured covariate. We propose inference for this type of variation using a conditional copula model where the copula function belongs to a parametric copula family and the copula parameter varies with the covariate.
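As a rough illustration of a copula parameter that varies with a covariate, the sketch below uses a Clayton copula whose parameter is tied to a covariate x through a log link. The Clayton family, the link theta(x) = exp(a + b x), and the names `clayton_theta`, `kendall_tau`, and `sample_pair` are assumptions chosen for illustration; the abstract does not say which family or calibration function the authors use.

```python
import math
import random


def clayton_theta(x, a=0.0, b=1.0):
    """Hypothetical calibration: log link keeps the Clayton parameter positive."""
    return math.exp(a + b * x)


def kendall_tau(theta):
    """Kendall's tau implied by a Clayton copula with parameter theta > 0."""
    return theta / (theta + 2.0)


def sample_pair(x, rng, a=0.0, b=1.0):
    """Draw (u, v) from the Clayton copula whose parameter depends on x.

    Uses conditional inversion: draw u and an auxiliary uniform w, then
    solve the conditional distribution of v given u for v.
    """
    theta = clayton_theta(x, a, b)
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
    w = 1.0 - rng.random()
    v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
    return u, v
```

Under this assumed link, larger values of x give a larger theta and therefore stronger lower-tail dependence between the two variables; `kendall_tau(clayton_theta(x))` summarizes that strength at a given covariate value.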
This paper considers inference methods for case-control logistic regression in longitudinal setups. The motivation is provided by an analysis of plains bison spatial location as a function of habitat heterogeneity. The sampling is done according to a longitudinal matched case-control design in which, at certain time points, exactly one case, the actual location of an animal, is matched to a number of controls, the alternative locations that could have been reached.
IEEE Trans Image Process
July 2006
Cellular automata are discrete dynamical systems which evolve on a discrete grid. Recent studies have shown that cellular automata with relatively simple rules can produce highly complex patterns. We develop likelihood-based methods for estimating rules of cellular automata aimed at the re-generation of observed regular patterns.
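To make the idea of estimating a rule from an observed pattern concrete, here is a minimal sketch for a one-dimensional binary cellular automaton with three-cell neighbourhoods and periodic boundaries. That setting, and the names `rule_table`, `step`, and `estimate_rule`, are assumptions for illustration only; the abstract does not specify the grid, the neighbourhood, or the likelihood. If each cell's next state is treated as an independent Bernoulli draw given its neighbourhood, the maximum likelihood estimate of each transition probability is the observed frequency.

```python
def rule_table(number):
    """Lookup table for an elementary CA rule given its Wolfram number (0-255)."""
    table = {}
    for idx in range(8):
        neighbourhood = ((idx >> 2) & 1, (idx >> 1) & 1, idx & 1)
        table[neighbourhood] = (number >> idx) & 1
    return table


def step(cells, rule):
    """Advance the automaton one time step with periodic boundaries."""
    n = len(cells)
    return [rule[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])] for i in range(n)]


def estimate_rule(history):
    """Estimate P(next state = 1 | neighbourhood) by observed frequency.

    history is a list of successive grid states; neighbourhoods that never
    occur in the data are absent from the result.
    """
    counts = {}
    for prev, nxt in zip(history, history[1:]):
        n = len(prev)
        for i in range(n):
            nb = (prev[(i - 1) % n], prev[i], prev[(i + 1) % n])
            ones, total = counts.get(nb, (0, 0))
            counts[nb] = (ones + nxt[i], total + 1)
    return {nb: ones / total for nb, (ones, total) in counts.items()}
```

For a deterministic rule observed without noise, every estimated probability is exactly 0 or 1, so rounding the estimates recovers the rule entries for all neighbourhoods that appear in the data.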
The multiplicity problem has become increasingly important in genetic studies as the capacity for high-throughput genotyping has increased. The control of False Discovery Rate (FDR) (Benjamini and Hochberg. [1995] J.
Lifetime Data Anal
March 2006
The competing risks model is useful in settings in which individuals/units may die/fail for different reasons. The cause specific hazard rates are taken to be piecewise constant functions. A complication arises when some of the failures are masked within a group of possible causes.
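A small sketch of what piecewise-constant cause-specific hazards imply: the cumulative hazard for each cause is a sum of rate times exposure over the intervals, and overall survival is the exponential of minus the sum across causes. The function names and the interval layout (`cuts` starting at 0, with the last rate applying indefinitely) are assumptions for illustration, and the sketch does not address masked causes.

```python
import math


def cumulative_hazard(t, cuts, rates):
    """Integrate a piecewise-constant hazard from 0 to t.

    cuts holds increasing interval start points beginning at 0;
    rates[k] applies on [cuts[k], cuts[k + 1]), and the last rate
    applies from the final cut point onward.
    """
    total = 0.0
    for k, rate in enumerate(rates):
        start = cuts[k]
        end = cuts[k + 1] if k + 1 < len(cuts) else math.inf
        if t <= start:
            break
        total += rate * (min(t, end) - start)
    return total


def overall_survival(t, cuts, rates_by_cause):
    """Probability of surviving all causes up to time t."""
    total = sum(cumulative_hazard(t, cuts, rates) for rates in rates_by_cause.values())
    return math.exp(-total)
```

For example, with `cuts = [0, 1, 3]` and rates `[0.1, 0.2, 0.5]` for one cause, the cumulative hazard at time 2 is 0.1 × 1 + 0.2 × 1 = 0.3.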