Publications by authors named "Dipak K Dey"

Spatial process models are widely used for modeling point-referenced variables arising from diverse scientific domains. Analyzing the resulting random surface provides deeper insights into the nature of latent dependence within the studied response. We develop Bayesian modeling and inference for rapid changes on the response surface to assess directional curvature along a given trajectory.

Human immunodeficiency virus (HIV) dynamics have been a focus of epidemiological and biostatistical research over the past decades, with the aim of understanding the progression of acquired immunodeficiency syndrome (AIDS) in the population. Although there are several approaches for modeling HIV dynamics, one of the most popular is based on Gaussian mixed-effects models because of their simplicity of implementation and interpretation. However, in some situations, Gaussian mixed-effects models cannot (a) capture the serial correlation present in longitudinal data, (b) deal properly with missing observations, or (c) accommodate the skewness and heavy tails frequently present in patients' profiles.

The gamma distribution has been used extensively in many areas of application. In this paper, adopting a Bayesian analysis, we provide necessary and sufficient conditions for checking whether improper priors lead to proper posterior distributions. We also discuss sufficient conditions for verifying that the resulting posterior moments are finite.

Spatio-temporal Poisson models are commonly used for disease mapping. However, after incorporating the spatial and temporal variation, the data do not necessarily have equal mean and variance, suggesting either over- or under-dispersion. In this paper, we propose the spatio-temporal Conway-Maxwell-Poisson model.
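
To illustrate why the Conway-Maxwell-Poisson family can handle both over- and under-dispersion, here is a minimal sketch of its textbook probability mass function, P(Y = y) proportional to lam^y / (y!)^nu. It is not the spatio-temporal model from the abstract; the function names, the parameter names `lam` and `nu`, and the truncation of the normalizing series at `max_terms` are all assumptions made for illustration.

```python
import math


def com_poisson_pmf(y, lam, nu, max_terms=200):
    """P(Y = y) under a Conway-Maxwell-Poisson law (illustrative sketch).

    The normalizing constant is an infinite series; here it is truncated
    at max_terms terms, which is an assumption, not part of the source.
    """
    # Log of the unnormalized weights lam**j / (j!)**nu for j = 0, 1, ...
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    # Log-sum-exp for a numerically stable normalizing constant.
    m = max(log_w)
    log_z = m + math.log(sum(math.exp(v - m) for v in log_w))
    return math.exp(y * math.log(lam) - nu * math.lgamma(y + 1) - log_z)


def mean_and_variance(lam, nu, max_terms=200):
    """Mean and variance computed by direct summation over the truncated support."""
    probs = [com_poisson_pmf(y, lam, nu, max_terms) for y in range(max_terms)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var
```

With `nu = 1` the distribution reduces to the ordinary Poisson (mean equals variance); `nu > 1` gives a variance below the mean, and `nu < 1` gives a variance above it.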

The two-part model and the Tweedie model are two essential methods for analyzing positive continuous, zero-augmented responses. Compared with other continuous zero-augmented models, the zero-augmented gamma (ZAG) model performs well on data with a large mass of zeros. In this article, we compare Bayesian models for continuous data with excess zeros by considering the ZAG and Tweedie models.
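
As a rough illustration of the two-part idea behind a zero-augmented gamma model, the sketch below evaluates the log-density of a mixture with a point mass at zero and a gamma density on the positive values. The parameterization (`p_zero`, `shape`, `rate`) and the function name are assumptions for illustration; the abstract does not specify the authors' exact formulation or priors.

```python
import math


def zag_log_density(y, p_zero, shape, rate):
    """Log-density of a zero-augmented gamma observation (illustrative sketch).

    p_zero is the probability of an exact zero; positive values follow a
    gamma distribution with the given shape and rate (assumed parameterization).
    """
    if y < 0:
        raise ValueError("y must be non-negative")
    if y == 0:
        # Point mass at zero.
        return math.log(p_zero)
    # Gamma log-density for the positive part.
    log_gamma_pdf = (
        shape * math.log(rate)
        - math.lgamma(shape)
        + (shape - 1) * math.log(y)
        - rate * y
    )
    return math.log1p(-p_zero) + log_gamma_pdf
```

Summing this quantity over the observations gives a log-likelihood in which the zero part and the positive part separate, which is what makes the two-part structure convenient to fit.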

The inability to distinguish aggressive from indolent prostate cancer is a longstanding clinical problem. Prostate specific antigen (PSA) tests and digital rectal exams cannot differentiate these forms. Because only ∼10% of diagnosed prostate cancer cases are aggressive, existing practice often results in overtreatment including unnecessary surgeries that degrade patients' quality of life.

Spatial modeling of consumer response data has gained increased interest recently in the marketing literature. In this paper, we extend the (spatial) multi-scale model by incorporating both spatial and temporal dimensions in the dynamic multi-scale spatiotemporal modeling approach. Our empirical application with a US company's catalog purchase data for the period 1997-2001 reveals a nested geographic market structure that spans geopolitical boundaries such as state borders.

Response variables in medical sciences are often bounded, e.g., proportions, rates, or fractions of incidence of some disease.

In this paper, we introduce a new approach to generate flexible parametric families of distributions. These models arise in competitive and complementary risks scenarios, in which the lifetime associated with a particular risk is not observable; rather, we observe only the minimum or maximum lifetime value among all risks. The latent variables have a zero-truncated Poisson distribution.

This paper proposes a Bayesian hierarchical cure rate survival model for spatially clustered time to event data. We consider a mixture cure rate model with covariates and a flexible (semi)parametric baseline survival distribution for uncured individuals. The spatial correlation structure is introduced in the form of frailties which follow a Multivariate Conditionally Autoregressive distribution on a pre-specified map.
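
For orientation, the standard form of a mixture cure rate model can be written as below. This is the textbook formulation, not necessarily the exact specification used in the paper; the symbols $\pi(x)$ (cured fraction given covariates $x$) and $S_u$ (survival function of uncured individuals) are notation assumed here.

```latex
S_{\mathrm{pop}}(t \mid x) \;=\; \pi(x) \;+\; \bigl(1 - \pi(x)\bigr)\, S_u(t),
\qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t \mid x) = \pi(x).
```

The population survival curve therefore levels off at the cured fraction rather than decaying to zero.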

We developed a Bayes factor based approach for the design of non-inferiority clinical trials with a focus on controlling type I error and power. Historical data are incorporated in the Bayesian design via the power prior discussed in Ibrahim and Chen (2000). The properties of the proposed method are examined in detail.
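
For reference, the power prior of Ibrahim and Chen (2000) has the following general form, where $D_0$ denotes the historical data, $\pi_0(\theta)$ an initial prior, and $a_0$ a discounting parameter. The notation is assumed here; the abstract does not state how $a_0$ is chosen in the proposed design.

```latex
\pi(\theta \mid D_0, a_0) \;\propto\; L(\theta \mid D_0)^{a_0}\, \pi_0(\theta),
\qquad 0 \le a_0 \le 1 .
```

Setting $a_0 = 0$ discards the historical data entirely, while $a_0 = 1$ weights it the same as current data.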

In this paper, we present a Weibull link (skewed) model for categorical response data arising from binomial as well as multinomial models. We show that, for such types of categorical data, the most commonly used models (logit, probit and complementary log-log) can be obtained as limiting cases. We further compare the proposed model with some other asymmetrical models.

Many modern statistical problems can be cast in the framework of multivariate regression, where the main task is to make statistical inference for a possibly sparse and low-rank coefficient matrix. The low-rank structure in the coefficient matrix is of intrinsic multivariate nature, which, when combined with sparsity, can further lift dimension reduction, conduct variable selection, and facilitate model interpretation. Using a Bayesian approach, we develop a unified sparse and low-rank multivariate regression method to both estimate the coefficient matrix and obtain its credible region for making inference.

We develop a general statistical framework for the analysis and inference of large tree-structured data, with a focus on developing asymptotic goodness-of-fit tests. We first propose a consistent statistical model for binary trees, from which we develop a class of invariant tests. Using the model for binary trees, we then construct tests for general trees by using the distributional properties of the Continuum Random Tree, which arises as the invariant limit for a broad class of models for tree-structured data based on conditioned Galton-Watson processes.

In multivariate regression models, a sparse singular value decomposition of the regression component matrix is appealing for reducing dimensionality and facilitating interpretation. However, the recovery of such a decomposition remains very challenging, largely due to the simultaneous presence of orthogonality constraints and co-sparsity regularization. By delving into the underlying statistical data generation mechanism, we reformulate the problem as a supervised co-sparse factor analysis, and develop an efficient computational procedure, named sequential factor extraction via co-sparse unit-rank estimation (SeCURE), that completely bypasses the orthogonality requirements.

Latent class analysis is used to group categorical data into classes via a probability model. Model selection criteria then judge how well the model fits the data. When addressing incomplete data, the current methodology restricts the imputation to a single, pre-specified number of classes.

Our present work proposes a new survival model in a Bayesian context to analyze right-censored survival data for populations with a surviving fraction, assuming that the log failure time follows a generalized extreme value distribution. Many applications require a more flexible modeling of covariate information than a simple linear or parametric form for all covariate effects. It is also necessary to include the spatial variation in the model, since it is sometimes unexplained by the covariates considered in the analysis.

In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously.

In many scientific fields, it is common practice to collect a sequence of 0-1 binary responses from a subject across time, space, or a collection of covariates. Researchers are interested in finding out how the expected binary outcome is related to covariates, and aim to better predict future 0-1 outcomes. Gaussian processes have been widely used to model nonlinear systems; in particular, they can model the latent structure in a binary regression model, allowing a nonlinear functional relationship between covariates and the expectation of binary outcomes.

A Bayesian hierarchical model is developed for count data with spatial and temporal correlations as well as excessive zeros, uneven sampling intensities, and inference on missing spots. Our contribution is to develop a model on zero-inflated count data that provides flexibility in modeling spatial patterns in a dynamic manner and also improves the computational efficiency via dimension reduction. The proposed methodology is of particular importance for studying species presence and abundance in the field of ecological sciences.

Embryonic stem (ES) cells are an important factor in the development of cell-based therapeutic strategies. In this work, the use of digital holographic interferometric microscopy and statistical identification for automatic discrimination of ES cells and fibroblast (FB) cells is discussed in detail. The proposed algorithm first reduces the complex data structure to lower dimensions.

In this paper, we consider a piecewise exponential model (PEM) with random time grid to develop a full semiparametric Bayesian cure rate model. An elegant mechanism enjoying several attractive features for modeling the randomness of the time grid of the PEM is assumed. To model the prior behavior of the failure rates of the PEM we assume a hierarchical modeling approach that allows us to control the degree of parametricity in the right tail of the survival curve.
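
As a simplified illustration of the piecewise exponential building block, the sketch below computes the survival function for a fixed time grid with a constant failure rate on each interval. The paper treats the grid as random and embeds the PEM in a cure rate model; neither feature is reproduced here, and the function and argument names are hypothetical.

```python
import math


def pem_survival(t, grid, rates):
    """Survival probability S(t) under a piecewise exponential model (sketch).

    grid  : increasing interval start points, beginning at 0 (assumed fixed here).
    rates : rates[j] is the constant failure rate on [grid[j], grid[j + 1]);
            the last interval is treated as open-ended.
    """
    cumulative_hazard = 0.0
    for j, rate in enumerate(rates):
        start = grid[j]
        end = grid[j + 1] if j + 1 < len(grid) else math.inf
        if t <= start:
            break
        # Add the rate times the time actually spent in this interval.
        cumulative_hazard += rate * (min(t, end) - start)
    return math.exp(-cumulative_hazard)
```

For example, with `grid = [0, 1, 3]` and `rates = [0.5, 0.2, 0.1]`, the cumulative hazard at `t = 2` is 0.5 * 1 + 0.2 * 1 = 0.7, so `pem_survival(2, grid, rates)` equals `exp(-0.7)`.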

Unhealthy alcohol use is one of the leading causes of morbidity and mortality in the United States. Brief interventions with high-risk drinkers during an emergency department (ED) visit are of great interest due to their possible efficacy and low cost. In a collaborative study with patients recruited at 14 academic EDs across the United States, we examined the self-reported number of drinks per week for each patient following exposure to a brief intervention.

Updating categorical soil maps is necessary for providing current, higher-quality soil data to agricultural and environmental management, but it may not require a costly, thorough field survey because the latest legacy maps may need only limited corrections. This study suggests a Markov chain random field (MCRF) sequential cosimulation (Co-MCSS) method for updating categorical soil maps using limited survey data, provided that qualified legacy maps are available. A case study using synthetic data demonstrates that Co-MCSS can appreciably improve the simulation accuracy of soil types, with contributions from both a legacy map and limited sample data.

Multiplexed biomarker protein detection holds unrealized promise for clinical cancer diagnostics due to lack of suitable measurement devices and lack of rigorously validated protein panels. Here we report an ultrasensitive electrochemical microfluidic array optimized to measure a four-protein panel of biomarker proteins, and we validate the protein panel for accurate oral cancer diagnostics. Unprecedented ultralow detection into the 5-50 fg·mL⁻¹ range was achieved for simultaneous measurement of proteins interleukin 6 (IL-6), IL-8, vascular endothelial growth factor (VEGF), and VEGF-C in diluted serum.
