Publications by authors named "Jiguo Cao"

Background: Dealing with the high dimension of both neuroimaging data and genetic data is a difficult problem in the association of genetic data to neuroimaging. In this article, we tackle the latter problem with an eye toward developing solutions that are relevant for disease prediction. Supported by a vast literature on the predictive power of neural networks, our proposed solution uses neural networks to extract from neuroimaging data features that are relevant for predicting Alzheimer's Disease (AD) for subsequent relation to genetics.

Screening mammography is the primary preventive strategy for early detection of breast cancer and an essential input to breast cancer risk prediction and application of prevention/risk management guidelines. Identifying regions of interest within mammogram images that are associated with 5- or 10-year breast cancer risk is therefore clinically meaningful. The problem is complicated by the irregular boundary issue posed by the semi-circular domain of the breast area within mammograms.

Dementia of Alzheimer's Type (DAT) is a complex disorder influenced by numerous factors, and it is difficult to predict individual progression trajectory from normal or mildly impaired cognition to DAT. An in-depth examination of multiple modalities of data may yield an accurate estimate of time-to-conversion to DAT for preclinical subjects at various stages of disease development. We used a deep-learning model designed for survival analyses to predict subjects' time-to-conversion to DAT using the baseline data of 401 subjects with 63 features from MRI, genetic, and CDC (Cognitive tests, Demographic, and CSF) data in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database.

The selection of the smoothing parameter is central to the estimation of penalized splines. The best value of the smoothing parameter is often the one that optimizes a smoothness selection criterion, such as generalized cross-validation error (GCV) or restricted likelihood (REML). To correctly identify the global optimum rather than being trapped in an undesired local optimum, grid search is recommended for optimization.
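
As a rough illustration of grid search over a smoothing parameter, the sketch below fits a penalized linear spline and evaluates GCV on a fixed grid of candidate values. It is not the method of the article: the truncated-line basis, the ridge-type penalty, the knot placement, the grid range, and the function name `gcv_grid_search` are all assumptions chosen for brevity.

```python
import numpy as np

def gcv_grid_search(x, y, num_knots=20, lambdas=None):
    """Pick a smoothing parameter for a penalized linear spline by GCV grid search.

    Assumed setup (not from the article): truncated-line basis with a
    ridge penalty on the knot coefficients only.
    """
    if lambdas is None:
        lambdas = 10.0 ** np.linspace(-6, 4, 50)  # hypothetical log-spaced grid
    # Interior knots at equally spaced quantiles of x.
    knots = np.quantile(x, np.linspace(0, 1, num_knots + 2)[1:-1])
    # Design matrix: intercept, slope, and one truncated line per knot.
    B = np.column_stack(
        [np.ones_like(x), x, np.maximum(x[:, None] - knots[None, :], 0.0)]
    )
    # Penalize only the truncated-line coefficients.
    D = np.diag([0.0, 0.0] + [1.0] * num_knots)
    n = len(y)
    BtB = B.T @ B
    scores = np.empty(len(lambdas))
    for i, lam in enumerate(lambdas):
        A = np.linalg.solve(BtB + lam * D, B.T)  # maps y to coefficients
        fitted = B @ (A @ y)
        edf = np.sum(B * A.T)                    # trace of the hat matrix
        rss = np.sum((y - fitted) ** 2)
        scores[i] = n * rss / (n - edf) ** 2     # GCV criterion
    best = int(np.argmin(scores))
    return lambdas[best], scores

# Simulated data for demonstration only.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 200)
best_lambda, gcv_scores = gcv_grid_search(x, y)
print("selected lambda:", best_lambda)
```

Because every grid point is evaluated, the minimum returned is the global minimum over the grid, which is the property that motivates grid search over a local optimizer.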

We address two computational issues common to open-population N-mixture models, hidden integer-valued autoregressive models, and some hidden Markov models. The first issue is computation time, which can be dramatically improved through the use of a fast Fourier transform. The second issue is tractability of the model likelihood function for large numbers of hidden states, which can be solved by improving numerical stability of calculations.
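
To give a sense of why a fast Fourier transform helps, the sketch below computes the distribution of a sum of two independent count variables by convolving their probability mass functions in the frequency domain. This is only a generic illustration of FFT-based convolution, not the authors' algorithm; the function name `convolve_pmf_fft` and the toy probability vectors are assumptions, and the numerical-stability improvements mentioned in the abstract are not shown.

```python
import numpy as np

def convolve_pmf_fft(p, q):
    """PMF of X + Y for independent counts X ~ p and Y ~ q, via the FFT.

    p[k] and q[k] are P(X = k) and P(Y = k); both are assumed to sum to 1.
    """
    n = len(p) + len(q) - 1               # support size of the sum
    size = 1 << (n - 1).bit_length()      # pad to a power of two
    spectrum = np.fft.rfft(p, size) * np.fft.rfft(q, size)
    out = np.fft.irfft(spectrum, size)[:n]
    out = np.clip(out, 0.0, None)         # remove tiny negative round-off
    return out / out.sum()                # renormalize after clipping

# Toy distributions for demonstration only.
p = np.array([0.2, 0.5, 0.3])
q = np.array([0.6, 0.4])
pmf_sum = convolve_pmf_fft(p, q)
print(pmf_sum)
```

Direct convolution of two length-K vectors costs on the order of K^2 operations, while the FFT route costs on the order of K log K, which matters when the number of hidden states is large.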

Ordinary differential equations (ODEs) are widely used to characterize the dynamics of complex systems in real applications. In this article, we propose a novel joint estimation approach for generalized sparse additive ODEs where observations are allowed to be non-Gaussian. The new method is unified with existing collocation methods by considering the likelihood, ODE fidelity and sparse regularization simultaneously.

Time-varying biomarkers reflect important information on disease progression over time. Dynamic prediction for event occurrence on a real-time basis, utilizing time-varying information, is crucial in making accurate clinical decisions. Functional principal component analysis (FPCA) has been widely adopted in the literature for extracting features from time-varying biomarker trajectories.
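
For readers unfamiliar with FPCA, the following minimal sketch extracts principal component functions and scores from curves observed on a common, equally spaced grid. It assumes dense, complete observations, which is simpler than the biomarker setting described here; the function name `fpca_dense` and the simulated curves are illustrative assumptions only.

```python
import numpy as np

def fpca_dense(curves, grid, num_components=2):
    """FPCA for curves of shape (n_subjects, n_grid) on an equally spaced grid."""
    dt = grid[1] - grid[0]
    mean = curves.mean(axis=0)
    centered = curves - mean
    # Sample covariance surface evaluated on the grid.
    cov = centered.T @ centered / (curves.shape[0] - 1)
    # Discretized eigenproblem: integral of C(s, t) phi(t) dt = lambda phi(s).
    eigvals, eigvecs = np.linalg.eigh(cov * dt)
    order = np.argsort(eigvals)[::-1][:num_components]
    phi = eigvecs[:, order] / np.sqrt(dt)  # scaled so each phi has unit L2 norm
    scores = centered @ phi * dt           # integral of (x - mean) * phi
    return mean, eigvals[order], phi, scores

# Simulated trajectories for demonstration only.
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 101)
a = rng.normal(0.0, 2.0, 300)
b = rng.normal(0.0, 0.5, 300)
curves = (a[:, None] * np.sqrt(2) * np.sin(2 * np.pi * grid)
          + b[:, None] * np.sqrt(2) * np.cos(2 * np.pi * grid))
mean_fn, fpc_vars, phi, fpc_scores = fpca_dense(curves, grid, 2)
print("leading eigenvalues:", fpc_vars)
```

The per-subject scores are the low-dimensional features that a downstream prediction model would use in place of the full trajectory.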

Background: The increasing availability of databases containing both magnetic resonance imaging (MRI) and genetic data allows researchers to utilize multimodal data to better understand the characteristics of dementia of Alzheimer's type (DAT).

Objective: The goal of this study was to develop and analyze novel biomarkers that can help predict the development and progression of DAT.

Methods: We used feature selection and an ensemble learning classifier to develop an image/genotype-based DAT score that represents a subject's likelihood of developing DAT in the future.

Purpose: To determine whether generalised additive models of location, scale and shape (GAMLSS) developed for pulmonary diffusing capacity are superior to segmented (piecewise) regression models, and to update reference equations for pulmonary diffusing capacity for carbon monoxide (DLCO) and nitric oxide (DLNO), which may be affected by the equipment used for its measurement.

Methods: Data were pooled from five studies that developed reference equations for DLCO and DLNO (n=530 F/546 M; 5-95 years old, body mass index 12.4-39.

Screening mammography aims to identify breast cancer early and secondarily measures breast density to classify women at higher or lower than average risk for future breast cancer in the general population. Despite the strong association of individual mammography features to breast cancer risk, the statistical literature on mammogram imaging data is limited. While functional principal component analysis (FPCA) has been studied in the literature for extracting image-based features, it is conducted independently of the time-to-event response variable.

Existing survival models involving functional covariates typically rely on the Cox proportional hazards structure and the assumption of right censorship. Motivated by the aim of predicting the time of conversion to Alzheimer's disease from sparse biomarker trajectories in patients with mild cognitive impairment, we propose a functional mixture cure rate model with both functional and scalar covariates for interval censoring and sparsely sampled functional data. To estimate the nonparametric coefficient function that depicts the effect of the shape of the trajectories on the survival outcome and cure probability, we utilize the functional principal component analysis to extract the functional features from the sparsely and irregularly sampled trajectories.

In many clinical studies, longitudinal biomarkers are often used to monitor the progression of a disease. For example, in a kidney transplant study, the glomerular filtration rate (GFR) is used as a longitudinal biomarker to monitor the progression of the kidney function and the patient's state of survival that is characterized by multiple time-to-event outcomes, such as kidney transplant failure and death. It is known that the joint modelling of longitudinal and survival data leads to a more accurate and comprehensive estimation of the covariates' effect.

With mammography being the primary breast cancer screening strategy, it is essential to make full use of the mammogram imaging data to better identify women who are at higher and lower than average risk. Our primary goal in this study is to extract mammogram-based features that augment the well-established breast cancer risk factors to improve prediction accuracy. In this article, we propose a supervised functional principal component analysis (sFPCA) over triangulations method for extracting features that are ordered by the magnitude of association with the failure time outcome.

Article Synopsis
  • This paper focuses on improving outcomes for patients with chronic kidney disease after kidney transplantation by exploring how to prolong the life of donated kidneys, which are in high demand due to a large number of patients waiting for transplants.
  • The research aims to monitor kidney function using the glomerular filtration rate and address multiple patient outcomes related to transplant failure and survival, which traditional models may not effectively analyze together.
  • A new functional joint model is proposed that includes a longitudinal sub-model and a competing-risks sub-model, allowing for a more comprehensive analysis of patient data and offering valuable insights for clinical practices in post-transplant care.
Alzheimer's disease (AD) is a prevalent age-related neurodegenerative disease worldwide, with no cure yet available. Early prognosis is therefore crucial for planning proper clinical intervention. This is especially true for people diagnosed with mild cognitive impairment, for whom the prediction of whether and when disease onset will occur is particularly valuable.

Ordinary differential equation (ODE) models are widely used to describe complex dynamical systems. When estimating ODE parameters from noisy data, a common distributional assumption is the Gaussian distribution. The Gaussian distribution is known not to be robust when abnormal data exist.

We develop a Bayesian bivariate spatial model for multivariate regression analysis applicable to studies examining the influence of genetic variation on brain structure. Our model is motivated by an imaging genetics study of the Alzheimer's Disease Neuroimaging Initiative (ADNI), where the objective is to examine the association between images of volumetric and cortical thickness values summarizing the structure of the brain as measured by magnetic resonance imaging (MRI) and a set of 486 single nucleotide polymorphisms (SNPs) from 33 Alzheimer's disease (AD) candidate genes obtained from 632 subjects. A bivariate spatial process model is developed to accommodate the correlation structures typically seen in structural brain imaging data.

Unlike chemotherapy, the maximum tolerated dose (MTD) of molecularly targeted agents and immunotherapy may not offer significant clinical benefit over lower doses. By simultaneously considering both toxicity and efficacy endpoints, phase I/II trials can identify a more clinically meaningful dose for subsequent phase II trials than traditional toxicity-based phase I trials in terms of risk-benefit tradeoff. To strengthen and simplify the current practice of phase I/II trials, we propose a utility-based toxicity probability interval (uTPI) design for finding the optimal biological dose, based on a numerical utility that provides a clinically meaningful, one-dimensional summary representation of the patient's bivariate toxicity and efficacy outcome.

In longitudinal studies, the values of biomarkers are often informatively missing due to dropout. The conventional functional principal component analysis typically disregards the missing information and simply treats the unobserved data points as missing completely at random. As a result, the estimation of the mean function and the covariance surface might be biased, resulting in a biased estimation of the functional principal components.

We conduct an imaging genetics study to explore how effective brain connectivity in the default mode network (DMN) may be related to genetics within the context of Alzheimer's disease and mild cognitive impairment. We develop an analysis of longitudinal resting-state functional magnetic resonance imaging (rs-fMRI) and genetic data obtained from a sample of 111 subjects with a total of 319 rs-fMRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. A Dynamic Causal Model (DCM) is fit to the rs-fMRI scans to estimate effective brain connectivity within the DMN and related to a set of single nucleotide polymorphisms (SNPs) contained in an empirical disease-constrained set which is obtained out-of-sample from 663 ADNI subjects having only genome-wide data.

Background: Research has argued that daily excessive concentration hours (DECH) could be more informative, by accounting for within-day variations, when assessing population-level exposure to ambient fine particulate matter (PM2.5). However, few studies have comparatively investigated PM2.5-associated risks using DECH and the two common metrics of daily mean and hourly peak concentration.

Methods: We collected daily records of all-cause emergency department visits (EDVs) and hourly data on air pollutants and meteorological factors from Shenzhen, China, 2015-2018.

We propose a copula-based approach for analyzing functional data with correlated multiple functional outcomes exhibiting heterogeneous shape characteristics. To accommodate the possibly large number of parameters due to having several functional outcomes, parameter estimation is performed in two steps: first, the parameters for the marginal distributions are estimated using the skew t family, and then the dependence structure both within and across outcomes is estimated using a Gaussian copula. We develop an estimation algorithm for the dependence parameters based on the Karhunen-Loève expansion and an EM algorithm that significantly reduces the dimension of the problem and is computationally efficient.

When analyzing large multicenter databases, the effects of multiple confounding covariates increase the variability in the data and may reduce the ability to detect changes due to the actual effect of interest, for example, changes due to disease. Efficient ways to evaluate the effect of covariates toward the data harmonization are therefore important. In this article, we showcase techniques to assess the "goodness of harmonization" of covariates.

This article is motivated by jointly modelling longitudinal and time-to-event clinical data of patients with diabetes and end-stage renal disease. All patients are on the waiting list for the pancreas transplant after kidney transplant, and some of them have a pancreas transplant before kidney transplant failure or death. Scant literature has studied the dynamical joint relationship of the estimated glomerular filtration rates trajectory, the effect of pancreas transplant, and time-to-event outcomes, although it remains an important clinical question.

Speech perception involves multiple input modalities. Research has indicated that perceivers establish cross-modal associations between auditory and visuospatial events to aid perception. Such intermodal relations can be particularly beneficial for speech development and learning, where infants and non-native perceivers need additional resources to acquire and process new sounds.
