DNA methylation (DNAm) is a chemical modification of DNA that can be influenced by various factors, including age, environment, and lifestyle. An epigenetic clock is a predictive tool that measures biological age based on DNAm levels. It can provide insights into an individual's biological age, which may differ from their chronological age.
Data sharing barriers pose major challenges in multicenter clinical studies, where multiple data sources are stored and managed in a distributed fashion at different local study sites. Merging such data sources into a common data storage for a centralized statistical analysis requires a data use agreement, which is often time-consuming. Data merging may become more burdensome when propensity score modeling is involved in the analysis because it requires combining many confounding variables, and the systematic incorporation of this additional modeling in a meta-analysis has not been thoroughly investigated in the literature.
Background: Iron and vitamin D deficiencies have been implicated in sleep disturbance. Although females are more susceptible to these deficiencies and frequently report sleep-related issues, few studies have examined these associations in females.
Objective: This study investigates the association of iron and vitamin D deficiencies with sleep in a nationally representative sample of females of reproductive age.
This paper develops an incremental learning algorithm based on the quadratic inference function (QIF) to analyze streaming datasets with correlated outcomes such as longitudinal data and clustered data. We propose a renewable QIF (RenewQIF) method within a paradigm of renewable estimation and incremental inference, in which parameter estimates are recursively renewed with current data and summary statistics of historical data, but with no use of any historical subject-level raw data. We compare our renewable estimation method with both the offline QIF and offline generalized estimating equations (GEE) approaches, which process the entire cumulative subject-level data all together, and show theoretically and numerically that our renewable procedure enjoys statistical and computational efficiency.
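The renewable-estimation idea can be illustrated with a much simpler model than QIF. The sketch below is not the RenewQIF algorithm: it swaps in ordinary least squares for a single covariate, and the class and function names (`RenewableOLS`, `offline_ols`) are hypothetical. It only shows how an estimate can be renewed from the current batch plus summary statistics of past batches, without keeping any historical raw data.

```python
# Illustrative sketch only: renewable least squares, NOT the RenewQIF method.
# All names here are hypothetical and chosen for this example.
from dataclasses import dataclass


@dataclass
class RenewableOLS:
    """Fit y = b0 + b1 * x batch by batch, storing only summary statistics."""
    n: float = 0.0    # number of observations seen so far
    sx: float = 0.0   # running sum of x
    sy: float = 0.0   # running sum of y
    sxx: float = 0.0  # running sum of x * x
    sxy: float = 0.0  # running sum of x * y

    def update(self, xs, ys):
        """Fold the current batch into the summaries; raw data are then discarded."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y
        return self.estimate()

    def estimate(self):
        """Solve the 2x2 normal equations using the summaries alone."""
        det = self.n * self.sxx - self.sx ** 2
        if det == 0:
            raise ValueError("need at least two distinct x values")
        b1 = (self.n * self.sxy - self.sx * self.sy) / det
        b0 = (self.sy - b1 * self.sx) / self.n
        return b0, b1


def offline_ols(xs, ys):
    """Reference fit that processes the full cumulative data at once."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - b1 * mx, b1


# Two batches arrive in sequence; the model never revisits the first batch.
batches = [([0.0, 1.0, 2.0], [1.0, 3.1, 4.9]), ([3.0, 4.0], [7.2, 9.0])]
model = RenewableOLS()
for xs, ys in batches:
    streamed = model.update(xs, ys)

all_x = [x for xs, _ in batches for x in xs]
all_y = [y for _, ys in batches for y in ys]
pooled = offline_ols(all_x, all_y)
```

For least squares the streamed and pooled estimates agree up to floating-point error, because the summaries are sufficient for the estimator. Correlated outcomes and estimating-equation methods need more careful summaries than the five running sums used here.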
This paper is motivated by a regression analysis of electroencephalography (EEG) neuroimaging data with high-dimensional correlated responses with multi-level nested correlations. We develop a divide-and-conquer procedure implemented in a fully distributed and parallelized computational scheme for statistical estimation and inference of regression parameters. Despite significant efforts in the literature, the computational bottleneck associated with high-dimensional likelihoods prevents the scalability of existing methods.
Epigenetic modifications, such as DNA methylation, influence gene expression and cardiometabolic phenotypes that manifest in later developmental periods, including adolescence. Untargeted metabolomics analyses provide a comprehensive snapshot of physiological processes and metabolism and have been related to DNA methylation in adults, offering insights into the regulatory networks that influence cellular processes. We analyzed the cross-sectional correlation of blood leukocyte DNA methylation with 3758 serum metabolite features (574 of which are identifiable) in 238 children (ages 8-14 years) from the Early Life Exposures in Mexico to Environmental Toxicants (ELEMENT) study.
To classify the association between the maternal lipidome and DNA methylation in cord blood leukocytes. Untargeted lipidomics was performed on first trimester maternal plasma (M1) and delivery maternal plasma (M3) in 100 mothers from the Michigan Mother-Infant Pairs cohort. Cord blood leukocyte DNA methylation was profiled using the Infinium EPIC bead array and empirical Bayes modeling identified differential DNA methylation related to maternal lipid groups.
We propose a distributed method for simultaneous inference for datasets with sample size much larger than the number of covariates, i.e., n ≫ p, in the generalized linear models framework.
Multi-compartment models have been playing a central role in modelling infectious disease dynamics since the early 20th century. They are a class of mathematical models widely used for describing the mechanism of an evolving epidemic. Integrated with certain sampling schemes, such mechanistic models can be applied to analyse public health surveillance data, such as assessing the effectiveness of preventive measures (e.
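As a rough illustration of what a multi-compartment model looks like, the sketch below simulates the classical susceptible-infectious-recovered (SIR) model with a simple Euler scheme. The choice of SIR, the function name `simulate_sir`, and all parameter values are assumptions made for this example; the abstract does not specify which compartments or sampling schemes the paper uses.

```python
# Illustrative sketch only: a deterministic SIR model integrated with Euler steps.
# The model choice and every parameter value are assumptions for this example.

def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Return daily (S, I, R) values for a closed population.

    beta  : transmission rate per day
    gamma : recovery rate per day
    dt    : Euler step size in days
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r                      # total population stays constant
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt   # flow S -> I
            new_recoveries = gamma * i * dt          # flow I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical scenario: 1000 people, 10 initially infectious.
baseline = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=60)
# A preventive measure is mimicked here by a lower transmission rate.
intervention = simulate_sir(beta=0.15, gamma=0.1, s0=990, i0=10, r0=0, days=60)
```

Comparing the two trajectories, for instance the final recovered counts, gives a crude view of how a change in the transmission rate alters the epidemic. Fitting such a model to surveillance data requires a statistical layer that this sketch omits.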
As proof of concept, we simulate a revised kidney allocation system that includes deceased donor (DD) kidneys as chain-initiating kidneys (DD-CIK) in a kidney paired donation pool (KPDP), and estimate potential increases in the number of transplants. We consider chains of length 2 in which the DD-CIK gives to a candidate in the KPDP, and that candidate's incompatible donor donates to the DD waitlist. In simulations, we vary initial pool size, arrival rates of candidate/donor pairs and (living) nondirected donors (NDDs), and delay time from entry to the KPDP until a candidate is eligible to receive a DD-CIK.
Stratification is a very commonly used approach in biomedical studies to handle sample heterogeneity arising from, for example, clinical units, patient subgroups, or missing data. A key rationale behind such an approach is to overcome potential sampling biases in statistical inference. Two issues with such a stratification-based strategy are (i) whether individual strata are sufficiently distinctive to warrant stratification, and (ii) whether sample size attrition resulting from the stratification may lead to a loss of statistical power.
Directed acyclic mixed graphs (DAMGs) provide a useful representation of network topology with both directed and undirected edges subject to the restriction of no directed cycles in the graph. This graphical framework may arise in many biomedical studies, for example, when a directed acyclic graph (DAG) of interest is contaminated with undirected edges induced by some unobserved confounding factors (eg, unmeasured environmental factors). Directed edges in a DAG are widely used to evaluate causal relationships among variables in a network, but detecting them is challenging when the underlying causality is obscured by some shared latent factors.
Background And Objectives: The aim in kidney paired donation (KPD) is typically to maximize the number of transplants achieved through the exchange of donors in a pool comprising incompatible donor-candidate pairs and non-directed (or altruistic) donors. With many possible options in a KPD pool at any given time, the most appropriate set of exchanges cannot be determined by simple inspection. In practice, computer algorithms are used to determine the optimal set of exchanges to pursue.
The linear mixed-effects model (LMM) is widely used in the analysis of clustered or longitudinal data. This paper aims to address analytic challenges arising from estimation and selection in the application of the LMM to high-dimensional longitudinal data. We develop a doubly regularized approach in the LMM to simultaneously select fixed and random effects.
In kidney paired donation (KPD), incompatible donor-candidate pairs and non-directed (also known as altruistic) donors are pooled together with the aim of maximizing the total utility of transplants realized via donor exchanges. We consider a setting in which disjoint sets of potential transplants are selected at regular intervals, with fallback options available within each proposed set in the case of individual donor, candidate or match failure. We develop methods for calculating the expected utility for such sets under a realistic probability model for the KPD.
Childhood diet has been implicated in timing of sexual maturation. A key limitation of published studies is the focus on individual foods rather than patterns. We hypothesized that dietary patterns characterized by fruits and vegetables during early childhood (age 3 years) would be associated with delayed pubertal timing, whereas energy-dense and meat-based dietary patterns would relate to earlier puberty.
Identifying novel biomarkers to predict renal graft survival is important in post-transplant clinical practice. Serum creatinine, currently the most popular surrogate biomarker, offers limited information about the underlying allograft profiles. It is known to perform unsatisfactorily in predicting renal function.
While there is a growing need for kidney transplants to treat end stage kidney disease, transplantable kidneys are in seriously short supply. Kidney paired donation (KPD) programs serve as platforms for candidates with willing but incompatible donors to assess the possibility of exchanging donors, thus opening up new transplant opportunities for these candidates. In recent years, non-directed (or altruistic) donors (NDDs) have been incorporated into KPD programs, beginning chains of transplants that benefit many candidates.
Background And Objectives: Outcomes for transplants from living unrelated donors are of particular interest in kidney paired donation (KPD) programs where exchanges can be arranged between incompatible donor-recipient pairs or chains created from nondirected/altruistic donors.
Design, Setting, Participants, & Measurements: Using Scientific Registry of Transplant Recipients data, we analyzed 232,705 recipients of kidney-alone transplants from 1998 to 2012. Graft failure rates were estimated using Cox models for recipients of kidney transplants from living unrelated, living related, and deceased donors.
The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors.
This paper concerns regression methodology for assessing relationships between multi-dimensional response variables and covariates that are correlated within a network. To address analytical challenges associated with the integration of network topology into the regression analysis, we propose a hybrid quadratic inference method that uses both prior and data-driven correlations among network nodes. A Godambe information-based tuning strategy is developed to allocate weights between the prior and data-driven network structures, so the estimator is efficient.
Combining multiple studies is frequently undertaken in biomedical research to increase sample sizes and thereby improve statistical power. We consider the marginal model for the regression analysis of repeated measurements collected in several similar studies with potentially different variances and correlation structures. It is of great importance to examine whether there exist common parameters across study-specific marginal models so that simpler models, sensible interpretations, and meaningful efficiency gains can be obtained.
Merging multiple datasets collected from studies with identical or similar scientific objectives is often undertaken in practice to increase statistical power. This article concerns the development of an effective statistical method that enables the merging of multiple longitudinal datasets subject to various heterogeneous characteristics, such as different follow-up schedules and study-specific missing covariates (e.g.