Regression analyses based on transformations of cumulative incidence functions are often adopted when modeling and testing for treatment effects in clinical trial settings involving competing and semi-competing risks. Common frameworks include the Fine-Gray model and models based on direct binomial regression. Using large sample theory we derive the limiting values of treatment effect estimators based on such models when the data are generated according to multiplicative intensity-based models, and show that the estimand is sensitive to several process features.
In life history analysis of data from cohort studies, it is important to address the process by which participants are identified and selected. Many health studies select or enrol individuals based on whether they have experienced certain health-related events, for example, disease diagnosis or some complication from disease. Standard methods of analysis rely on assumptions concerning the independence of selection and a person's prospective life history process, given their prior history.
Clinical trials with random assignment of treatment provide evidence about causal effects of an experimental treatment compared to standard care. However, when disease processes involve multiple types of possibly semi-competing events, specification of target estimands and causal inferences can be challenging. Intercurrent events such as study withdrawal, the introduction of rescue medication, and death further complicate matters.
The HostSeq initiative recruited 10,059 Canadians infected with SARS-CoV-2 between March 2020 and March 2023, obtained clinical information on their disease experience, and performed whole genome sequencing (WGS) of their DNA. We analyzed the WGS data for genetic contributors to severe COVID-19 (considering 3,499 hospitalized cases and 4,975 non-hospitalized after quality control). We investigated the evidence for replication of loci reported by the International Host Genetics Initiative (HGI); analyzed the X chromosome; and conducted rare variant gene-based analysis and polygenic risk score testing.
To advance scientific understanding of disease processes and related intervention effects, study results should be free from bias and replicable. More broadly, investigators seek results that are transportable, that is, applicable to a perceived study population as well as in other environments and populations. We review fundamental statistical issues that arise in the analysis of observational data from disease cohorts and other sources and discuss how these issues affect the transportability and replicability of research results.
Intensity-based multistate models provide a useful framework for characterizing disease processes, the introduction of interventions, loss to followup, and other complications arising in the conduct of randomized trials studying complex life history processes. Within this framework we discuss the issues involved in the specification of estimands and show the limiting values of common estimators of marginal process features based on cumulative incidence function regression models. When intercurrent events arise we stress the need to carefully define the target estimand and the importance of avoiding targets of inference that are not interpretable in the real world.
Lifetime Data Anal
October 2022
Studies of chronic disease often involve modeling the relationship between marker processes and disease onset or progression. The Cox regression model is perhaps the most common and convenient approach to analysis in this setting. In most cohort studies, however, biospecimens and biomarker values are only measured intermittently (e.
During an epidemic, accurate estimation of the numbers of viral infections in different regions and groups is important for understanding transmission and guiding public health actions. This depends on effective testing strategies that identify a high proportion of infections (that is, provide high ascertainment rates). For the novel coronavirus SARS-CoV-2, ascertainment rates do not appear to be high in most jurisdictions, but quantitative analysis of testing has been limited.
Tests for variance or scale effects due to covariates are used in many areas and, more recently, in genomic and genetic association studies. We study score tests based on location-scale models with arbitrary error distributions that allow incorporation of additional adjustment covariates. Tests based on Gaussian and Laplacian double generalized linear models are examined in some detail.
Multistate models provide a powerful framework for the analysis of life history processes when the goal is to characterize transition intensities, transition probabilities, state occupancy probabilities, and covariate effects thereon. Data on such processes are often only available at random visit times occurring over a finite period. We formulate a joint multistate model for the life history process, the recurrent visit process, and a random loss to follow-up time at which the visit process terminates.
For regression with covariates missing not at random where the missingness depends on the missing covariate values, complete-case (CC) analysis leads to consistent estimation when the missingness is independent of the response given all covariates, but it may not have the desired level of efficiency. We propose a general empirical likelihood framework to improve estimation efficiency over the CC analysis. We expand on methods in Bartlett et al.
A framework is proposed for the joint modeling of life history and loss to follow-up (LTF) processes in cohort studies. This framework provides a basis for discussing independence conditions for LTF and censoring and examining the implications of dependent LTF. We consider failure time and more general life history processes.
Failure time studies based on observational cohorts often have to deal with irregular intermittent observation of individuals, which produces interval-censored failure times. When the observation times depend on factors related to a person's failure time, the failure times may be dependently interval censored. Inverse-intensity-of-visit weighting methods have been developed for irregularly observed longitudinal or repeated measures data and recently extended to parametric failure time analysis.
Event history studies based on disease clinic data often face several complications. Specifically, patients may visit the clinic irregularly, and the intermittent observation times could depend on disease-related variables; this can cause a failure time outcome to be dependently interval-censored. We propose a weighted estimating function approach so that dependently interval-censored failure times can be analysed consistently.
With the ultimate aim of improving clinical management of breast cancer, investigators have sought to identify molecular genetic markers that stratify newly diagnosed patients into subtypes differing in short- or long-term prognosis. Conventional survival models can fail to describe adequately the relationship between subtype and disease recurrence, particularly when there is a substantial proportion of long-term disease-free survivors. The observed patterns of disease-free survival in an undifferentiated patient cohort may be explained by an underlying mixture of two subgroups: patients who will remain free of disease in the long term (ie, cured), and those who will experience disease recurrence within their lifetime (ie, susceptible).
Life history studies collect information on events and other outcomes during people's lifetimes. For example, these may be related to childhood development, education, fertility, health, or employment. Such longitudinal studies have constraints on the selection of study members, the duration and frequency of follow-up, and the accuracy and completeness of information obtained.
Sequentially observed survival times are of interest in many studies but there are difficulties in analyzing such data using nonparametric or semiparametric methods. First, when the duration of followup is limited and the times for a given individual are not independent, induced dependent censoring arises for the second and subsequent survival times. Non-identifiability of the marginal survival distributions for second and later times is another issue, since they are observable only if preceding survival times for an individual are uncensored.
Copula models for multivariate lifetimes have become widely used in areas such as biomedicine, finance and insurance. This paper fills some gaps in existing methodology for copula parameters and model assessment. We consider procedures based on likelihood and pseudolikelihood ratio statistics and introduce semiparametric maximum likelihood estimation leading to semiparametric versions.
Lifetime Data Anal
October 2010
This paper considers settings where populations of units may experience recurrent events, termed failures for convenience, and where the units are subject to varying levels of usage. We provide joint models for the recurrent events and usage processes, which facilitate analysis of their relationship as well as prediction of failures. Data on usage are often incomplete and we show how to implement maximum likelihood estimation in such cases.
In many chronic disease processes subjects are at risk of two or more types of events. We describe a bivariate mixed Poisson model in which a copula function is used to model the association between two gamma distributed random effects. The resulting model is a bivariate negative binomial process in which each type of event arises from a negative binomial process.
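The marginal part of this construction can be illustrated with a short numerical check. The sketch below is not the paper's method: it ignores the copula and the second event type, and it assumes a single count that is Poisson with mean `u * mu` given a gamma random effect `u` with mean 1 and variance `phi`. Under that assumption the marginal count is negative binomial with mean `mu` and variance `mu + phi * mu**2`. The names `neg_binomial_pmf` and `nb_moments` are hypothetical and chosen only for illustration.

```python
import math


def neg_binomial_pmf(k, mu, phi):
    """Marginal P(N = k) when N | u ~ Poisson(u * mu) and
    u ~ Gamma(shape=1/phi, scale=phi), so E[u] = 1 and Var[u] = phi.
    (Assumed parameterization, for illustration only.)"""
    r = 1.0 / phi  # gamma shape parameter
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + k * math.log(phi * mu / (1.0 + phi * mu))
        - r * math.log(1.0 + phi * mu)
    )
    return math.exp(log_p)


def nb_moments(mu, phi, kmax=2000):
    """Total probability, mean and variance computed by summing the pmf
    over 0..kmax-1 (the truncated tail is negligible for moderate mu, phi)."""
    probs = [neg_binomial_pmf(k, mu, phi) for k in range(kmax)]
    total = sum(probs)
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return total, mean, var


if __name__ == "__main__":
    mu, phi = 2.0, 0.5
    total, mean, var = nb_moments(mu, phi)
    # Compare against the closed-form mean mu and variance mu + phi * mu**2.
    print(total, mean, var, mu + phi * mu ** 2)
```

The extra-Poisson variance term `phi * mu**2` is what the gamma random effect contributes; in the bivariate model described above, association between the two event types would come from linking the two random effects, which this sketch does not attempt.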
When statistical models are used to predict the values of unobserved random variables, loss functions are often used to quantify the accuracy of a prediction. The expected loss over some specified set of occasions is called the prediction error. This paper considers the estimation of prediction error when regression models are used to predict survival times and discusses the use of these estimates.
In some applications involving regression the values of certain variables are missing by design for some individuals. For example, in two-stage studies (Zhao and Lipsitz, 1992), data on "cheaper" variables are collected on a random sample of individuals in stage I, and then "expensive" variables are measured for a subsample of these in stage II. So the "expensive" variables are missing by design at stage I.
Objective: To identify processes that influence data collection, particularly in the reporting of deaths in mortality studies, using patient registry data.
Methods: The University of Toronto Psoriatic Arthritis Clinic has mechanisms for patient followup and identification of deaths. Logistic regression was used to identify patient characteristics that discriminate between 2 populations of deaths, those reported under regular followup and those reported in the context of special studies.
Multi-type recurrent event data arise when two or more different kinds of events may occur repeatedly over a period of observation. The scientific objectives in such settings are often to describe features of the marginal processes and to study the association between the different types of events. Interval-censored multi-type recurrent event data arise when the precise event times are unobserved, but intervals are available during which the events are known to have occurred.
Biometrics
December 2003
This article presents methodology for multivariate proportional hazards (PH) regression models. The methods employ flexible piecewise constant or spline specifications for baseline hazard functions in either marginal or conditional PH models, along with assumptions about the association among lifetimes. Because the models are parametric, ordinary maximum likelihood can be applied; it is able to deal easily with such data features as interval censoring or sequentially observed lifetimes, unlike existing semiparametric methods.