Wiley Interdiscip Rev Comput Stat
January 2024
In biomedical and epidemiological research, a biomarker is a measurable indicator of the presence or severity of a disease or medical condition. Biomarkers may help in early diagnosis and prevention of diseases. Biomarkers have been identified for many diseases, for example carbohydrate antigen 19-9 for pancreatic cancer.
In regression modelling, measurement error models are often needed to correct for uncertainty arising from measurements of covariates/predictor variables. The literature on measurement error (or errors-in-variables) modelling is plentiful; however, general algorithms and software for maximum likelihood estimation of models with measurement error are not as readily available in a form that applied researchers can use without relatively advanced statistical expertise. In this study, we develop a novel algorithm for measurement error modelling, which could in principle take any regression model fitted by maximum likelihood, or penalised likelihood, and extend it to account for uncertainty in covariates.
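To make concrete why covariate measurement error needs correcting, the following is a minimal simulation sketch. It is not the algorithm developed in the study above: it assumes a simple linear regression with classical additive error (observed covariate w = x + u) and a known error standard deviation, and it applies a textbook method-of-moments correction. The function name and all parameter values are hypothetical.

```python
import random


def naive_and_corrected_slope(n=20000, beta=2.0, sd_x=1.0, sd_u=1.0,
                              sd_e=0.5, seed=1):
    """Simulate y = beta * x + e, where only w = x + u is observed.

    Assumptions (illustrative only): classical additive error u that is
    independent of x and e, and a known error standard deviation sd_u.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sd_x) for _ in range(n)]
    w = [xi + rng.gauss(0.0, sd_u) for xi in x]        # error-prone covariate
    y = [beta * xi + rng.gauss(0.0, sd_e) for xi in x]

    mean_w = sum(w) / n
    mean_y = sum(y) / n
    s_ww = sum((wi - mean_w) ** 2 for wi in w) / (n - 1)
    s_wy = sum((wi - mean_w) * (yi - mean_y)
               for wi, yi in zip(w, y)) / (n - 1)

    # Naive least-squares slope of y on w; it is attenuated towards zero.
    naive = s_wy / s_ww
    # Estimated reliability ratio var(x) / var(w), using the known sd_u.
    reliability = (s_ww - sd_u ** 2) / s_ww
    # Method-of-moments correction: divide the naive slope by the reliability.
    corrected = naive / reliability
    return naive, corrected


naive, corrected = naive_and_corrected_slope()
print(f"naive slope: {naive:.3f}, corrected slope: {corrected:.3f}")
```

With equal variances for the true covariate and the error, the naive slope is expected to be close to half the true value of 2, while the corrected slope should be close to 2.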
The continuing circulation and reassortment with low-pathogenicity avian influenza Gs/Gd (goose/Guangdong/1996)-like avian influenza viruses (AIVs) has caused huge economic losses and raised public health concerns over their zoonotic potential. Virologic surveillance of wild birds has been suggested as part of a global AIV surveillance system. However, underreporting and biased selection of sampling sites have made it difficult to gain information about the transmission and evolution of highly pathogenic AIV.
The time taken to detect a species during site occupancy surveys contains information about the observation process. Accounting for the observation process leads to better inference about site occupancy. We explore the gain in efficiency that can be obtained from time-to-detection (TTD) data and show that this model type has a significant benefit for estimating the parameters related to detection intensity.
J Agric Biol Environ Stat
June 2022
Population size estimation is an important research field in biological sciences. In practice, covariates are often measured upon capture on individuals sampled from the population. However, some biological measurements, such as body weight, may vary over time within a subject's capture history.
Spatial or temporal clustering commonly arises in various biological and ecological applications; for example, species or communities may cluster in groups. In this paper, we develop a new clustered occurrence data model where presence-absence data are modeled under a multivariate negative binomial framework. We account for spatial or temporal clustering by introducing a community parameter in the model that controls the strength of dependence between observations, thereby enhancing the estimation of the mean and dispersion parameters.
Zero-truncated data arise in various disciplines where counts are observed but the zero count category cannot be observed during sampling. Maximum likelihood estimation can be used to model these data; however, because of its nonstandard form, it cannot be easily implemented using well-known software packages, and additional programming is often required. Motivated by the Rao-Blackwell theorem, we develop a weighted partial likelihood approach to estimate model parameters for zero-truncated binomial and Poisson data.
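As an illustration of the "additional programming" that plain maximum likelihood requires here, the sketch below fits a zero-truncated Poisson model without covariates. It is not the weighted partial likelihood approach proposed above. It relies on the standard result that the maximum likelihood estimate of the rate solves lam / (1 - exp(-lam)) = sample mean, and solves that equation by bisection. The function names and the example counts are hypothetical.

```python
import math


def ztp_mean(lam):
    """Mean of a zero-truncated Poisson distribution with rate lam > 0."""
    # -expm1(-lam) equals 1 - exp(-lam) but is accurate for small lam.
    return lam / -math.expm1(-lam)


def ztp_mle(counts, tol=1e-12):
    """Maximum likelihood estimate of lam from zero-truncated counts.

    Solves ztp_mean(lam) = sample mean by bisection. ztp_mean is increasing
    in lam, tends to 1 as lam -> 0, and exceeds lam, so the root lies in
    the interval (0, sample mean).
    """
    if any(c < 1 for c in counts):
        raise ValueError("zero-truncated counts must all be >= 1")
    mean = sum(counts) / len(counts)
    if mean <= 1.0:
        raise ValueError("all counts equal 1: the estimate is on the boundary lam -> 0")
    lo, hi = 1e-9, mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ztp_mean(mid) < mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


counts = [1, 1, 2, 3, 1, 2, 4, 1]   # made-up example data, no zeros observed
lam_hat = ztp_mle(counts)
print(f"estimated rate: {lam_hat:.4f}")
```

The estimate is always smaller than the sample mean, because the sample mean of truncated counts overstates the untruncated rate.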
The negative binomial distribution is a common model for the analysis of count data in biology and ecology. In many applications, we may not observe the complete frequency count in a quadrat but only that a species occurred in the quadrat. If only occurrence data are available, then the two parameters of the negative binomial distribution, the aggregation index and the mean, are not identifiable.
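The identifiability problem can be seen numerically. The sketch below assumes the usual ecological parameterisation of the negative binomial, with mean mu and aggregation index k, under which the probability of a zero count is (1 + mu / k) ** (-k). For any fixed occurrence probability, every value of k has a matching mean, so occurrence data alone cannot separate the two parameters. The function names and numerical values are hypothetical.

```python
def occurrence_prob(mu, k):
    """P(count >= 1) for a negative binomial with mean mu and aggregation index k."""
    return 1.0 - (1.0 + mu / k) ** (-k)


def mean_matching_occurrence(p, k):
    """Mean mu that gives occurrence probability p for a given k.

    Obtained by inverting occurrence_prob with respect to mu.
    """
    return k * ((1.0 - p) ** (-1.0 / k) - 1.0)


p = 0.6  # a hypothetical occurrence probability
for k in (0.5, 2.0, 10.0):
    mu = mean_matching_occurrence(p, k)
    # Very different (mu, k) pairs reproduce the same occurrence probability.
    print(f"k = {k:5.1f}, mu = {mu:.4f}, P(occurrence) = {occurrence_prob(mu, k):.4f}")
```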
Longitudinal covariates in survival models are generally analyzed using random effects models. By framing the estimation of these survival models as a functional measurement error problem, semiparametric approaches such as the conditional score or the corrected score can be applied to find consistent estimators for survival model parameters without distributional assumptions on the random effects. However, in order to satisfy the standard assumptions of a survival model, the semiparametric methods in the literature only use covariate data before each event time.
Individual covariates are commonly used in capture-recapture models as they can provide important information for population size estimation. However, in practice, one or more covariates may be missing at random for some individuals, which can lead to unreliable inference if records with missing data are treated as missing completely at random. We show that, in general, such a naive complete-case analysis in closed capture-recapture models with some covariates missing at random underestimates the population size.
Good-Turing frequency estimation (Good, ) is a simple, effective method for predicting detection probabilities of objects of both observed and unobserved classes based on observed frequencies of classes in a sample. The method has been used widely in several disciplines, such as information retrieval, computational linguistics, text recognition, and ecological diversity estimation. Nevertheless, existing studies assume sampling with replacement or sampling from an infinite population, which might be inappropriate for many practical applications.
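For readers unfamiliar with the method, the following is a minimal sketch of the basic, unsmoothed Good-Turing estimates under the usual assumption of sampling with replacement, which is the assumption the abstract above questions. With N_r denoting the number of classes observed exactly r times and n the sample size, the estimated total probability of unobserved classes is N_1 / n, and the adjusted count for a class seen r times is (r + 1) * N_{r+1} / N_r. The function name and the toy sample are hypothetical, and practical implementations smooth the N_r values first.

```python
from collections import Counter


def good_turing(sample):
    """Basic (unsmoothed) Good-Turing estimates from a sequence of class labels.

    Returns the estimated probability mass of unseen classes and a dict that
    maps each observed frequency r to its adjusted count r*. Frequencies with
    N_{r+1} = 0 are skipped because the unsmoothed formula would give zero.
    """
    class_counts = Counter(sample)                  # class -> frequency r
    n = sum(class_counts.values())                  # sample size
    freq_of_freq = Counter(class_counts.values())   # r -> N_r

    p_unseen = freq_of_freq.get(1, 0) / n
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        if n_next > 0:
            adjusted[r] = (r + 1) * n_next / n_r
    return p_unseen, adjusted


p_unseen, adjusted = good_turing("aabbbcde")   # toy sample of 8 observations
print(p_unseen)    # 3 singletons out of 8 observations
print(adjusted)    # adjusted counts for r = 1 and r = 2
```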
Background: Estimating assemblage species or class richness from samples remains a challenging, but essential, goal. Though a variety of statistical tools for estimating species or class richness have been developed, they are all singly-bounded: assuming only a lower bound of species or classes. Nevertheless, there are numerous situations, particularly in the cultural realm, where the maximum number of classes is fixed.
Assessing species survival status is an essential component of conservation programs. We devised a new statistical method for estimating the probability of species persistence from the temporal sequence of collection dates of museum specimens. To complement this approach, we developed quantitative stopping rules for terminating the search for missing or allegedly extinct species.
In practice, when analyzing data from a capture-recapture experiment, it is tempting to apply modern advanced statistical methods to the observed capture histories. However, unless the analysis takes into account that the data have only been collected from individuals who have been captured at least once, the results may be biased. Without the development of new software packages, methods such as generalized additive models, generalized linear mixed models, and simulation-extrapolation cannot be readily implemented.
Measurement errors in covariates may result in biased estimates in regression analysis. Most methods to correct this bias assume nondifferential measurement errors, i.e.
Background: Anti-smoking advertisements are an effective population-based smoking reduction strategy. The Quitline telephone service provides a first point of contact for adults considering quitting. Because of data complexity, the relationship between anti-smoking advertising placement, intensity, and time trends in total call volume is poorly understood.
Many well-known methods are available for estimating the number of species in a forest community. However, most existing methods result in considerable negative bias in applications, where field surveys typically represent only a small fraction of sampled communities. This article develops a new method based on sampling with replacement to estimate species richness via the generalized jackknife procedure.
We consider estimation problems in capture-recapture models when the covariates or the auxiliary variables are measured with errors. The naive approach, which ignores measurement errors, is found to be unacceptable in the estimation of both regression parameters and population size: it yields estimators with biases increasing with the magnitude of errors, and flawed confidence intervals. To account for measurement errors, we derive a regression parameter estimator using a regression calibration method.