Finite-Sample Equivalence in Statistical Models for Presence-Only Data.

Ann Appl Stat

Department of Statistics, Stanford University, 390 Serra Mall, Stanford, California 94305-4065, USA.

Published: December 2013

AI Article Synopsis

  • Statistical modeling of presence-only data has become popular in ecology, with methods like inhomogeneous Poisson process (IPP), maximum entropy (Maxent), and logistic regression offering different approaches.
  • The IPP intensity function is a more natural object of inference for presence-only studies than occurrence probability, which is defined only with reference to quadrat size; presence-only data allow estimation of relative intensity, but not absolute intensity.
  • IPP and Maxent yield exactly the same density estimate, whereas logistic regression generally gives a different estimate in finite samples and, under model misspecification, may have a substantially different large-sample limit. The authors propose "infinitely weighted logistic regression," which is exactly equivalent to the IPP in finite samples.

Article Abstract

Statistical modeling of presence-only data has attracted much recent attention in the ecological literature, leading to a proliferation of methods, including the inhomogeneous Poisson process (IPP) model, maximum entropy (Maxent) modeling of species distributions and logistic regression models. Several recent articles have shown the close relationships between these methods. We explain why the IPP intensity function is a more natural object of inference in presence-only studies than occurrence probability (which is only defined with reference to quadrat size), and why presence-only data only allows estimation of relative, and not absolute intensity of species occurrence. All three of the above techniques amount to parametric density estimation under the same exponential family model (in the case of the IPP, the fitted density is multiplied by the number of presence records to obtain a fitted intensity). We show that IPP and Maxent give the exact same estimate for this density, but logistic regression in general yields a different estimate in finite samples. When the model is misspecified (as it practically always is), logistic regression and the IPP may have substantially different asymptotic limits with large data sets. We propose "infinitely weighted logistic regression," which is exactly equivalent to the IPP in finite samples. Consequently, many already-implemented methods extending logistic regression can also extend the Maxent and IPP models in directly analogous ways using this technique.
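
The relationship described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes the IPP integral is approximated by an average over background points, so that the IPP/Maxent slope estimate solves a moment-matching equation. It then fits logistic regression with weight 1 on presence records and an increasing weight on background records. The synthetic data, the function names `fit_maxent_ipp` and `fit_weighted_logistic`, and the particular weights are all hypothetical choices made for illustration. The log-linear model is deliberately misspecified for these data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic covariates (illustrative only, not data from the article).
n_pres, n_bg = 300, 5000
z_bg = rng.normal(size=(n_bg, 2))  # background points
x_pres = np.array([1.0, -0.5]) + 0.5 * rng.normal(size=(n_pres, 2))  # presences


def fit_maxent_ipp(x_pres, z_bg, max_iter=100, tol=1e-10):
    """Slopes of the density proportional to exp(beta . z) on the background.

    Newton's method on log(mean_bg exp(beta . z)) - beta . mean(x_pres);
    at the optimum the fitted mean of z equals the presence mean.
    """
    beta = np.zeros(z_bg.shape[1])
    target = x_pres.mean(axis=0)
    for _ in range(max_iter):
        eta = z_bg @ beta
        w = np.exp(eta - eta.max())
        w /= w.sum()  # fitted density over background points
        mean = w @ z_bg
        centered = z_bg - mean
        cov = (centered * w[:, None]).T @ centered
        step = np.linalg.solve(cov, target - mean)
        beta += step
        if np.abs(step).max() < tol:
            break
    return beta


def fit_weighted_logistic(x_pres, z_bg, bg_weight, max_iter=200, tol=1e-10):
    """Slopes of logistic regression: presences y=1 (weight 1),
    background y=0 (weight bg_weight), fitted by Newton's method."""
    X = np.vstack([x_pres, z_bg])
    X = np.hstack([np.ones((X.shape[0], 1)), X])  # intercept column
    y = np.r_[np.ones(len(x_pres)), np.zeros(len(z_bg))]
    w = np.r_[np.ones(len(x_pres)), np.full(len(z_bg), bg_weight)]
    theta = np.zeros(X.shape[1])
    # Start at the intercept-only fit so Newton begins near the solution.
    theta[0] = np.log(len(x_pres) / (bg_weight * len(z_bg)))
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        theta += step
        if np.abs(step).max() < tol:
            break
    return theta[1:]  # drop the intercept; only slopes are comparable


beta_ipp = fit_maxent_ipp(x_pres, z_bg)
gaps = {}
for bg_weight in (1.0, 1e2, 1e4, 1e6):
    beta_lr = fit_weighted_logistic(x_pres, z_bg, bg_weight)
    gaps[bg_weight] = float(np.abs(beta_lr - beta_ipp).max())
    print(f"background weight {bg_weight:g}: max slope gap = {gaps[bg_weight]:.2e}")
```

In this sketch only the slopes are compared, since the intercept of the logistic fit depends on the background weight. The gap between the two slope estimates should shrink as the background weight grows, which is the finite-sample behavior the abstract attributes to infinitely weighted logistic regression.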

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4258396
DOI: http://dx.doi.org/10.1214/13-AOAS667

Publication Analysis

Top Keywords

presence-only data: 12
logistic regression: 12
finite samples: 8
ipp: 7
finite-sample equivalence: 4
equivalence statistical: 4
statistical models: 4
presence-only: 4
models presence-only: 4
data: 4
