The worldwide variation in vegetation height is fundamental to the global carbon cycle and central to the functioning of ecosystems and their biodiversity. Geospatially explicit and, ideally, highly resolved information is required to manage terrestrial ecosystems, mitigate climate change and prevent biodiversity loss. Here we present a comprehensive global canopy height map at 10 m ground sampling distance for the year 2020.
IEEE Trans Pattern Anal Mach Intell
December 2023
We propose a scheme for supervised image classification that uses privileged information, in the form of keypoint annotations for the training data, to learn strong models from small and/or biased training sets. Our main motivation is the recognition of animal species for ecological applications such as biodiversity modelling, which is challenging because of long-tailed species distributions due to rare species, and strong dataset biases such as repetitive scene background in camera traps. To counteract these challenges, we propose a visual attention mechanism that is supervised via keypoint annotations that highlight important object parts.
Côte d'Ivoire and Ghana, the world's largest producers of cocoa, account for two thirds of the global cocoa production. In both countries, cocoa is the primary perennial crop, providing income to almost two million farmers. Yet precise maps of the area planted with cocoa are missing, hindering accurate quantification of expansion in protected areas, production and yields and limiting information available for improved sustainability governance.
Fine-grained population maps are needed in several domains, such as urban planning, environmental monitoring, public health, and humanitarian operations. Unfortunately, in many countries only aggregate census counts over large spatial units are collected; moreover, these are not always up to date. We present POMELO, a deep learning model that employs coarse census counts and open geodata to estimate fine-grained population maps with [Formula: see text]m ground sampling distance.
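To make the idea of turning coarse census counts into a fine-grained map concrete, the following is a minimal sketch of plain proportional (dasymetric) disaggregation. It is not POMELO's learned model; the function name `disaggregate`, the weight source (building footprint area per pixel), and all numbers are assumptions chosen for illustration only.

```python
def disaggregate(census_count, weights):
    """Spread one region's census count over its pixels in proportion
    to non-negative per-pixel weights (hypothetical helper)."""
    total = sum(weights)
    if total == 0:
        # No evidence of where people live: fall back to a uniform split.
        return [census_count / len(weights)] * len(weights)
    return [census_count * w / total for w in weights]


# Made-up example: one census unit covering four pixels, weighted by
# an assumed building footprint area (in square metres) per pixel.
building_area = [0.0, 120.0, 60.0, 20.0]
pixel_population = disaggregate(1000, building_area)
```

By construction the per-pixel estimates sum to the census count of the region, so the fine-grained map stays consistent with the coarse counts it was derived from.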
IEEE Trans Pattern Anal Mach Intell
August 2022
We propose a new STAckable Recurrent cell (STAR) for recurrent neural networks (RNNs), which has fewer parameters than widely used LSTM [16] and GRU [10] while being more robust against vanishing or exploding gradients. Stacking recurrent units into deep architectures suffers from two major limitations: (i) many recurrent cells (e.g.
IEEE Trans Pattern Anal Mach Intell
March 2022
The success of monocular depth estimation relies on large and diverse training sets. Due to the challenges associated with acquiring dense ground-truth depth across different environments at scale, a number of datasets with distinct characteristics and biases have emerged. We develop tools that enable mixing multiple datasets during training, even if their annotations are incompatible.
IEEE Trans Pattern Anal Mach Intell
October 2016
The task of tracking multiple targets is often addressed with the so-called tracking-by-detection paradigm, where the first step is to obtain a set of target hypotheses for each frame independently. Tracking can then be regarded as solving two separate, but tightly coupled problems. The first is to carry out data association, i.
Recent works on multimodel fitting are often formulated as an energy minimization task, where the energy function includes fitting error and regularization terms, such as low-level spatial smoothness and model complexity. In this paper, we introduce a novel energy with high-level geometric priors that consider interactions between geometric models, such that certain preferred model configurations may be induced. We argue that in many applications, such prior geometric properties are available and should be fruitfully exploited.
IEEE Trans Pattern Anal Mach Intell
January 2014
Many recent advances in multiple target tracking aim at finding a (nearly) optimal set of trajectories within a temporal window. To handle the large space of possible trajectory hypotheses, it is typically reduced to a finite set by some form of data-driven or regular discretization. In this work, we propose an alternative formulation of multitarget tracking as minimization of a continuous energy.
IEEE Trans Pattern Anal Mach Intell
November 2013
Geometric 3D reasoning at the level of objects has received renewed attention recently in the context of visual scene understanding. The level of geometric detail, however, is typically limited to qualitative representations or coarse boxes. This is linked to the fact that today's object class detectors are tuned toward robust 2D matching rather than accurate 3D geometry, encouraged by bounding-box-based benchmarks such as Pascal VOC.
IEEE Trans Pattern Anal Mach Intell
April 2013
Following recent advances in detection, context modeling, and tracking, scene understanding has been the focus of renewed interest in computer vision research. This paper presents a novel probabilistic 3D scene model that integrates state-of-the-art multiclass object detection, object tracking and scene labeling together with geometric 3D reasoning. Our model is able to represent complex object interactions such as inter-object occlusion, physical exclusion between objects, and geometric context.
Given the presence of massive feedback loops in brain networks, it is difficult to disentangle the contribution of feedforward and feedback processing to the recognition of visual stimuli, in this case, of emotional body expressions. The aim of the work presented in this letter is to shed light on how well feedforward processing explains rapid categorization of this important class of stimuli. By means of parametric masking, it may be possible to control the contribution of feedback activity in human participants.
Multibody structure from motion (SfM) is the extension of classical SfM to dynamic scenes with multiple rigidly moving objects. Recent research has unveiled some of the mathematical foundations of the problem, but a practical algorithm which can handle realistic sequences is still missing. In this paper, we discuss the requirements for such an algorithm, highlight theoretical issues and practical problems, and describe how a static structure-from-motion framework needs to be extended to handle real dynamic scenes.
IEEE Trans Pattern Anal Mach Intell
October 2009
In this paper, we address the problem of multiperson tracking in busy pedestrian zones using a stereo rig mounted on a mobile platform. The complexity of the problem calls for an integrated solution that extracts as much visual information as possible and combines it through cognitive feedback cycles. We propose such an approach, which jointly estimates camera position, stereo depth, object detection, and tracking.
IEEE Trans Pattern Anal Mach Intell
October 2008
We present a novel approach for multi-object tracking which considers object detection and spacetime trajectory estimation as a coupled optimization problem. Our approach is formulated in a Minimum Description Length hypothesis selection framework, which allows it to recover from mismatches and temporarily lost tracks. Building upon a state-of-the-art object detector, it performs multiview/multicategory object recognition to detect cars and pedestrians in the input images.
Research into the visual perception of human emotion has traditionally focused on the facial expression of emotions. Recently researchers have turned to the more challenging field of emotional body language, i.e.
IEEE Trans Pattern Anal Mach Intell
September 2007
We propose a similarity measure based on a Spatial-color Mixture of Gaussians (SMOG) appearance model for particle filters. This improves on the popular similarity measure based on color histograms because it considers not only the colors in a region but also the spatial layout of the colors. Hence, the SMOG-based similarity measure is more discriminative.
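The limitation of histogram-based similarity noted above can be illustrated with a small sketch. It compares two toy patches that contain the same colors in swapped positions, first with a plain color histogram and then with a histogram that also bins by horizontal position. The second measure is only a crude stand-in for a spatially aware model, not the SMOG model itself; the helper names, the two-band split, and the toy patches are assumptions made for illustration.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw counts into a discrete probability distribution."""
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def color_histogram(pixels):
    """Histogram over colors only; pixel positions are discarded."""
    return normalize(Counter(color for _pos, color in pixels))


def spatial_color_histogram(pixels, width, n_bands=2):
    """Histogram over (vertical band index, color) pairs, so that the
    horizontal layout of the colors influences the result."""
    return normalize(Counter(
        (min(x * n_bands // width, n_bands - 1), color)
        for (x, _y), color in pixels
    ))


def bhattacharyya(p, q):
    """Bhattacharyya coefficient: 1.0 for identical distributions,
    0.0 for distributions with no overlap."""
    return sum(math.sqrt(p.get(k, 0.0) * q.get(k, 0.0)) for k in set(p) | set(q))


# Two 4x2 toy patches: same colors, mirrored layout.
WIDTH, HEIGHT = 4, 2
patch_a = [((x, y), "red" if x < 2 else "blue") for x in range(WIDTH) for y in range(HEIGHT)]
patch_b = [((x, y), "blue" if x < 2 else "red") for x in range(WIDTH) for y in range(HEIGHT)]

color_sim = bhattacharyya(color_histogram(patch_a), color_histogram(patch_b))
spatial_sim = bhattacharyya(
    spatial_color_histogram(patch_a, WIDTH),
    spatial_color_histogram(patch_b, WIDTH),
)
```

In this toy case the color-only measure cannot tell the two patches apart, while the position-aware one treats them as entirely different, which is the sense in which adding spatial layout makes a similarity measure more discriminative.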
IEEE Trans Pattern Anal Mach Intell
June 2006
Multibody structure-and-motion (MSaM) is the problem of establishing the multiple-view geometry of several views of a 3D scene taken at different times, where the scene consists of multiple rigid objects moving relative to each other. We examine the case of two views. The setting is the following: given is a set of corresponding image points in two images, which originate from an unknown number of moving scene objects, each giving rise to a motion model.