Publications by authors named "Andrew Fitzgibbon"

People are able to keep track of objects as they navigate through space, even when the objects are out of sight. This requires some kind of representation of the scene and of the observer's location, but the form this representation might take is debated. We tested the accuracy and reliability of observers' estimates of the visual direction of previously viewed targets.

There is good evidence that simple animals, such as bees, use view-based strategies to return to a familiar location, whereas humans might use a 3D reconstruction to achieve the same goal. Assuming some noise in the storage and retrieval process, these two types of strategy give rise to different patterns of predicted errors in homing. We describe an experiment that can help distinguish between these models.
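
As a rough illustration of how the two strategies predict different error patterns, the sketch below simulates homing endpoints under each model. The landmark layout, noise levels, and grid search are hypothetical stand-ins, not the experiment's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)
landmarks = rng.uniform(-5, 5, size=(4, 2))  # hypothetical landmark layout
home = np.array([0.0, 0.0])

def bearings(p):
    d = landmarks - p
    return np.arctan2(d[:, 1], d[:, 0])

def view_based_endpoint(noise=0.05):
    # Snapshot strategy: store noisy bearings to the landmarks as seen from
    # home, then return to the grid point whose view best matches the snapshot.
    stored = bearings(home) + rng.normal(0, noise, len(landmarks))
    grid = np.mgrid[-1:1:81j, -1:1:81j].reshape(2, -1).T
    errs = [np.sum(np.angle(np.exp(1j * (bearings(p) - stored))) ** 2)
            for p in grid]
    return grid[np.argmin(errs)]

def reconstruction_endpoint(noise=0.1):
    # Reconstruction strategy: store noisy world coordinates of home
    # and walk straight to them.
    return home + rng.normal(0, noise, 2)

view = np.array([view_based_endpoint() for _ in range(200)])
recon = np.array([reconstruction_endpoint() for _ in range(200)])
print("view-based spread:", view.std(axis=0))       # shaped by landmark geometry
print("reconstruction spread:", recon.std(axis=0))  # isotropic by construction
```

Under matched noise levels, the view-based endpoints scatter in a pattern dictated by the landmark geometry, whereas the reconstruction endpoints scatter isotropically; differences of this kind are what make the two models distinguishable.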

Traditional Web search engines do not use the images in HTML pages to find relevant documents for a given query. Instead, they typically operate by computing a measure of agreement between the keywords provided by the user and the text portion of each page alone. In this paper we study whether the content of the pictures appearing in a Web page can be used to enrich the semantic description of an HTML document and consequently boost the performance of a keyword-based search engine.
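
A minimal sketch of the idea, with an entirely hypothetical scoring scheme: the keyword match against the page text is blended with a match against labels that an image classifier has assigned to the page's pictures.

```python
def page_score(query_terms, page_text, image_labels, w=0.3):
    """Blend a keyword-match score with an image-content score.

    image_labels: labels an image classifier assigned to the page's
    pictures -- a stand-in for the enriched semantic description.
    """
    text = sum(page_text.lower().count(t) for t in query_terms)
    image = sum(label in query_terms for label in image_labels)
    return (1 - w) * text + w * image

# a page whose text never mentions "tiger" can still rank for the query
print(page_score(["tiger"], "habitat facts and range maps", ["tiger", "grass"]))
```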

We introduce a machine learning approach to demosaicing, the reconstruction of color images from incomplete color filter array samples. A demosaicing method must overcome two challenges: 1) it needs to model and respect the statistics of natural images in order to reconstruct natural-looking images, and 2) it should perform well in the presence of noise. To facilitate an objective assessment of current methods, we introduce a public ground-truth data set of natural images suitable for research in image demosaicing and denoising.
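
The learned method itself is not reproduced here, but a classic bilinear baseline makes the reconstruction problem concrete: each pixel of an RGGB Bayer mosaic records one colour channel, and the two missing channels must be interpolated from neighbours. The normalised-convolution formulation below is conventional, not taken from the paper.

```python
import numpy as np
from scipy.ndimage import convolve

def bilinear_demosaic(raw):
    """Baseline bilinear demosaicing of an RGGB Bayer mosaic.

    raw: 2-D array of sensor values, one colour sample per pixel.
    Returns an H x W x 3 RGB estimate.
    """
    H, W = raw.shape
    r = np.zeros((H, W), bool); r[0::2, 0::2] = True   # red sample sites
    b = np.zeros((H, W), bool); b[1::2, 1::2] = True   # blue sample sites
    g = ~(r | b)                                       # green checkerboard

    k_rb = np.array([[.25, .5, .25], [.5, 1., .5], [.25, .5, .25]])
    k_g = np.array([[0., .25, 0.], [.25, 1., .25], [0., .25, 0.]])

    out = np.zeros((H, W, 3))
    for c, (mask, k) in enumerate([(r, k_rb), (g, k_g), (b, k_rb)]):
        plane = np.where(mask, raw, 0.0)
        norm = convolve(mask.astype(float), k, mode='mirror')
        out[..., c] = convolve(plane, k, mode='mirror') / norm
    return out
```

A learned demosaicer replaces these fixed kernels with filters fitted to natural-image statistics, which is where the two challenges above (image statistics and noise) enter.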

Kinectrack is a novel approach to six-DoF tracking that provides agile real-time pose estimation using only commodity hardware. The dot pattern emitter and IR camera components of the standard Kinect device are separated to allow the emitter to roam freely relative to a fixed camera. The six-DoF pose of the emitter component is recovered by matching the dense dot pattern observed by the camera to a pre-captured reference image.
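
Once the observed dots have been matched against the reference pattern, the final pose-recovery step is in essence a perspective-n-point (PnP) problem. A minimal sketch of that step using OpenCV; the intrinsics and data layout are illustrative assumptions, not Kinectrack's actual implementation.

```python
import cv2
import numpy as np

K = np.array([[585., 0., 320.],   # assumed IR-camera intrinsics (pixels)
              [0., 585., 240.],
              [0., 0., 1.]])

def emitter_pose(pts3d, pts2d):
    """Recover a six-DoF pose from matched dots.

    pts3d: (N, 3) dot positions in the emitter's pattern frame.
    pts2d: (N, 2) corresponding detections in the camera image.
    """
    ok, rvec, tvec = cv2.solvePnP(pts3d.astype(np.float32),
                                  pts2d.astype(np.float32), K, None)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec  # emitter-to-camera rotation and translation
```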

We describe two new approaches to human pose estimation. Both can quickly and accurately predict the 3D positions of body joints from a single depth image without using any temporal information. The key to both approaches is the use of a large, realistic, and highly varied synthetic set of training images.
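
One well-known realisation of the first approach classifies every depth pixel into a body part with a decision forest over simple depth-comparison features, then infers joints from the labelled pixels. The sketch below illustrates that flavour of feature; the shapes, offsets, and the exact feature form (probe depth minus centre depth, offsets scaled by inverse depth) are simplified assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def depth_features(depth, pixels, offsets):
    """Depth-comparison features: probe depth minus centre depth,
    with the probe offset scaled by 1/depth for depth invariance."""
    H, W = depth.shape
    feats = np.empty((len(pixels), len(offsets)))
    for i, (y, x) in enumerate(pixels):
        d = depth[y, x]
        for j, (du, dv) in enumerate(offsets):
            yy = int(np.clip(y + du / d, 0, H - 1))
            xx = int(np.clip(x + dv / d, 0, W - 1))
            feats[i, j] = depth[yy, xx] - d
    return feats

# Training pairs such features, computed on synthetic depth images,
# with per-pixel body-part labels:
forest = RandomForestClassifier(n_estimators=3, max_depth=20)
# forest.fit(depth_features(depth, pixels, offsets), part_labels)
```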

It is often assumed that humans generate a 3D reconstruction of the environment, either in egocentric or world-based coordinates, but the steps involved are unknown. Here, we propose two reconstruction-based models, evaluated using data from two tasks in immersive virtual reality. We model the observer's prediction of landmark location based on standard photogrammetric methods and then combine location predictions to compute likelihood maps of navigation behaviour.
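
Standard photogrammetric prediction of a landmark's location from two viewpoints reduces to linear (DLT) triangulation; a minimal sketch of that step, with projection matrices and pixel coordinates as assumed inputs.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one landmark from two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: observed pixel coordinates (u, v) in each view.
    """
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # inhomogeneous 3D point
```

Sampling the observation noise and repeating the prediction yields a distribution of predicted landmark locations, which is one way likelihood maps over navigation behaviour can be accumulated.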

3D morphable models are low-dimensional parameterizations of 3D object classes which provide a powerful means of associating 3D geometry with 2D images. However, morphable models are currently generated from 3D scans, so building them for general object classes such as animals is economically and practically infeasible. We show that, given a small amount of user interaction (little more than that required to build a conventional morphable model), there is enough information in a collection of 2D pictures of certain object classes to generate a full 3D morphable model, even in the absence of surface texture.
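
To make the term concrete: a morphable model expresses every shape in the class as a mean plus a low-dimensional linear combination of basis shapes. A minimal sketch of building such a model by PCA; it assumes registered 3D shapes are already in hand, whereas the paper's contribution is recovering that 3D information from 2D pictures.

```python
import numpy as np

def build_model(shapes, n_modes=10):
    """PCA morphable model from registered shapes (each flattened to 3N floats)."""
    X = np.asarray(shapes)                 # (num_shapes, 3N)
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_modes] * (S[:n_modes, None] / np.sqrt(len(X) - 1))
    return mean, basis                     # modes scaled to one standard deviation

def synthesize(mean, basis, coeffs):
    """New class member: mean + coeffs @ basis, reshaped to N x 3 vertices."""
    return (mean + coeffs @ basis).reshape(-1, 3)
```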

Accurate calibration of a head-mounted display (HMD) is essential both for research on the visual system and for realistic interaction with virtual objects. Yet existing calibration methods are time-consuming, depend on human judgements (making them error-prone), and are often limited to optical see-through HMDs. Building on our existing approach to HMD calibration Gilson et al.

Second-order priors on the smoothness of 3D surfaces are a better model of typical scenes than first-order priors. However, stereo reconstruction using global inference algorithms, such as graph cuts, has not been able to incorporate second-order priors because the triple cliques needed to express them yield intractable (nonsubmodular) optimization problems. This paper shows that inference with triple cliques can be effectively performed.
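
In the usual formulation the data term is summed with a curvature penalty over triples of collinear pixels; that triple-clique term is what the paper makes tractable. The generic form (the paper's exact energy may differ):

```latex
E(d) \;=\; \sum_{p} D_p(d_p) \;+\; \lambda \sum_{(p,q,r)} \rho\bigl(d_p - 2\,d_q + d_r\bigr)
```

where $D_p$ is the photo-consistency data term, $(p,q,r)$ ranges over horizontally or vertically adjacent pixel triples, and $\rho$ is a robust penalty. The second difference $d_p - 2d_q + d_r$ vanishes on planar surfaces, whereas a first-order term $\rho(d_p - d_q)$ vanishes only on fronto-parallel ones, which is why second-order priors model typical scenes better.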

We present here a method for calibrating an optical see-through head-mounted display (HMD) using techniques usually applied to camera calibration (photogrammetry). Using a camera placed inside the HMD to take simultaneous pictures of a tracked object and of features in the HMD display, we could exploit established camera calibration techniques to recover both the intrinsic and extrinsic properties of the HMD (width, height, focal length, optic centre, and principal ray of the display). Our method gives low re-projection errors and, unlike existing methods, involves no time-consuming and error-prone human measurements, nor any prior estimates of the HMD geometry.
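
The core computation is standard camera calibration. A minimal sketch using OpenCV; variable names and data layout are assumptions, and the paper's actual pipeline pairs the tracked-object features with features shown in the HMD display.

```python
import cv2

# obj_pts: list of (N, 3) float32 arrays, tracked-object feature positions
# img_pts: list of (N, 2) float32 arrays, where those features appear in the
#          pictures taken through the HMD optics
def calibrate_hmd(obj_pts, img_pts, image_size):
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_pts, img_pts, image_size, None, None)
    # K carries the focal length and optic centre (intrinsics);
    # rvecs/tvecs give the pose of the display (extrinsics);
    # rms is the re-projection error used to judge calibration quality
    return rms, K, dist, rvecs, tvecs
```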

As we move through the world, our eyes acquire a sequence of images. The information from this sequence is sufficient to determine the structure of a three-dimensional scene, up to a scale factor determined by the distance that the eyes have moved. Previous evidence shows that the human visual system accounts for the distance the observer has walked and the separation of the eyes when judging the scale, shape, and distance of objects.
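
The scale ambiguity can be written down directly. For a laterally translating viewpoint, the depth of a point follows the parallax relation (generic notation, not the paper's):

```latex
Z = \frac{b\,f}{\delta}
```

where $b$ is the distance moved (the baseline), $f$ the focal length, and $\delta$ the image displacement of the point. Scaling the whole scene and the baseline by the same factor leaves every image unchanged, so structure is recoverable only up to the scale fixed by knowing $b$.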

An increasing number of neuroscience experiments are using virtual reality to provide a more immersive and less artificial experimental environment. This is particularly useful for navigation and three-dimensional scene perception experiments. Such experiments require accurate real-time tracking of the observer's head in order to render the virtual scene.
