Publications by authors named "Johannes Burge"

Estimating the motion of objects in depth is important for behavior, and is strongly supported by binocular visual cues. To understand both how the brain should estimate motion in depth and how natural constraints shape and limit performance in two local 3D motion tasks, we develop image-computable ideal observers from a large number of binocular video clips created from a dataset of natural images. The observers spatio-temporally filter the videos, and non-linearly decode 3D motion from the filter responses.
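
As a rough illustration of the pipeline this describes, here is a minimal Python sketch: a bank of linear space-time filters applied to a binocular clip, followed by a nonlinear (here quadratic) readout of the filter responses. The random stand-in stimuli, filter shapes, and decoder weights W are illustrative assumptions, not the published observers.

```python
# Minimal sketch: linear space-time filtering of a binocular video,
# then a nonlinear (quadratic) readout of the filter responses.
# Stimuli, filters, and decoder weights W are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_frames, n_filters = 32, 16, 8

video_L = rng.standard_normal((n_pix, n_frames))  # stand-in left-eye clip
video_R = rng.standard_normal((n_pix, n_frames))  # stand-in right-eye clip

# Space-time filters applied to the concatenated binocular stimulus.
filters = rng.standard_normal((n_filters, 2 * n_pix * n_frames))
filters /= np.linalg.norm(filters, axis=1, keepdims=True)

stimulus = np.concatenate([video_L.ravel(), video_R.ravel()])
responses = filters @ stimulus  # linear filter responses

# Nonlinear decoding: a quadratic readout whose weights W would be
# learned from labeled natural videos in a full model.
W = rng.standard_normal((n_filters, n_filters))
motion_estimate = responses @ (W + W.T) @ responses / 2
print(f"decoded 3D-motion value: {motion_estimate:.3f}")
```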

Vision science and visual neuroscience seek to understand how stimulus and sensor properties limit the precision with which behaviorally relevant latent variables are encoded and decoded. In the primate visual system, binocular disparity, the canonical cue for stereo-depth perception, is initially encoded by a set of binocular receptive fields with a range of spatial frequency preferences. Here, with a stereo-image database having ground-truth disparity information at each pixel, we examine how response normalization and receptive field properties determine the fidelity with which binocular disparity is encoded in natural scenes.
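
A minimal sketch of the encoding stage this describes, assuming a standard phase-shifted binocular Gabor receptive field and a textbook divisive-normalization form (the paper's exact parameterization may differ):

```python
# Minimal sketch: a binocular receptive field with divisive contrast
# normalization. The Gabor RF and the normalization form are textbook
# idealizations, not the paper's exact parameterization.
import numpy as np

def gabor(n, freq, phase):
    x = np.linspace(-1, 1, n)
    return np.exp(-x**2 / 0.18) * np.cos(2 * np.pi * freq * x + phase)

n = 64
rf_L = gabor(n, freq=4.0, phase=0.0)
rf_R = gabor(n, freq=4.0, phase=np.pi / 2)  # phase shift -> disparity tuning

rng = np.random.default_rng(1)
img_L = rng.standard_normal(n)  # stand-in for a natural stereo patch
img_R = np.roll(img_L, 2)       # right eye sees a 2-pixel disparity

drive = rf_L @ img_L + rf_R @ img_R                # linear binocular drive
contrast = np.sqrt(np.mean(img_L**2) + np.mean(img_R**2))
sigma = 0.1                                        # normalization constant
response = drive / (sigma + contrast)              # divisive normalization
print(f"normalized binocular response: {response:.3f}")
```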

Image differences between the eyes can cause interocular discrepancies in the speed of visual processing. Millisecond-scale differences in visual processing speed can cause dramatic misperceptions of the depth and three-dimensional direction of moving objects. Here, we develop a monocular and binocular continuous target-tracking psychophysics paradigm that can quantify such tiny differences in visual processing speed.
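
One common way such a paradigm recovers delays is by cross-correlating target velocity with cursor velocity and reading off the lag of the peak. The sketch below illustrates that idea with an invented 8-sample delay and a made-up motor-response kernel; it is not the paper's analysis code.

```python
# Minimal sketch: recover a processing delay from tracking data by
# cross-correlating target and cursor velocities. The 8-sample delay
# and the motor-response kernel are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
n, true_lag = 2000, 8                       # samples; delay in samples

target_vel = rng.standard_normal(n)         # random-walk target velocity
kernel = np.exp(-np.arange(40) / 10.0)      # sluggish motor response
cursor_vel = np.convolve(np.roll(target_vel, true_lag), kernel)[:n]
cursor_vel += 0.5 * rng.standard_normal(n)  # motor/internal noise

lags = np.arange(-50, 51)
xcorr = [np.dot(target_vel, np.roll(cursor_vel, -lag)) for lag in lags]
print(f"estimated delay: {lags[np.argmax(xcorr)]} samples (true: {true_lag})")
```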

The temporal dynamics of visual information processing vary with the stimulus being processed and with the retinal location that initiates the processing. Here, we present psychophysical data with sub-millisecond precision showing that increasing eccentricity decreases the delay with which stimuli are processed. We show that, even within the central ±6° of the visual field, processing delays change by up to a factor of three.

Temporal differences in visual information processing between the eyes can cause dramatic misperceptions of motion and depth. Processing delays between the eyes cause the Pulfrich effect: oscillating targets in the frontal plane are misperceived as moving along near-elliptical motion trajectories in depth (Pulfrich, 1922). Here, we explain a previously reported but poorly understood variant: the anomalous Pulfrich effect.
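
A worked example of the classic Pulfrich geometry: an interocular delay dt gives a target moving laterally at speed v an effective disparity of roughly v*dt, which stereo mechanisms read out as depth. All numbers below (delay, speed, viewing geometry) are illustrative assumptions.

```python
# Worked example: an interocular delay dt gives a frontoparallel target
# moving at speed v an effective disparity of v*dt, read out as depth.
# Delay, speed, and viewing geometry are illustrative values.
import numpy as np

dt = 5e-3                    # interocular processing delay, s
v_deg = 10.0                 # lateral target speed, deg/s
ipd = 0.065                  # interocular distance, m
view_dist = 1.0              # viewing distance, m

disparity_deg = v_deg * dt   # effective disparity, deg
disparity_rad = np.deg2rad(disparity_deg)

# Small-angle approximation: depth offset ~ disparity * distance^2 / IPD
depth_offset = disparity_rad * view_dist**2 / ipd
print(f"effective disparity: {disparity_deg * 60:.1f} arcmin")
print(f"apparent depth offset: {depth_offset * 100:.1f} cm")
```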

Psychology and philosophy have long reflected on the role of perspective in vision. Since the dawn of modern vision science (roughly, since Helmholtz in the late 1800s), scientific explanations in vision have focused on understanding the computations that transform the sensed retinal image into percepts of the three-dimensional environment. The standard view in the science is that distal properties, namely viewpoint-independent properties of the environment (e.g. object shape) and viewpoint-dependent relational properties (e.g. 3D orientation relative to the viewer), are perceptually represented, and that properties of the proximal stimulus (in vision, the retinal image) are not.

Binocular stereo cues are important for discriminating 3D surface orientation, especially at near distances. We devised a single-interval task where observers discriminated the slant of a densely textured planar test surface relative to a textured planar surround reference surface. Although surfaces were rendered with correct perspective, the stimuli were designed so that the binocular cues dominated performance.

A goal of visual perception is to provide stable representations of task-relevant scene properties (e.g. object reflectance) despite variation in task-irrelevant scene properties (e.g. illumination).

Interocular differences in image blur can cause processing speed differences that lead to dramatic misperceptions of the distance and three-dimensional direction of moving objects. This recently discovered illusion, the reverse Pulfrich effect, is caused by optical conditions induced by monovision, a common correction for presbyopia. Fortunately, anti-Pulfrich monovision corrections, which darken the blurring lens, can eliminate the illusion for many viewing conditions.

Binocular fusion relies on matching points in the two eyes that correspond to the same physical feature in the world; however, not all world features are binocularly visible. Near depth edges, some regions of a scene are often visible to only one eye (so-called half occlusions). Accurate detection of these monocularly visible regions is likely to be important for stable visual perception.
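
A minimal sketch of one standard approach to half-occlusion detection (not necessarily the paper's algorithm): search for the best-correlated patch in the other eye's image, and flag pixels whose best interocular correlation stays low as likely half-occluded. The signals, window size, disparity range, and threshold below are invented for illustration.

```python
# Minimal sketch: flag pixels as half-occluded when no patch in the
# other eye's image correlates well with the local patch. Signals,
# window size, disparity range, and threshold are invented.
import numpy as np

rng = np.random.default_rng(3)
n, win, max_disp = 200, 11, 8
img_L = rng.standard_normal(n)
img_R = np.roll(img_L, 3)                 # global 3-pixel disparity
img_R[90:110] = rng.standard_normal(20)   # region visible to one eye only

def best_corr(center):
    patch_L = img_L[center - win // 2: center + win // 2 + 1]
    scores = []
    for d in range(-max_disp, max_disp + 1):
        patch_R = img_R[center + d - win // 2: center + d + win // 2 + 1]
        scores.append(np.corrcoef(patch_L, patch_R)[0, 1])
    return max(scores)

half_occluded = [c for c in range(20, n - 20) if best_corr(c) < 0.8]
print(f"pixels flagged as half-occluded: {half_occluded[:10]} ...")
```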

An ideal observer is a theoretical model observer that performs a specific sensory-perceptual task optimally, making the best possible use of the available information given physical and biological constraints. An image-computable ideal observer (pixels in, estimates out) is a particularly powerful type of ideal observer that explicitly models the flow of visual information from the stimulus-encoding process to the eventual decoding of a sensory-perceptual estimate. Image-computable ideal observer analyses underlie some of the most important results in vision science.

Visual systems estimate the three-dimensional (3D) structure of scenes from information in two-dimensional (2D) retinal images. Visual systems use multiple sources of information to improve the accuracy of these estimates, including statistical knowledge of the probable spatial arrangements of natural scenes. Here, we examine how 3D surface tilts are spatially related in real-world scenes, and show that humans pool information across space when estimating surface tilt in accordance with these spatial relationships.

A core goal of visual neuroscience is to predict human perceptual performance from natural signals. Performance in any natural task can be limited by at least three sources of uncertainty: stimulus variability, internal noise, and suboptimal computations. Determining the relative importance of these factors has been a focus of interest for decades but requires methods for predicting the fundamental limits imposed by stimulus variability on sensory-perceptual precision.
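
A toy bookkeeping example of the three limits named above: if the errors contributed by stimulus variability, internal noise, and suboptimal computations are independent, their variances add, and the ideal-to-human variance ratio gives an efficiency. The numbers are invented purely to illustrate the decomposition.

```python
# Toy decomposition: if errors from stimulus variability, internal
# noise, and suboptimal computations are independent, their variances
# add. All numbers are invented to illustrate the bookkeeping.
var_stimulus = 0.8    # irreducible: natural-stimulus variability
var_noise = 0.3       # internal (neural/motor) noise
var_suboptimal = 0.2  # extra error from suboptimal computations

var_human = var_stimulus + var_noise + var_suboptimal
efficiency = var_stimulus / var_human  # ideal-to-human variance ratio
print(f"human estimate variance: {var_human:.2f}")
print(f"efficiency relative to the ideal observer: {efficiency:.0%}")
```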

To model the responses of neurons in the early visual system, at least three basic components are required: a receptive field, a normalization term, and a specification of encoding noise. Here, we examine how the receptive field, the normalization factor, and the encoding noise affect the drive to model-neuron responses when stimulated with natural images. We show that when these components are modeled appropriately, the response drives elicited by natural stimuli are Gaussian-distributed and scale invariant, and very nearly maximize the sensitivity (d') for natural-image discrimination.
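
A minimal sketch of the three components named above, assuming a toy sinusoidal receptive field, a textbook divisive-normalization term, and additive Gaussian encoding noise, with d' computed for discriminating two stimulus ensembles:

```python
# Minimal sketch: receptive field + divisive normalization + encoding
# noise, with d' computed for two stimulus ensembles. RF shape, noise
# level, and stimuli are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
n_pix, n_trials, noise_sd = 64, 2000, 0.2

rf = np.sin(2 * np.pi * 3 * np.linspace(0, 1, n_pix))  # toy receptive field
rf /= np.linalg.norm(rf)

def response(stim):
    drive = rf @ stim                        # receptive-field drive
    norm = 0.1 + np.sqrt(np.mean(stim**2))   # divisive normalization factor
    return drive / norm + noise_sd * rng.standard_normal()  # encoding noise

stim_A = rng.standard_normal((n_trials, n_pix))  # ensemble A
stim_B = stim_A + 0.3 * rf                       # B adds RF-matched energy

r_A = np.array([response(s) for s in stim_A])
r_B = np.array([response(s) for s in stim_B])
d_prime = (r_B.mean() - r_A.mean()) / np.sqrt(0.5 * (r_A.var() + r_B.var()))
print(f"d' for discriminating B from A: {d_prime:.2f}")
```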

Monovision is a common prescription lens correction for presbyopia [1]. Each eye is corrected for a different distance, causing one image to be blurrier than the other. Millions of people have monovision corrections, but little is known about how interocular blur differences affect motion perception.

The human visual system supports stable percepts of object color even though the light that reflects from object surfaces varies significantly with the scene illumination. To understand the computations that support stable color perception, we study how estimating a target object's luminous reflectance factor (LRF; a measure of the light reflected from the object under a standard illuminant) depends on variation in key properties of naturalistic scenes. Specifically, we study how variation in target object reflectance, illumination spectra, and the reflectance of background objects in a scene impact estimation of a target object's LRF.

Local depth variation is a distinctive property of natural scenes, but its effects on perception have only recently begun to be investigated. Depth variation in natural scenes is due to depth edges between objects and surface nonuniformities within objects. Here, we demonstrate how natural depth variation impacts performance in two fundamental tasks related to stereopsis: half-occlusion detection and disparity detection.

Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having ground-truth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli.

Understanding how nervous systems exploit task-relevant properties of sensory stimuli to perform natural tasks is fundamental to the study of perceptual systems. However, there are few formal methods for determining which stimulus properties are most useful for a given natural task. As a consequence, it is difficult to develop principled models for how to compute task-relevant latent variables from natural signals, and it is difficult to evaluate descriptive models fit to neural responses.

Accuracy Maximization Analysis (AMA) is a recently developed Bayesian ideal observer method for task-specific dimensionality reduction. Given a training set of proximal stimuli (e.g. retinal images) labeled with the values of a task-relevant latent variable, AMA returns the encoding filters that are most useful for estimating that variable.
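
A heavily simplified toy version of the AMA idea: learn, by gradient descent on a labeled training set, the encoding filter whose responses best support decoding the latent variable. Real AMA uses a Bayes-optimal probabilistic decoder; the linear decoder and synthetic stimuli here are simplifying assumptions.

```python
# Heavily simplified AMA-style toy: learn an encoding filter whose
# responses best support decoding the latent variable. Real AMA uses a
# Bayes-optimal decoder; the linear decoder here is an assumption.
import numpy as np

rng = np.random.default_rng(5)
n_train, n_dim, lr = 500, 20, 0.01
latent = rng.uniform(-1, 1, n_train)                  # task-relevant variable
signal = np.outer(latent, np.linspace(-1, 1, n_dim))  # stimulus depends on it
stimuli = signal + 0.3 * rng.standard_normal((n_train, n_dim))

f = 0.1 * rng.standard_normal(n_dim)  # encoding filter (learned)
w = 1.0                               # linear decoder weight (co-adapted)
for _ in range(2000):
    err = w * (stimuli @ f) - latent           # decoding errors
    w -= lr * np.mean(err * (stimuli @ f))     # decoder gradient step
    f -= lr * w * (stimuli.T @ err) / n_train  # filter gradient step

rmse = np.sqrt(np.mean((w * (stimuli @ f) - latent) ** 2))
print(f"training RMSE of the decoded latent variable: {rmse:.3f}")
```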

Estimating three-dimensional (3D) surface orientation (slant and tilt) is an important first step toward estimating 3D shape. Here, we examine how three local image cues from the same location (disparity gradient, luminance gradient, and dominant texture orientation) should be combined to estimate 3D tilt in natural scenes. We collected a database of natural stereoscopic images with precisely co-registered range images that provide the ground-truth distance at each pixel location.
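
As a sketch of what combining such cues can look like, here is reliability-weighted averaging of three hypothetical tilt estimates; the cue values and weights are invented, and the paper evaluates combination rules against ground-truth natural-scene data rather than assuming this one. Because tilt is an angle, the cues are averaged as unit vectors.

```python
# Sketch of reliability-weighted cue combination for tilt. Cue values
# and reliabilities are hypothetical; angles are averaged as unit
# vectors because tilt is a circular variable.
import numpy as np

cues = {"disparity_gradient": 40.0,      # hypothetical tilt estimates, deg
        "luminance_gradient": 55.0,
        "texture_orientation": 30.0}
reliability = {"disparity_gradient": 0.60,  # hypothetical cue reliabilities
               "luminance_gradient": 0.25,
               "texture_orientation": 0.15}

angles = np.deg2rad([cues[k] for k in cues])
weights = np.array([reliability[k] for k in cues])
mean_vec = np.sum(weights * np.exp(1j * angles))
combined_tilt = np.rad2deg(np.angle(mean_vec))
print(f"combined tilt estimate: {combined_tilt:.1f} deg")
```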

Accurate perception of motion depends critically on accurate estimation of retinal motion speed. Here we first analyse natural image movies to determine the optimal space-time receptive fields (RFs) for encoding local motion speed in a particular direction, given the constraints of the early visual system. Next, from the RF responses to natural stimuli, we determine the neural computations that are optimal for combining and decoding the responses into estimates of speed.
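
A minimal sketch in the spirit of this pipeline: a small bank of space-time filters drifting at different preferred speeds, squared (energy) responses to a drifting stimulus, and a weighted-average readout across preferred speeds. The filter forms and readout are illustrative, not the optimal solution derived in the paper.

```python
# Minimal sketch: a bank of drifting space-time filters, squared
# (energy) responses, and a weighted-average speed readout. Filter
# forms and readout are illustrative, not the paper's optimal solution.
import numpy as np

n_x, n_t = 32, 32
x = np.linspace(-1, 1, n_x)[:, None]   # space
t = np.linspace(0, 1, n_t)[None, :]    # time

def drifting_filter(speed):
    # Space-time Gabor drifting at `speed` (quadrature pair).
    phase = 2 * np.pi * 4 * (x - speed * t)
    env = np.exp(-x**2 / 0.5)
    return env * np.cos(phase), env * np.sin(phase)

true_speed = 0.6
stimulus = np.cos(2 * np.pi * 4 * (x - true_speed * t))  # drifting grating

pref_speeds = np.linspace(0.2, 1.0, 9)
energies = np.array([
    np.sum(f_even * stimulus)**2 + np.sum(f_odd * stimulus)**2
    for f_even, f_odd in map(drifting_filter, pref_speeds)])

decoded = np.sum(pref_speeds * energies) / np.sum(energies)
print(f"decoded speed: {decoded:.2f} (true: {true_speed})")
```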

The lens system in the human eye can best focus light from only one distance at a time. Therefore, many objects in the natural environment are not imaged sharply on the retina. Furthermore, light from objects in the environment is subject to the particular aberrations of the observer's lens system.

We introduce a novel framework for estimating visual sensitivity using a continuous target-tracking task in concert with a dynamic internal model of human visual performance. Observers used a mouse cursor to track the center of a two-dimensional Gaussian luminance blob as it moved in a random walk in a field of dynamic additive Gaussian luminance noise. To estimate visual sensitivity, we fit a Kalman filter model to the human tracking data under the assumption that humans behave as Bayesian ideal observers.
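
A minimal sketch of the modeling setup this describes: a target position follows a random walk, observations are corrupted by Gaussian noise, and a scalar Kalman filter (the Bayesian ideal tracker for this generative model) produces the position estimates. The noise variances here are illustrative; in the paper they are fit to human tracking data.

```python
# Minimal sketch: random-walk target, noisy observations, and a scalar
# Kalman filter as the Bayesian ideal tracker. Noise variances are
# illustrative; in the paper they are fit to human tracking data.
import numpy as np

rng = np.random.default_rng(6)
n_steps, q, r = 500, 1.0, 4.0  # steps; walk variance q; obs variance r

target = np.cumsum(np.sqrt(q) * rng.standard_normal(n_steps))  # random walk
obs = target + np.sqrt(r) * rng.standard_normal(n_steps)       # noisy views

est, p, track = 0.0, 1.0, []   # posterior mean, posterior variance
for z in obs:
    p += q                     # predict: uncertainty grows with the walk
    k = p / (p + r)            # Kalman gain
    est += k * (z - est)       # update toward the new observation
    p *= 1 - k                 # posterior variance shrinks
    track.append(est)

rmse = np.sqrt(np.mean((np.array(track) - target)**2))
print(f"tracking RMSE: {rmse:.2f} (raw observation sd: {np.sqrt(r):.2f})")
```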

A great challenge of systems neuroscience is to understand the computations that underlie perceptual constancies, the ability to represent behaviorally relevant stimulus properties as constant even when irrelevant stimulus properties vary. As signals proceed through the visual system, neural states become more selective for properties of the environment, and more invariant to irrelevant features of the retinal images. Here, we describe a method for determining the computations that perform these transformations optimally, and apply it to the specific computational task of estimating a powerful depth cue: binocular disparity.
