A serious obstacle to understanding early spatial vision is the failure of the so-called standard model (SM) to predict the perception of plaid masking. But the SM originated from a major oversimplification of single-neuron computations, ignoring fundamental properties of dendrites. Here we show that a spatial vision model including computations mimicking the input-dependent nature of dendritic nonlinearities, i.
One well-established characteristic of early visual processing is the contrast sensitivity function (CSF), which describes how sensitivity varies with the spatial frequency (SF) content of the visual input. The CSF prompted the development of a now-standard model of spatial vision. It represents the visual input by activity in orientation- and SF-selective channels, which are nonlinearly recombined to predict a perceptual decision.
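The band-pass shape of the CSF can be sketched with a common parametric form, S(f) = gain * f^a * exp(-b*f); the parameter values below are illustrative assumptions, not fitted to any data set:

```python
import numpy as np

def csf(f, gain=300.0, a=0.9, b=0.7):
    """Band-pass contrast sensitivity, S(f) = gain * f**a * exp(-b*f).

    A common parametric form; gain, a, and b here are illustrative,
    not values fitted to any particular observer or data set.
    """
    f = np.asarray(f, dtype=float)
    return gain * f**a * np.exp(-b * f)

freqs = np.logspace(-0.5, 1.5, 50)   # ~0.3 to ~30 cycles per degree
sens = csf(freqs)
peak_f = freqs[np.argmax(sens)]      # analytic peak at f = a/b, ~1.3 cpd here
```

The form is band-pass: sensitivity rises with frequency at low SF, peaks at f = a/b, and falls off exponentially at high SF.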
The eye's natural aging influences our ability to focus on close objects. Without optical correction, all adults will suffer from blurry close vision starting in their 40s. Consequently, different optical corrections are necessary for near and far vision.
Neither the hype exemplified in some exaggerated claims about deep neural networks (DNNs), nor the gloom expressed by Bowers et al., does justice to DNNs as models in vision science: DNNs evolve rapidly, and today's limitations are often tomorrow's successes. In addition, providing explanations as well as prediction and image-computability are model desiderata; none should be favoured at the expense of the others.
In laboratory object recognition tasks based on undistorted photographs, both adult humans and deep neural networks (DNNs) perform close to ceiling. Unlike adults, whose object recognition performance is robust against a wide range of image distortions, DNNs trained on standard ImageNet (1.3M images) perform poorly on distorted images.
Annu Rev Vis Sci
September 2023
Deep neural networks (DNNs) are machine learning algorithms that have revolutionized computer vision due to their remarkable successes in tasks like object classification and segmentation. The success of DNNs as computer vision algorithms has led to the suggestion that DNNs may also be good models of human visual perception. In this article, we review evidence regarding current DNNs as adequate behavioral models of human core object recognition.
Vision researchers are interested in mapping complex physical stimuli to perceptual dimensions. Such a mapping can be constructed using multidimensional psychophysical scaling or ordinal embedding methods. Both methods infer coordinates that agree as much as possible with the observer's judgments so that perceived similarity corresponds with distance in the inferred space.
Color constancy is our ability to perceive constant colors across varying illuminations. Here, we trained deep neural networks to be color constant and evaluated their performance with varying cues. Inputs to the networks consisted of two-dimensional images of simulated cone excitations derived from three-dimensional (3D) rendered scenes of 2,115 different 3D shapes, with spectral reflectances of 1,600 different Munsell chips, illuminated under 278 different natural illuminations.
In this article, we address the problem of measuring and analyzing sensation, the subjective magnitude of one's experience. We do this in the context of the method of triads: The sensation of the stimulus is evaluated via relative judgments of the following form: "Is stimulus \(S_i\) more similar to stimulus \(S_j\) or to stimulus \(S_k\)?" We propose to use ordinal embedding methods from machine learning to estimate the scaling function from the relative judgments. We review two relevant and well-known methods in psychophysics that are partially applicable in our setting: nonmetric multidimensional scaling (NMDS) and the method of maximum likelihood difference scaling (MLDS).
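As a minimal illustration of the ordinal-embedding idea, the sketch below recovers a one-dimensional scale from simulated triad judgments by subgradient descent on a hinge loss over the triplet constraints. The stimulus values, loss, and step sizes are all illustrative assumptions, not the specific method proposed in the article:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth 1-D "sensation" values for 8 stimuli (illustrative).
true_scale = np.sort(rng.uniform(0.0, 1.0, 8))

# Simulated triad judgments: for (i, j, k), record whether S_i is
# closer to S_j than to S_k on the true scale (noiseless observer).
triads = []
for _ in range(400):
    i, j, k = rng.choice(8, size=3, replace=False)
    closer = abs(true_scale[i] - true_scale[j]) < abs(true_scale[i] - true_scale[k])
    triads.append((i, j, k, closer))

def count_violations(x):
    """Number of triads whose ordering the embedding x gets wrong."""
    v = 0
    for i, j, k, closer in triads:
        if not closer:
            j, k = k, j          # canonicalize: i should be closer to j
        if (x[i] - x[j]) ** 2 >= (x[i] - x[k]) ** 2:
            v += 1
    return v

# Fit 1-D coordinates so that, for each canonicalized triad,
# d(i, j)^2 + margin < d(i, k)^2 (hinge loss, subgradient descent).
x = rng.normal(0.0, 0.1, 8)
before = count_violations(x)
lr, margin = 0.05, 0.01
for _ in range(300):
    grad = np.zeros(8)
    for i, j, k, closer in triads:
        if not closer:
            j, k = k, j
        d_ij, d_ik = x[i] - x[j], x[i] - x[k]
        if d_ij ** 2 - d_ik ** 2 + margin > 0:   # constraint violated
            grad[i] += 2 * d_ij - 2 * d_ik
            grad[j] += -2 * d_ij
            grad[k] += 2 * d_ik
    x -= lr * grad / len(triads)
after = count_violations(x)
```

The recovered coordinates are identified only up to shift, scale, and reflection, which is the usual indeterminacy of ordinal embeddings.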
One of the most important tasks for humans is the attribution of causes and effects in all walks of life. The first systematic study of the visual perception of causality was done by Albert Michotte using his now well-known paradigm. Launching events are the seeming collision and seeming transfer of movement between two objects: abstract, featureless stimuli ("objects") in Michotte's original experiments.
Atten Percept Psychophys
November 2019
We discovered an error in the implementation of the function used to generate radial frequency (RF) distortions in our article (Wallis, Tobias, Bethge, & Wichmann, 2017).
Scene viewing is used to study attentional selection in complex but still controlled environments. One of the main observations on eye movements during scene viewing is the inhomogeneous distribution of fixation locations: While some parts of an image are fixated by almost all observers and are inspected repeatedly by the same observer, other image parts remain unfixated by observers even after long exploration intervals. Here, we apply spatial point process methods to investigate the relationship between pairs of fixations.
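Spatial point process statistics of this kind can be illustrated with a naive Ripley's K estimate. In the sketch below, a simulated homogeneous Poisson pattern stands in for real fixation data, and no edge correction is applied; both are simplifying assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated "fixation locations": a homogeneous Poisson process in
# the unit square stands in for real scene-viewing data.
n = 500
pts = rng.uniform(0.0, 1.0, size=(n, 2))

def ripley_k(points, r):
    """Naive estimate of Ripley's K at distance r (no edge correction).

    K(r) is the expected number of further points within distance r of
    a typical point, normalized by intensity; under complete spatial
    randomness K(r) is approximately pi * r**2.
    """
    m = len(points)
    area = 1.0                     # unit square
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    close = (d < r).sum() - m      # exclude self-pairs (d == 0)
    intensity = m / area
    return close / (m * intensity)

r = 0.05
k_hat = ripley_k(pts, r)           # close to pi * r**2 for a Poisson pattern
```

For clustered fixation patterns, as typically found in scene viewing, the estimate would exceed pi * r**2 at short distances.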
We subjectively perceive our visual field with high fidelity, yet peripheral distortions can go unnoticed and peripheral objects can be difficult to identify (crowding). Prior work showed that humans could not discriminate images synthesised to match the responses of a mid-level ventral visual stream model when information was averaged in receptive fields with a scaling of about half their retinal eccentricity. This result implicated ventral visual area V2, approximated 'Bouma's Law' of crowding, and has subsequently been interpreted as a link between crowding zones, receptive field scaling, and our perceptual experience.
Bottom-up and top-down as well as low-level and high-level factors influence where we fixate when viewing natural scenes. However, the importance of each of these factors and how they interact remains a matter of debate. Here, we disentangle these factors by analyzing their influence over time.
When searching for a target in a natural scene, it has been shown that both the target's visual properties and its similarity to the background influence whether and how fast humans are able to find it. So far, it has been unclear whether searchers adjust the dynamics of their eye movements (e.g.
When watching the image of a natural scene on a computer screen, observers initially move their eyes toward the center of the image, a reliable experimental finding termed the central fixation bias. This systematic tendency in eye guidance likely masks attentional selection driven by image properties and top-down cognitive processes. Here, we show that the central fixation bias can be reduced by delaying the initial saccade relative to image onset.
A large part of classical visual psychophysics was concerned with the fundamental question of how pattern information is initially encoded in the human visual system. From these studies a relatively standard model of early spatial vision emerged, based on spatial-frequency- and orientation-specific channels followed by an accelerating nonlinearity and divisive normalization: contrast gain control. Here we implement such a model in an image-computable way, allowing it to take arbitrary luminance images as input.
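The accelerating nonlinearity plus divisive normalization stage can be sketched as follows; the exponents and semi-saturation constant are illustrative assumptions, not the fitted values from the article:

```python
import numpy as np

def gain_control(c, p=2.4, q=2.0, sigma=0.1):
    """Contrast gain control across channels:

        R_i = |c_i|**p / (sigma**q + sum_j |c_j|**q)

    An accelerating nonlinearity (exponent p) divided by pooled channel
    activity. p, q, and sigma are illustrative, not fitted values.
    """
    c = np.abs(np.asarray(c, dtype=float))
    return c**p / (sigma**q + (c**q).sum())

low, high = 0.01, 10.0
# At low contrast, sigma dominates the pool: responses accelerate.
accel = gain_control([2 * low, 2 * low])[0] / gain_control([low, low])[0]
# At high contrast, the pool dominates: responses compress (saturate).
compress = gain_control([2 * high, 2 * high])[0] / gain_control([high, high])[0]
```

The signature behavior: doubling a weak input more than doubles the response (acceleration), while doubling a strong input less than doubles it (compression), which is what lets such models capture masking data.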
Our visual environment is full of texture: "stuff" like cloth, bark, or gravel, as distinct from "things" like dresses, trees, or paths, and humans are adept at perceiving subtle variations in material properties. To investigate image features important for texture perception, we psychophysically compare a recent parametric model of texture appearance (the convolutional neural network [CNN] model), which uses the features encoded by a deep CNN (VGG-19), with two other models: the venerable Portilla and Simoncelli model and an extension of the CNN model in which the power spectrum is additionally matched. Observers discriminated model-generated textures from original natural textures in a spatial three-alternative oddity paradigm under two viewing conditions: when test patches were briefly presented to the near periphery ("parafoveal") and when observers were able to make eye movements to all three patches ("inspection").
Dynamical models of cognition play an increasingly important role in driving theoretical and experimental research in psychology. Therefore, parameter estimation, model analysis and comparison of dynamical models are of essential importance. In this article, we propose a maximum likelihood approach for model analysis in a fully dynamical framework that includes time-ordered experimental data.
Maximum likelihood difference scaling (MLDS) is a method for the estimation of perceptual scales based on the judgment of differences in stimulus appearance (Maloney & Yang, 2003). MLDS has recently also been used to estimate near-threshold discrimination performance (Devinck & Knoblauch, 2012). Using MLDS as a psychophysical method for sensitivity estimation is potentially appealing, because MLDS has been reported to need less data than forced-choice procedures, and, in particular, naive observers report preferring suprathreshold comparisons to JND-style threshold tasks.
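The MLDS decision model can be sketched as follows: for a quadruple of stimuli (a, b; c, d), the observer judges which interval has the larger perceived difference, with Gaussian noise on the internal scale. The internal scale and noise level below are illustrative assumptions, not values from the cited studies:

```python
import numpy as np
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def mlds_choice_prob(psi, quad, sigma=0.3):
    """MLDS decision model for a quadruple (a, b; c, d): the probability
    the observer judges interval (a, b) as the larger perceived
    difference, under Gaussian noise on the internal scale psi
    (the decision rule of Maloney & Yang, 2003). sigma is an
    illustrative noise level, not a fitted value.
    """
    a, b, c, d = quad
    delta = abs(psi[b] - psi[a]) - abs(psi[d] - psi[c])
    return phi(delta / sigma)

# A compressive internal scale over 5 equally spaced stimulus levels
# (illustrative): equal physical steps give shrinking perceptual steps.
psi = np.sqrt(np.linspace(0.0, 1.0, 5))
p_low = mlds_choice_prob(psi, (0, 1, 2, 3))    # low interval looks larger
p_high = mlds_choice_prob(psi, (2, 3, 0, 1))   # complementary judgment
```

Fitting MLDS amounts to choosing psi (and sigma) to maximize the likelihood of the observed quadruple judgments under this choice probability.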
Atten Percept Psychophys
April 2017
When visual features in the periphery are close together, they become difficult to recognize: something is present, but it is unclear what. This is called "crowding". Here we investigated sensitivity to features in highly familiar shapes (letters) by applying spatial distortions.
During scene perception our eyes generate complex sequences of fixations. Predictors of fixation locations include bottom-up factors such as luminance contrast, top-down factors such as viewing instruction, and systematic biases, e.g.
Gestalt psychology is often criticized as lacking quantitative measurements and precise mathematical models. While this is true of the early Gestalt school, today there are many quantitative approaches in Gestalt perception, and the special issue of Vision Research, "Quantitative Approaches in Gestalt Perception", showcases the current state of the art. In this article we give an overview of these current approaches.