Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described, followed by a computational approach, built on that neurophysiology, to how such invariant representations are formed in the brain. The core of the approach is a feature hierarchy model, VisNet, in which invariant representations are built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning, which does not require a temporal trace. As a model of visual processing in the ventral cortical stream, it can build representations of objects that are invariant with respect to translation, view, size, and lighting. The model has been extended to account for invariant representations in the dorsal visual system of the global motion produced by objects, such as looming, rotation, and object-based movement. It has also been extended to incorporate top-down feedback connections that model the control of attention by biased competition in, for example, spatial and object search tasks. Further extensions account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. Finally, with an additional layer, the approach has been extended to the development of representations of spatial scenes of the type found in the hippocampus.
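The abstract does not give the equations of the trace learning rule, so the following is only a minimal sketch of one commonly used form of such a rule, not the VisNet implementation. It assumes a single linear neuron, a trace that blends current firing with the previous trace, and weight-vector normalisation. The function names, the parameters `eta` and `alpha`, and the toy input vectors are all hypothetical.

```python
import math


def activation(weights, inputs):
    """Linear firing rate of a single model neuron (simplifying assumption)."""
    return sum(w * x for w, x in zip(weights, inputs))


def trace_update(weights, inputs, output, prev_trace, eta=0.8, alpha=0.1):
    """One step of a trace-style associative update (illustrative form only)."""
    # Short-term memory trace: mix the current firing with the previous trace.
    trace = (1.0 - eta) * output + eta * prev_trace
    # Associative change: postsynaptic trace times presynaptic input.
    new_weights = [w + alpha * trace * x for w, x in zip(weights, inputs)]
    # Normalise the weight vector so that the weights stay bounded.
    norm = math.sqrt(sum(w * w for w in new_weights))
    if norm > 0.0:
        new_weights = [w / norm for w in new_weights]
    return new_weights, trace


def train_on_sequence(views, epochs=20):
    """Present the views in temporal order, as if one object were transforming."""
    # Hypothetical starting point: the neuron responds to the first view only.
    weights = [1.0] + [0.0] * (len(views[0]) - 1)
    trace = 0.0
    for _ in range(epochs):
        for x in views:
            y = activation(weights, x)
            weights, trace = trace_update(weights, x, y, trace)
    return weights


if __name__ == "__main__":
    # Three non-overlapping toy "views" of one object; the fourth input is never shown.
    views = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    print(train_on_sequence(views))
```

In this toy setting the trace left by the first view is still non-zero when the second and third views arrive, so their inputs are also strengthened onto the same neuron. The input that never appears in the sequence gains no weight. Continuous spatial transformation learning, which the abstract says needs no temporal trace, is not covered by this sketch.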
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3378046 | PMC |
http://dx.doi.org/10.3389/fncom.2012.00035 | DOI Listing |
Uncertain Artif Intell
January 2025
Department of Computer Science and Engineering, The Ohio State University, USA.
Although recent advances in machine learning have shown success in learning from independent and identically distributed (IID) data, such models remain vulnerable to out-of-distribution (OOD) data in an open world. Domain generalization (DG) addresses this issue by aiming to learn a model from multiple source domains that generalizes to unseen target domains. Existing studies on DG have largely focused on stationary settings with homogeneous source domains.
Sensory stimuli vary across a variety of dimensions, like contrast, orientation, or texture. The brain must rely on population representations to disentangle changes in one dimension from changes in another. To understand how the visual system might extract separable stimulus representations, we recorded multiunit neuronal responses to texture images varying along two dimensions: contrast, a property represented as early as the retina, and naturalistic statistical structure, a property that modulates neuronal responses in V2 and V4, but not in V1.
J Phys Chem Lett
January 2025
Department of Physics, Rutgers University, Newark, New Jersey 07102, United States of America.
Graph Neural Networks (GNNs) have emerged as powerful tools for predicting material properties, yet they often struggle to capture many-body interactions and require extensive manual feature engineering. Here, we present EOSnet (Embedded Overlap Structures for Graph Neural Networks), a novel approach that addresses these limitations by incorporating Gaussian Overlap Matrix (GOM) fingerprints as node features within the GNN architecture. Unlike models that rely on explicit angular terms or human-engineered features, EOSnet efficiently encodes many-body interactions through orbital overlap matrices, providing a rotationally invariant and transferable representation of atomic environments.
Proc SIGCHI Conf Hum Factor Comput Syst
May 2024
Stony Brook University, USA.
Hand gestures provide an alternate interaction modality for blind users and can be supported using commodity smartwatches without requiring specialized sensors. The enabling technology is an accurate gesture recognition algorithm, but almost all algorithms are designed for sighted users. Our study shows that blind users' gestures are considerably different from those of sighted users, rendering current recognition algorithms unsuitable.
Biol Imaging
November 2024
IBENS, Ecole Normale Supérieure PSL, Paris, 75005, France.
Self-supervised representation learning (SSRL) in computer vision relies heavily on simple image transformations such as random rotation, crops, or illumination to learn meaningful and invariant features. Despite acknowledged importance, there is a lack of comprehensive exploration of the impact of transformation choice in the literature. Our study delves into this relationship, specifically focusing on microscopy imaging with subtle cell phenotype differences.