Predictive coding, currently a highly influential theory in neuroscience, has not yet been widely adopted in machine learning. In this work, we transform the seminal model of Rao and Ballard (1999) into a modern deep learning framework while remaining maximally faithful to the original schema. The resulting network, PreCNet, is tested on a widely used next-frame video prediction benchmark, which consists of images from an urban environment recorded by a car-mounted camera, and achieves state-of-the-art performance. Performance on all measures (MSE, PSNR, and SSIM) improved further when a larger training set (2M images from BDD100k) was used, pointing to the limitations of the KITTI training set. This work demonstrates that an architecture carefully based on a neuroscience model, without being explicitly tailored to the task at hand, can exhibit exceptional performance.
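To make the underlying mechanism concrete, here is a minimal sketch of a single Rao and Ballard (1999)-style predictive coding layer: a top-down prediction is subtracted from the input, and the resulting error drives both inference and learning. This illustrates the general scheme only, not the PreCNet architecture; the layer sizes, learning rates, and linear generative model are assumptions.

```python
# Minimal sketch of one Rao & Ballard-style predictive coding layer.
# Illustrative only: sizes, learning rates, and the linear generative
# model are assumptions, not the PreCNet implementation.
import numpy as np

def pc_step(r, U, x, lr_r=0.05, lr_U=0.001):
    """One inference/learning step; descends E = 0.5 * ||x - U r||^2."""
    pred = U @ r                      # top-down prediction of the input
    err = x - pred                    # bottom-up prediction error
    r = r + lr_r * (U.T @ err)        # gradient step on the representation
    U = U + lr_U * np.outer(err, r)   # Hebbian-like weight update
    return r, U, err

rng = np.random.default_rng(0)
x = rng.normal(size=64)                     # one flattened "input frame"
r = np.zeros(16)                            # latent causes
U = rng.normal(scale=0.1, size=(64, 16))    # generative weights
for _ in range(50):                         # iterate inference toward convergence
    r, U, err = pc_step(r, U, x)
print(f"residual error norm: {np.linalg.norm(err):.3f}")
```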
DOI: http://dx.doi.org/10.1109/TNNLS.2023.3240857
PLoS Comput Biol
January 2025
Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
Sensory neurons continually adapt their response characteristics according to recent stimulus history. However, it is unclear how such a reactive process can benefit the organism. Here, we test the hypothesis that adaptation actually acts proactively in the sense that it optimally adjusts sensory encoding for future stimuli.
Nat Comput Sci
December 2024
Department of Neural Dynamics and Magnetoencephalography, Hertie Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany.
Understanding how visual information is encoded in biological and artificial systems often requires the generation of appropriate stimuli to test specific hypotheses, but available methods for video generation are scarce. Here we introduce the spatiotemporal style transfer (STST) algorithm, a dynamic visual stimulus generation framework that allows the manipulation and synthesis of video stimuli for vision research. We show how stimuli can be generated that match the low-level spatiotemporal features of their natural counterparts, but lack their high-level semantic features, providing a useful tool to study object recognition.
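As a rough illustration of the style-transfer machinery such a framework builds on, the sketch below matches second-order (Gram) statistics of a synthesized video to those of a target by direct pixel optimization. Raw pixels stand in for the deep spatiotemporal features used in practice; none of this is the published STST code.

```python
# Hedged sketch: matching low-level statistics via Gram matrices, in the
# spirit of style transfer. Raw pixels stand in for deep spatiotemporal
# features; this is not the published STST implementation.
import torch

def gram(feats):
    f = feats.flatten(start_dim=1)      # (channels, T*H*W)
    return f @ f.T / f.shape[1]         # channel covariance ("style")

def style_loss(synth, target):
    return torch.mean((gram(synth) - gram(target)) ** 2)

target = torch.rand(3, 8, 64, 64)                       # C, T, H, W clip
video = torch.randn(3, 8, 64, 64, requires_grad=True)   # synthesized stimulus
opt = torch.optim.Adam([video], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    style_loss(video, target).backward()
    opt.step()
```

Because only summary statistics are matched, the optimized video can share a target's low-level structure while its semantic content drifts freely.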
IEEE Trans Image Process
April 2024
Current semi-supervised video object segmentation (VOS) methods often use the entire feature map of a frame to predict object masks and update memory, which introduces significant redundant computation. To reduce this redundancy, we introduce a Region Aware Video Object Segmentation (RAVOS) approach, which predicts regions of interest (ROIs) for efficient object segmentation and memory storage.
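A toy version of the region-aware idea is sketched below: derive an ROI from the previous frame's mask and decode or store features only inside that crop. The naive bounding-box-plus-margin ROI predictor here is an assumption, not RAVOS's object-motion-based predictor.

```python
# Toy sketch of ROI-restricted processing for video object segmentation.
# The bounding-box-plus-margin ROI predictor is an assumption, not the
# RAVOS object-motion-based predictor.
import torch

def roi_from_mask(mask, margin=8):
    ys, xs = torch.nonzero(mask, as_tuple=True)
    h, w = mask.shape
    return (max(ys.min().item() - margin, 0), min(ys.max().item() + margin, h),
            max(xs.min().item() - margin, 0), min(xs.max().item() + margin, w))

prev_mask = torch.zeros(480, 854, dtype=torch.bool)
prev_mask[200:260, 300:400] = True            # object in the previous frame
y0, y1, x0, x1 = roi_from_mask(prev_mask)

frame_feats = torch.randn(256, 480, 854)      # per-frame backbone features
roi_feats = frame_feats[:, y0:y1, x0:x1]      # decode / memorize only this crop
print(roi_feats.shape)                        # a fraction of the full frame
```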
Entropy (Basel)
November 2023
Language Intelligence and Information Retrieval (LIIR) Lab, Department of Computer Science KU Leuven, 3001 Leuven, Belgium.
An important challenge in machine learning is achieving good accuracy when few training samples are available from the target distribution. If a large number of training samples from a related distribution are available, transfer learning can improve performance. This paper investigates how to transfer more effectively when the source and target distributions are related through a Sparse Mechanism Shift, for the application of next-frame prediction.
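One way to picture this setting (a sketch under the hypothetical assumption that the shifted mechanism maps onto an identifiable parameter subset): freeze the parameters shared between source and target, and fine-tune only the module corresponding to the shifted mechanism. The architecture and the choice of shifted module below are illustrative, not the paper's implementation.

```python
# Hypothetical sketch of adapting under a sparse mechanism shift: keep
# shared mechanisms frozen, fine-tune only the (assumed) shifted one.
import torch.nn as nn

model = nn.Sequential(                      # pretrained on the source distribution
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),         # head predicting the next frame
)

for p in model.parameters():                # freeze everything...
    p.requires_grad = False
for p in model[-1].parameters():            # ...except the assumed shifted mechanism
    p.requires_grad = True

n_train = sum(p.numel() for p in model.parameters() if p.requires_grad)
n_total = sum(p.numel() for p in model.parameters())
print(f"fine-tuning {n_train} of {n_total} parameters")
```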
Entropy (Basel)
October 2023
Department of Computer Science, University of California, Irvine, CA 92697, USA.
Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression.
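A stripped-down sketch of such autoregressive sampling is shown below: each new frame starts as noise and is denoised by a standard DDPM reverse loop conditioned on the frames generated so far. The `denoiser` stub, noise schedule, and frame shape are placeholders, not the paper's model.

```python
# Stripped-down autoregressive video sampling with a DDPM-style reverse
# process. `denoiser`, the schedule, and shapes are placeholders.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def sample_next_frame(denoiser, context, shape=(3, 64, 64)):
    x = torch.randn(shape)                        # start from pure noise
    for t in reversed(range(T)):
        eps = denoiser(x, t, context)             # noise prediction, conditioned on past frames
        mean = (x - betas[t] / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x

denoiser = lambda x, t, ctx: torch.zeros_like(x)  # dummy stand-in network
frames = [torch.rand(3, 64, 64)]                  # seed frame
for _ in range(3):                                # roll the clip out frame by frame
    frames.append(sample_next_frame(denoiser, frames))
```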