Publications by authors named "Jan Melchior"

Article Synopsis
  • The study highlights problems caused by the derivative of the activation function in artificial neural networks, especially during continual learning, and introduces a new approach called Hebbian descent to address them.
  • Hebbian descent uses an alternative loss function whose gradient ignores the derivative of the activation function, which prevents vanishing error signals in both shallow and deep networks and makes training more effective (see the sketch below).
  • By combining Hebbian descent with centering, the method not only improves continual learning by reducing catastrophic interference but also performs on par with regular gradient descent in specific scenarios.
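The second point describes an update rule in which the error signal is applied without the activation derivative. Below is a minimal NumPy sketch of what such an update could look like for a single sigmoid layer; the function names, shapes, and the learning rate are illustrative assumptions, not the paper's reference implementation. Plain gradient descent on a squared error would additionally multiply the error by the sigmoid derivative y * (1 - y), which vanishes for saturated units.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def hebbian_descent_step(W, b, x, t, mu, eta=0.1):
        """One Hebbian-descent update for input x with target t.

        mu is a running mean of the inputs used for centering. The error
        (y - t) is applied directly, without the factor f'(a) that would
        appear in gradient descent, so it cannot vanish for saturated units.
        """
        xc = x - mu                     # centering: remove the input mean
        y = sigmoid(W @ xc + b)         # forward pass
        delta = y - t                   # error signal, no f'(a) factor
        W -= eta * np.outer(delta, xc)  # weight update
        b -= eta * delta                # bias update
        return W, b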

Based on the CRISP theory (Content Representation, Intrinsic Sequences, and Pattern completion), we present a computational model of the hippocampus that allows online one-shot storage of pattern sequences without the need for a consolidation process. Rather than storing a sequence in CA3 itself, our model hetero-associates the input sequence with a pre-trained intrinsic sequence provided by CA3. That is, plasticity on a short timescale occurs only in the incoming and outgoing connections of CA3, not in its recurrent connections; a schematic sketch of this storage scheme follows.
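As a schematic illustration of one-shot hetero-associative storage (a minimal sketch; the weight names, learning rate, and binary readout are assumptions for illustration, not the model's actual equations):

    import numpy as np

    def store_sequence(inputs, ca3_seq, W_in, W_out, eta=1.0):
        """One-shot Hebbian hetero-association of an input sequence.

        ca3_seq is the pre-trained intrinsic CA3 sequence; the recurrent
        CA3 weights that generate it are never modified here. Only the
        incoming (W_in) and outgoing (W_out) connections are plastic.
        """
        for x, s in zip(inputs, ca3_seq):
            W_in += eta * np.outer(s, x)   # input -> CA3 (incoming connections)
            W_out += eta * np.outer(x, s)  # CA3 -> output (outgoing connections)
        return W_in, W_out

    def recall_sequence(ca3_seq, W_out):
        # Assuming the fixed CA3 dynamics replay the intrinsic sequence,
        # the stored inputs are read out through the outgoing connections.
        return [np.sign(W_out @ s) for s in ca3_seq]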


Episodic memories have been suggested to be represented by neuronal sequences that are stored in and retrieved from the hippocampal circuit. A particular difficulty is that realistic neuronal sequences are strongly correlated with each other, while computational memory models generally perform poorly when correlated patterns are stored; the toy example below illustrates the effect. Here, we study in a computational model under which conditions the hippocampal circuit can perform this function robustly.
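To make the difficulty concrete, here is a toy Hopfield-style demonstration (not the paper's hippocampal model; the similarity parameterization is an assumption for illustration): as stored patterns become more similar, the crosstalk between them grows and even one-step retrieval degrades.

    import numpy as np

    rng = np.random.default_rng(0)
    N, P = 500, 20  # neurons, stored patterns

    def make_patterns(similarity):
        # Each bit matches a shared template with probability
        # (1 + similarity) / 2, so similarity = 0 gives independent patterns.
        template = rng.choice([-1, 1], size=N)
        flips = rng.random((P, N)) > (1 + similarity) / 2
        return np.where(flips, -template, template)

    for similarity in (0.0, 0.3, 0.6):
        xi = make_patterns(similarity)
        W = (xi.T @ xi) / N            # Hebbian outer-product storage
        np.fill_diagonal(W, 0)
        recalled = np.sign(W @ xi[0])  # one retrieval step, cued with pattern 0
        accuracy = np.mean(recalled == xi[0])
        print(f"similarity {similarity:.1f}: retrieval accuracy {accuracy:.2f}")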


We present a theoretical analysis of Gaussian-binary restricted Boltzmann machines (GRBMs) from the perspective of density models. The key result of this analysis is that GRBMs can be formulated as a constrained mixture of Gaussians, which gives much better insight into the model's capabilities and limitations; the mixture view is sketched below. We further show that GRBMs are capable of learning meaningful features without a regularization term, and that the results are comparable to those of independent component analysis.
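As a sketch of the mixture view under one common GRBM parameterization (the paper's sign and scaling conventions may differ), completing the square in v shows that each binary hidden configuration contributes one Gaussian component:

    \[
    E(\mathbf{v},\mathbf{h})
      = \frac{\lVert \mathbf{v} - \mathbf{b} \rVert^{2}}{2\sigma^{2}}
        - \mathbf{c}^{\top}\mathbf{h}
        - \frac{\mathbf{v}^{\top} W \mathbf{h}}{\sigma^{2}},
    \qquad
    p(\mathbf{v})
      = \sum_{\mathbf{h} \in \{0,1\}^{H}} p(\mathbf{h})\,
        \mathcal{N}\!\bigl(\mathbf{v};\, \mathbf{b} + W\mathbf{h},\, \sigma^{2} I\bigr).
    \]

All 2^H components share the same isotropic covariance sigma^2 I, and their means are constrained to b plus sums of subsets of the columns of W, which is the sense in which the GRBM is a constrained mixture of Gaussians.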
