We consider a three-layer Sejnowski machine and show that features learnt via contrastive divergence have a dual representation as patterns in a dense associative memory of order P=4. The latter is known to be able to store, via Hebbian learning, a number of patterns scaling as N^{P-1}, where N denotes the number of constituent binary neurons interacting P-wise. We also prove that, by keeping the dense associative network far from the saturation regime (namely, allowing for a number of patterns scaling only linearly with N, while P>2), such a system is able to perform pattern recognition far below the standard signal-to-noise threshold. In particular, a network with P=4 is able to retrieve information whose intensity is O(1) even in the presence of noise of order O(\sqrt{N}) in the large-N limit. This striking skill stems from a redundant representation of patterns, which is made possible by the (relatively) low-load information storage, and it helps explain the impressive pattern-recognition abilities exhibited by new-generation neural networks. The whole theory is developed rigorously, at the replica-symmetric level of approximation, and corroborated by signal-to-noise analysis and Monte Carlo simulations.
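As a rough illustration of the low-load regime described above, the following is a minimal sketch of zero-temperature retrieval in a P=4 Hebbian dense associative memory. It is not the paper's replica-symmetric analysis or its noisy-pattern setting. The function names (`hebbian_field`, `retrieve`), the network size, the load K = N/10, and the 30% corruption level are all assumptions chosen for illustration.

```python
import numpy as np

def hebbian_field(patterns, sigma, p=4):
    """Local field of a P-body Hebbian network: h_i = sum_mu xi_i^mu * m_mu^(p-1)."""
    n = sigma.size
    overlaps = patterns @ sigma / n          # Mattis overlaps m_mu, shape (K,)
    return patterns.T @ overlaps ** (p - 1)  # shape (N,)

def retrieve(patterns, sigma, p=4, steps=5):
    """Synchronous zero-temperature dynamics: align each neuron with its field."""
    for _ in range(steps):
        h = hebbian_field(patterns, sigma, p)
        sigma = np.where(h >= 0, 1, -1)
    return sigma

# Illustrative sizes (assumptions): N neurons, K patterns with K linear in N.
rng = np.random.default_rng(0)
n, k = 400, 40
patterns = rng.choice([-1, 1], size=(k, n))

# Corrupt the first pattern by flipping roughly 30% of its entries.
flip = rng.random(n) < 0.3
cue = np.where(flip, -patterns[0], patterns[0])

out = retrieve(patterns, cue)
overlap = float(patterns[0] @ out) / n       # 1.0 means perfect retrieval
print(f"overlap with stored pattern: {overlap:.3f}")
```

With the load kept linear in N, the cubed overlaps of the non-cued patterns contribute only a small crosstalk term to each local field, so the cued pattern dominates the dynamics even from a heavily corrupted starting point.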
DOI: http://dx.doi.org/10.1103/PhysRevLett.124.028301