Publications by authors named "R Aiudi"

Empirical evidence shows that fully-connected neural networks in the infinite-width limit (lazy training) eventually outperform their finite-width counterparts in most computer vision tasks; on the other hand, modern architectures with convolutional layers often achieve optimal performance in the finite-width regime. In this work, we present a theoretical framework that provides a rationale for these differences in one-hidden-layer networks: we derive an effective action in the so-called proportional limit for an architecture with one convolutional hidden layer and compare it with the result available for fully-connected networks. Remarkably, we identify a completely different form of kernel renormalization: whereas the kernel of the fully-connected architecture is just globally renormalized by a single scalar parameter, the convolutional kernel undergoes a local renormalization, meaning that the network can select the local components that will contribute to the final prediction in a data-dependent way.
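As a schematic illustration of the distinction (the notation below is assumed here, not quoted from the abstract): writing K for the prior kernel of the corresponding infinite-width network and K_i for its restriction to the i-th patch seen by the convolutional filters, the two forms of renormalization can be sketched as

```latex
% Fully-connected: a single scalar \bar{Q} rescales the whole kernel
K^{\mathrm{FC}}_{R}(x, x') \;=\; \bar{Q}\, K(x, x')

% Convolutional: one scalar \bar{Q}_i per local patch i, so each local
% component K_i is reweighted independently, in a data-dependent way
K^{\mathrm{CNN}}_{R}(x, x') \;=\; \sum_{i} \bar{Q}_i\, K_i(x, x')
```

In the global case a single order parameter multiplies the kernel, whereas in the local case the set of patch-wise parameters lets the network up- or down-weight individual local components of the input.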


We perform accurate numerical experiments with fully-connected one-hidden-layer neural networks trained with discretized Langevin dynamics on the MNIST and CIFAR-10 datasets. Our goal is to empirically determine the regimes of validity of a recently derived Bayesian effective action for shallow architectures in the proportional limit. We explore the predictive power of the theory as a function of the parameters (the temperature T, the magnitudes of the Gaussian priors λ_1 and λ_0, the size of the hidden layer N_1, and the size of the training set P) by comparing the experimental and predicted generalization error.
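A minimal sketch of what "discretized Langevin dynamics" means in this setting, assuming a standard Euler-Maruyama discretization (drift given by the gradient of the regularized loss, plus Gaussian noise of variance 2T·dt); the data, activation function, scalings, and hyperparameter values below are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

# One-hidden-layer network trained with discretized Langevin dynamics:
#   theta <- theta - dt * grad(L) + sqrt(2*T*dt) * xi,   xi ~ N(0, I)
# where L is the squared error plus Gaussian priors of magnitude
# lambda1 (hidden weights) and lambda0 (readout weights).

rng = np.random.default_rng(0)

P, D, N1 = 200, 50, 100          # training set size, input dim, hidden width N_1
T, dt, steps = 1e-3, 1e-3, 5000  # temperature, time step, number of steps
lam1, lam0 = 1.0, 1.0            # Gaussian prior magnitudes

X = rng.standard_normal((P, D)) / np.sqrt(D)   # placeholder inputs
y = rng.standard_normal(P)                     # placeholder labels

W = rng.standard_normal((D, N1)) / np.sqrt(D)  # hidden-layer weights
v = rng.standard_normal(N1) / np.sqrt(N1)      # readout weights

def forward(X, W, v):
    h = np.tanh(X @ W)                         # hidden activations (tanh is an assumed choice)
    return h, h @ v / np.sqrt(N1)              # network output

for _ in range(steps):
    h, f = forward(X, W, v)
    err = f - y                                # residuals on the training set
    # Gradients of 0.5*||f - y||^2 + 0.5*lam1*||W||^2 + 0.5*lam0*||v||^2
    grad_v = h.T @ err / np.sqrt(N1) + lam0 * v
    grad_h = np.outer(err, v) / np.sqrt(N1) * (1 - h**2)
    grad_W = X.T @ grad_h + lam1 * W
    # Euler-Maruyama step: gradient drift plus thermal noise at temperature T
    W += -dt * grad_W + np.sqrt(2 * T * dt) * rng.standard_normal(W.shape)
    v += -dt * grad_v + np.sqrt(2 * T * dt) * rng.standard_normal(v.shape)

_, f_train = forward(X, W, v)
print("train MSE:", np.mean((f_train - y) ** 2))
```

At long times this dynamics samples the Gibbs posterior at temperature T, which is why the generalization error measured in such experiments can be compared directly with the Bayesian effective-action prediction.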
