Magnitude and angle dynamics in training single ReLU neurons.

Neural Netw

Kim Jaechul Graduate School of AI, KAIST, Daejeon, Republic of Korea.

Published: October 2024

Understanding the training dynamics of deep ReLU networks is a significant area of interest in deep learning. However, the dynamics of the weight vector are not yet fully understood, even for single ReLU neurons. To bridge this gap, our study examines the gradient flow w(t) of a single ReLU neuron trained under the square loss, decomposing it into its magnitude ‖w(t)‖ and its angle φ(t). Using this decomposition, we establish upper and lower bounds on both components, which characterize how the neuron converges. Furthermore, we show empirically that our findings extend to general two-layer multi-neuron networks. All theoretical results are generalized to gradient descent and verified through experiments.
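The magnitude and angle decomposition can be illustrated with a minimal numerical sketch. The abstract does not specify the data model or the reference direction for φ(t), so everything here is an assumption for illustration: a teacher-student setup with a hypothetical target vector `v`, standard Gaussian inputs, φ taken as the angle between `w` and `v`, and plain gradient descent standing in for the gradient flow. It is not the authors' experimental code.

```python
import numpy as np

def train_single_relu(w0, v, X, lr=0.05, steps=500):
    """Gradient descent on L(w) = 1/(2n) * sum_i (relu(x_i.w) - relu(x_i.v))^2.

    Returns the final weights plus the per-step magnitude ||w|| and the
    angle between w and the (assumed) teacher vector v.
    """
    w = np.asarray(w0, dtype=float).copy()
    y = np.maximum(X @ v, 0.0)              # teacher outputs (assumed setup)
    norms, angles = [], []
    for _ in range(steps):
        pre = X @ w                         # pre-activations
        resid = np.maximum(pre, 0.0) - y    # prediction error
        # ReLU derivative is 1 where pre > 0, and taken as 0 elsewhere
        grad = X.T @ (resid * (pre > 0)) / len(X)
        w -= lr * grad
        norm_w = np.linalg.norm(w)
        cos = (w @ v) / (norm_w * np.linalg.norm(v))
        norms.append(norm_w)
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return w, np.array(norms), np.array(angles)

rng = np.random.default_rng(0)
d = 5
X = rng.standard_normal((2000, d))          # Gaussian inputs (assumption)
v = np.zeros(d)
v[0] = 1.0                                  # unit-norm teacher direction
w0 = np.zeros(d)
w0[:2] = 0.1                                # small init at 45 degrees from v
w, norms, angles = train_single_relu(w0, v, X)
```

Plotting `norms` and `angles` against the step index shows the two components separately, which is the view the decomposition is meant to give. The step size, sample count, and initialization are arbitrary choices for this sketch.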

Source
http://dx.doi.org/10.1016/j.neunet.2024.106435
