Publications by authors named "Julien Herrmann"

Deep learning training memory needs can prevent the user from considering large models and large batch sizes. In this work, we propose to use techniques from memory-aware scheduling and automatic differentiation (AD) to execute a backpropagation graph with a bounded memory requirement at the cost of extra recomputations. The case of a single homogeneous chain, i.

View Article and Find Full Text PDF