Publications by authors named "Sreenivas Subramoney"

In-memory computing techniques are used to accelerate artificial neural network (ANN) training and inference tasks. Memory technology and architectural innovations allow efficient matrix-vector multiplications, gradient calculations, and updates to network weights. However, on-chip learning for edge devices is quite challenging due to the frequent updates.

View Article and Find Full Text PDF

CPU is a powerful, pervasive, and indispensable platform for running deep learning (DL) workloads in systems ranging from mobile to extreme-end servers. In this article, we present a survey of techniques for optimizing DL applications on CPUs. We include the methods proposed for both inference and training and those offered in the context of mobile, desktop/server, and distributed systems.

View Article and Find Full Text PDF