An FPGA Implementation of Deep Spiking Neural Networks for Low-Power and Fast Classification.

Neural Computation

College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China, and College of Computer Science, Sichuan University, Chengdu 610065, China

Published: January 2020

AI Article Synopsis

  • A spiking neural network (SNN) processes information through spikes but faces challenges in training due to the non-differentiable nature of these signals; recent developments allow for converting convolutional neural networks (CNNs) into more effective SNNs.
  • A proposed hardware architecture maps all network layers onto a single chip, enabling parallel computation for reduced latency and introducing a new spiking max-pooling method to simplify calculations.
  • Experimental results demonstrated that the architecture achieved 98.94% accuracy on the MNIST dataset, operating at 164 frames per second, with significant speed and power efficiency improvements compared to CPU and GPU implementations.
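The spiking max-pooling method mentioned above is not detailed in this synopsis. A common way to simplify pooling over binary spike maps is to note that the max over {0, 1} values reduces to a logical OR: the pooled unit fires whenever any unit in its window fires. The sketch below illustrates that OR-gate approximation (an assumption, not necessarily the paper's exact method); `spiking_max_pool` is a hypothetical helper name.

```python
import numpy as np

def spiking_max_pool(spike_map, pool=2):
    """Max-pooling over a binary spike map at a single time step.

    For binary spikes, max over a window equals a logical OR: the
    pooled unit fires if any unit in its pool x pool window fired.
    (OR-gate approximation; the paper's own method may differ.)
    """
    h, w = spike_map.shape
    # Crop so dimensions divide evenly, then fold windows into new axes.
    cropped = spike_map[:h - h % pool, :w - w % pool]
    windows = cropped.reshape(h // pool, pool, w // pool, pool)
    # Max over each window's rows and columns (axes 1 and 3).
    return windows.max(axis=(1, 3))

spikes = np.array([[1, 0, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 1, 1],
                   [0, 0, 0, 0]], dtype=np.uint8)
pooled = spiking_max_pool(spikes)  # 2x2 output, one bit per window
```

Because each output is a single OR over window bits, this avoids the comparisons a true max over membrane potentials would require, which matches the synopsis's claim of simplified calculation.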

Article Abstract

A spiking neural network (SNN) is a biologically plausible model that performs information processing based on spikes. Training a deep SNN effectively is challenging due to the non-differentiability of spike signals. Recent advances have shown that high-performance SNNs can be obtained by converting convolutional neural networks (CNNs). However, large-scale SNNs are poorly served by conventional architectures due to the dynamic nature of spiking neurons. In this letter, we propose a hardware architecture that enables efficient implementation of SNNs. All layers in the network are mapped onto one chip so that the computations of different time steps can be performed in parallel to reduce latency. We propose a new spiking max-pooling method to reduce computational complexity. In addition, we apply approaches based on shift registers and coarse-grained parallelism to accelerate the convolution operation. We also investigate the effect of different encoding methods on SNN accuracy. Finally, we validate the hardware architecture on the Xilinx Zynq ZCU102. Experimental results on the MNIST data set show that it achieves an accuracy of 98.94% with eight-bit quantized weights. Furthermore, it reaches 164 frames per second (FPS) at a 150 MHz clock frequency, obtaining a 41-times speed-up over a CPU implementation and 22 times lower power than a GPU implementation.
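The abstract notes that the choice of input encoding affects SNN accuracy. A standard baseline for converting static images such as MNIST into spike trains is rate (Poisson/Bernoulli) encoding, where each pixel fires at each time step with probability proportional to its intensity. The sketch below shows that baseline under the assumption of intensities normalized to [0, 1]; the function name `poisson_encode` is illustrative, not from the paper.

```python
import numpy as np

def poisson_encode(image, num_steps, rng=None):
    """Encode pixel intensities in [0, 1] as Bernoulli spike trains.

    At each time step a pixel fires with probability equal to its
    intensity, so brighter pixels produce higher spike rates.
    Returns an array of shape (num_steps, *image.shape) with 0/1 values.
    """
    rng = np.random.default_rng() if rng is None else rng
    return (rng.random((num_steps,) + image.shape) < image).astype(np.uint8)

# A pixel of intensity 1.0 fires every step; intensity 0.0 never fires.
spikes = poisson_encode(np.array([0.0, 0.5, 1.0]), num_steps=100)
```

Rate encoding trades latency for fidelity: more time steps give a better estimate of each intensity, which is one reason encoding choice interacts with both accuracy and frames-per-second throughput.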


Source
http://dx.doi.org/10.1162/neco_a_01245

Publication Analysis

Top Keywords

spiking neural — 8
neural networks — 8
hardware architecture — 8
fpga implementation — 4
implementation deep — 4
spiking — 4
deep spiking — 4
networks low-power — 4
low-power fast — 4
fast classification — 4

