The rise of large-scale Transformers has led to challenges regarding computational costs and energy consumption. In this context, spiking neural networks (SNNs) offer potential solutions due to their energy efficiency and processing speed. However, the inaccuracy of surrogate gradients and feature space quantization pose challenges for directly training deep SNN Transformers. To tackle these challenges, we propose a method (called LDD) to align ANN and SNN features across different abstraction levels in a Transformer network. LDD incorporates structured feature knowledge from ANNs to guide SNN training, ensuring the preservation of crucial information and addressing inaccuracies in surrogate gradients through designing layer-wise distillation losses. The proposed approach outperforms existing methods on the CIFAR10 (96.1%), CIFAR100 (82.3%), and ImageNet (80.9%) datasets, and enables training of the deepest SNN Transformer network using ImageNet.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11274868PMC
http://dx.doi.org/10.3390/biomimetics9070413DOI Listing

Publication Analysis

Top Keywords

training deep
8
spiking neural
8
neural network
8
surrogate gradients
8
transformer network
8
ldd high-precision
4
training
4
high-precision training
4
deep spiking
4
network
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!