A Hardware-Friendly Low-Bit Power-of-Two Quantization Method for CNNs and Its FPGA Implementation.

Xuefu Sui Qunbo Lv Yang Bai Baoyu Zhu Liangjie Zhi Yuanbo Yang Zheng Tan

Sensors (Basel)

Aerospace Information Research Institute, Chinese Academy of Sciences, No. 9 Dengzhuang South Road, Haidian District, Beijing 100094, China.

Published: September 2022

To address the problems of convolutional neural networks (CNNs) consuming more hardware resources (such as DSPs and RAMs on FPGAs) and their accuracy, efficiency, and resources being difficult to balance, meaning they cannot meet the requirements of industrial applications, we proposed an innovative low-bit power-of-two quantization method: the global sign-based network quantization (GSNQ). This method involves designing different quantization ranges according to the sign of the weights, which can provide a larger quantization-value range. Combined with the fine-grained and multi-scale global retraining method proposed in this paper, the accuracy loss of low-bit quantization can be effectively reduced. We also proposed a novel convolutional algorithm using shift operations to replace multiplication to help to deploy the GSNQ quantized models on FPGAs. Quantization comparison experiments performed on LeNet-5, AlexNet, VGG-Net, ResNet, and GoogLeNet showed that GSNQ has higher accuracy than most existing methods and achieves "lossless" quantization (i.e., the accuracy of the quantized CNN model is higher than the baseline) at low-bit quantization in most cases. FPGA comparison experiments showed that our convolutional algorithm does not occupy on-chip DSPs, and it also has a low comprehensive occupancy in terms of on-chip LUTs and FFs, which can effectively improve the computational parallelism, and this proves that GSNQ has good hardware-adaptation capability. This study provides theoretical and experimental support for the industrial application of CNNs.

Download full-text PDF	Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460272	PMC
http://dx.doi.org/10.3390/s22176618	DOI Listing

Publication Analysis

Top Keywords

low-bit power-of-two

quantization

power-of-two quantization

quantization method

low-bit quantization

convolutional algorithm

comparison experiments

hardware-friendly low-bit

method

method cnns

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!