Design of a 2-Bit Neural Network Quantizer for Laplacian Source.

Entropy (Basel)

Department of Computer Science, Faculty of Science, Technology and Medicine, University of Luxembourg, Avenue de la Fonte 6, L-4364 Esch-sur-Alzette, Luxembourg.

Published: July 2021

Achieving real-time inference is one of the major issues in contemporary neural network applications, as complex algorithms are frequently deployed to mobile devices with constrained storage and computing power. Moving from a full-precision neural network model to a lower-precision representation by applying quantization techniques is a popular approach to mitigating this issue. Here, we analyze in detail and design a 2-bit uniform quantization model for the Laplacian source, chosen for its implementation simplicity, which in turn leads to shorter processing time and faster inference. The results show that the proposed model achieves high classification accuracy (more than 96% for the MLP and more than 98% for the CNN), which is competitive with the performance of other quantization solutions with near-optimal precision.
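As a rough illustration of the kind of quantizer the abstract describes, the sketch below implements a symmetric 2-bit (four-level) uniform quantizer in NumPy: weights are clipped to a support region and mapped to the midpoints of four equal-width cells. The support threshold x_max and the heuristic used to set it are assumptions made here for illustration only; the paper derives its own near-optimal parameters for the Laplacian source.

```python
import numpy as np

def uniform_2bit_quantize(w, x_max=None):
    """Minimal sketch of a symmetric 2-bit (4-level) uniform quantizer.

    x_max is the support-region threshold. If not given, it is set from the
    sample mean absolute value, assuming zero-mean Laplacian-like weights
    (an illustrative heuristic, not the paper's optimized value).
    """
    w = np.asarray(w, dtype=np.float64)
    if x_max is None:
        # For a Laplacian source with scale b, E|w| = b; take the support
        # region as a multiple of that estimate (hypothetical choice).
        x_max = 3.0 * np.mean(np.abs(w))
    n_levels = 4                       # 2 bits -> 4 representation levels
    step = 2.0 * x_max / n_levels      # uniform cell width
    # Clip to the support region, then map each weight to its cell midpoint.
    clipped = np.clip(w, -x_max, x_max - 1e-12)
    idx = np.floor((clipped + x_max) / step)   # cell index 0..3
    return -x_max + (idx + 0.5) * step         # cell midpoint values

# Usage: quantize Laplacian-distributed "weights"
rng = np.random.default_rng(0)
weights = rng.laplace(loc=0.0, scale=1.0, size=10)
print(uniform_2bit_quantize(weights))
```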

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8393619
DOI: http://dx.doi.org/10.3390/e23080933

Publication Analysis

Top Keywords (frequency)

neural network (12), design 2-bit (8), laplacian source (8), 2-bit neural (4), network quantizer (4), quantizer laplacian (4), source achieving (4), achieving real-time (4), real-time inference (4), inference major (4)
