Optimizing Data Flow in Binary Neural Networks.

Sensors (Basel)

Datalogic Labs, Via San Vitalino 12, 40012 Bologna, BO, Italy.

Published: July 2024

Binary neural networks (BNNs) can substantially accelerate a neural network's inference time by substituting its costly floating-point arithmetic with bit-wise operations. Nevertheless, state-of-the-art approaches reduce the efficiency of the data flow in the BNN layers by introducing intermediate conversions from 1 to 16/32 bits. We propose a novel training scheme that increases the parallelism and data flow of the BNN pipeline; specifically, we introduce a clipping block that reduces the data width from 32 bits to 8. Furthermore, we decrease the internal accumulator size of a binary layer, usually kept at 32 bits to prevent data overflow, with no accuracy loss. Moreover, we propose an optimization of the batch normalization layer that reduces latency and simplifies deployment. Finally, we present an optimized implementation of the binary direct convolution for ARM NEON instruction sets. Our experiments show a consistent inference latency speed-up (up to 1.3× and 2.4× compared to two state-of-the-art BNN frameworks) while reaching an accuracy comparable with state-of-the-art approaches on datasets such as CIFAR-10, SVHN, and ImageNet.
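To make the data-flow idea concrete, the sketch below illustrates the two core operations the abstract describes: the XNOR-popcount dot product that replaces floating-point multiply-accumulate in a binary layer, and a clipping step that narrows a wide accumulator to 8 bits. This is a minimal illustrative example, not the paper's implementation; the function names and the NumPy-free encoding of ±1 values as bit masks are assumptions for demonstration.

```python
def binary_dot_xnor(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two n-element {+1, -1} vectors packed as bit masks.

    A set bit encodes +1 and a clear bit encodes -1, so the dot product
    equals 2 * popcount(XNOR(a, w)) - n: matching bits contribute +1,
    mismatching bits contribute -1.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask   # 1 where the two bits agree
    return 2 * bin(xnor).count("1") - n

def clip_to_int8(acc: int) -> int:
    """Clip a wide accumulator into the signed 8-bit range [-128, 127].

    Narrowing the accumulator like this is what lets the pipeline keep
    data at 8 bits instead of widening to 32 bits between binary layers.
    """
    return max(-128, min(127, acc))
```

For example, `binary_dot_xnor(0b1011, 0b1101, 4)` compares the vectors (+1, -1, +1, +1) and (+1, +1, -1, +1): two positions agree and two disagree, so the dot product is 0.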


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11314678
DOI: http://dx.doi.org/10.3390/s24154780
