In this article, we propose a set of transform-based neural network layers as an alternative to the 3 × 3 Conv2D layers in convolutional neural networks (CNNs). The proposed layers can be built on orthogonal transforms such as the discrete cosine transform (DCT) and the Hadamard transform (HT), as well as on the biorthogonal block wavelet transform (BWT). By taking advantage of the convolution theorems, convolutional filtering is performed in the transform domain using elementwise multiplications. Trainable soft-thresholding layers, which remove noise in the transform domain, provide the nonlinearity of the transform-domain layers. Whereas the Conv2D layer is spatial-agnostic and channel-specific, the proposed layers are location-specific and channel-specific. The proposed layers significantly reduce the number of parameters and multiplications while improving the accuracy of regular ResNets on the ImageNet-1K classification task. They can also be inserted, together with a batch normalization (BN) layer, before the global average pooling layer of a conventional ResNet as an additional layer to improve classification accuracy.
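
The pipeline described in the abstract (forward transform, elementwise scaling, soft-thresholding, inverse transform) can be illustrated with a minimal single-channel NumPy sketch. This is not the authors' implementation: the function names `dct_matrix`, `soft_threshold`, and `dct_layer` are hypothetical, the scaling and threshold arrays are fixed here rather than trained, and the multichannel processing of the actual layers is omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C of shape (n, n), so that C @ C.T == I."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)    # rescale the DC row to unit norm
    return c

def soft_threshold(x, t):
    """Shrink every entry toward zero by t; entries with |x| <= t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def dct_layer(x, scale, threshold):
    """Hypothetical single-channel transform-domain layer (illustrative only).

    x, scale, threshold: arrays of shape (H, W). In a trained network,
    scale and threshold would be learnable parameters (assumption).
    """
    ch = dct_matrix(x.shape[0])
    cw = dct_matrix(x.shape[1])
    coeffs = ch @ x @ cw.T                  # forward 2-D DCT
    coeffs = coeffs * scale                 # elementwise multiply in the transform domain
    coeffs = soft_threshold(coeffs, threshold)  # nonlinearity
    return ch.T @ coeffs @ cw               # inverse 2-D DCT

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))

# With unit scaling and zero thresholds the layer reduces to the identity.
identity_out = dct_layer(x, np.ones((8, 8)), np.zeros((8, 8)))

# A positive threshold zeroes out small transform coefficients.
denoised = dct_layer(x, np.ones((8, 8)), np.full((8, 8), 0.5))
```

Because each transform coefficient has its own scaling weight and threshold, the parameters in this sketch are tied to positions in the transform domain rather than shared across spatial locations as in a Conv2D kernel.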

Source
http://dx.doi.org/10.1109/TNNLS.2024.3384316