In image classification with Deep Convolutional Neural Networks (DCNNs), the number of parameters in pointwise convolutions grows rapidly because it is the product of the number of filters and the number of input channels coming from the previous layer. Existing studies have demonstrated that a subnetwork with significantly fewer parameters and floating-point computations can replace pointwise convolutional layers while maintaining learning capacity. In this paper, we propose an improved scheme for reducing the complexity of pointwise convolutions in DCNNs for image classification, based on interleaved grouped filters without divisibility constraints. The proposed scheme uses grouped pointwise convolutions, in which each group processes a fraction of the input channels, and takes the number of channels per group, Ch, as a hyperparameter. Its subnetwork contains two consecutive convolutional layers, K and L, connected by an interleaving layer in the middle and summed at the end. In the previous algorithm, the number of groups of filters and the number of filters per group for layers K and L were determined by exact divisions of the original number of input channels and filters by Ch; if the divisions were not exact, the original layer could not be substituted. Here, we refine that algorithm so that input channels are replicated and groups can have different numbers of filters, which copes with non-exact divisibility. As a result, the proposed scheme reduces the number of floating-point computations by a further 11% and the number of trainable parameters by a further 10% relative to the previous method. We tested our optimization with EfficientNet-B0 as the baseline architecture and ran classification tests on the CIFAR-10, Colorectal Cancer Histology, and Malaria datasets. On these datasets, our optimization saves 76%, 89%, and 91%, respectively, of the trainable parameters of EfficientNet-B0 while keeping its test classification accuracy.
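
The parameter growth described in the abstract, and the saving from grouping, can be illustrated with a small Python sketch. It is not the authors' implementation: the function names are hypothetical, bias terms are ignored, and the mapping "groups = input channels / Ch, filters per group = filters / groups" is an assumption read from the abstract's description of the exact-division case. The two-layer K/L subnetwork and the channel replication used for non-exact divisions are not modeled here.

```python
def pointwise_params(in_channels: int, filters: int) -> int:
    """Weights of a standard 1x1 (pointwise) convolution, bias ignored."""
    return in_channels * filters


def grouped_pointwise_params(in_channels: int, filters: int, ch: int) -> int:
    """Weights of a grouped 1x1 convolution with `ch` input channels per group.

    Assumes exact divisibility (the constraint the abstract says the
    refined scheme removes); raises ValueError otherwise.
    """
    if in_channels % ch or filters % (in_channels // ch):
        raise ValueError("in_channels and filters must divide evenly into groups")
    groups = in_channels // ch
    filters_per_group = filters // groups
    # Each filter only sees the `ch` channels of its own group.
    return groups * filters_per_group * ch


def interleave(channels: list, groups: int) -> list:
    """Reorder channels so consecutive positions come from different groups.

    Assumes len(channels) is divisible by `groups`.
    """
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    print(pointwise_params(256, 512))              # 131072
    print(grouped_pointwise_params(256, 512, 16))  # 8192
    print(interleave(list(range(8)), 2))           # [0, 4, 1, 5, 2, 6, 3, 7]
```

With 256 input channels, 512 filters, and Ch = 16 (illustrative values, not taken from the paper), a single grouped layer holds 16 times fewer weights than the standard pointwise layer. The interleaving step shows why a second grouped layer can mix information across the groups of the first.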

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9497893
DOI: http://dx.doi.org/10.3390/e24091264

Similar Publications

MDWConv: CNN based on multi-scale atrous pyramid and depthwise separable convolution for long time series forecasting.

Neural Netw

January 2025

School of Software, Shandong University, Jinan 250101, China; Shandong Provincial Laboratory of Future Intelligence and Financial Engineering, Yantai 264005, China. Electronic address:

Long time series forecasting has extensive applications in various fields such as power dispatching, traffic control, and weather forecasting. Recently, models based on the Transformer architecture have dominated the field of time series forecasting. However, these methods lack the ability to handle the correlation of multi-scale information and the interaction of information between variables in model design.

An efficient deep learning system for automatic detection of Acute Lymphoblastic Leukemia.

ISA Trans

January 2025

Department of Electronics and Telecommunication, C. V. Raman Global University, Bhubaneswar 752054, Odisha, India. Electronic address:

Early and highly accurate detection of rapidly damaging, deadly diseases such as Acute Lymphoblastic Leukemia (ALL) is essential for providing appropriate treatment and saving lives. Recent developments in deep learning, particularly transfer learning, have become a preferred line of research in medical image processing because of their strong performance, even with small datasets. This motivates us to develop a novel deep learning-based leukemia detection system in which an efficient and lightweight MobileNetV2 is used in conjunction with ShuffleNet to boost discrimination ability and enhance the receptive field via convolution layer succession.

Nasalance is a valuable clinical biomarker for hypernasality. It is computed as the ratio of acoustic energy emitted through the nose to the total energy emitted through the mouth and nose (eNasalance). A new approach is proposed to compute nasalance using Convolutional Neural Networks (CNNs) trained with Mel-Frequency Cepstrum Coefficients (mfccNasalance).

Gait recognition is a behavioral biometric technique that identifies individuals based on their unique walking patterns, enabling long-distance identification. Traditional gait recognition methods rely on appearance-based approaches that utilize background-subtracted silhouette sequences to extract gait features. While effective and easy to compute, these methods are susceptible to variations in clothing, carried objects, and illumination changes, compromising the extraction of discriminative features in real-world applications.

Background: Dental disorders are one of the most important health problems, affecting billions of people all over the world. Early diagnosis is important for effective treatment planning. Precise dental disease segmentation requires reliable tooth numbering, which may be prone to errors if performed manually.
