Enhanced mechanisms of pooling and channel attention for deep learning feature maps.

PeerJ Comput Sci

College of Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan.

Published: November 2022

Pooling is a vital operation in deep neural networks (DNNs). It generalizes the representation of feature maps and progressively reduces their spatial size, which lowers the network's computational cost. Pooling is also the basis of attention mechanisms in computer vision. However, pooling is a down-sampling operation: it makes the feature-map representation approximately invariant to small translations by replacing adjacent pixels with a summary statistic, and therefore inevitably loses some information. In this article, we propose a fused max-average pooling (FMAPooling) operation and an improved channel attention mechanism (FMAttn), both of which combine the two pooling functions to enhance feature representation in DNNs. In essence, the methods enhance the multi-level features extracted by max pooling and average pooling, respectively. The effectiveness of the proposals is verified with VGG, ResNet, and MobileNetV2 architectures on CIFAR10/100 and ImageNet100. In these experiments, FMAPooling improves accuracy by up to 1.63% over the baseline model, and FMAttn improves accuracy by up to 2.21% over the previous channel attention mechanism. The proposals are also extensible: they can be embedded easily into various DNN models or replace certain DNN structures. The computational overhead they introduce is negligible.
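To make the two pooling functions concrete, the following is a minimal pure-Python sketch of 2x2 max pooling and average pooling on a single feature map, plus one possible way to fuse them. The abstract does not state how FMAPooling combines the two outputs, so the convex combination and its weight `alpha` are assumptions made only for illustration, and the helper names (`pool2d`, `fused_pool2d`) are hypothetical rather than taken from the paper.

```python
from typing import Callable, List

Matrix = List[List[float]]


def pool2d(x: Matrix, reduce: Callable[[List[float]], float],
           size: int = 2, stride: int = 2) -> Matrix:
    """Slide a size x size window over x and summarise each window with `reduce`."""
    rows, cols = len(x), len(x[0])
    out = []
    for i in range(0, rows - size + 1, stride):
        out_row = []
        for j in range(0, cols - size + 1, stride):
            # Collect the adjacent pixels covered by the current window.
            window = [x[i + di][j + dj] for di in range(size) for dj in range(size)]
            out_row.append(reduce(window))
        out.append(out_row)
    return out


def max_pool2d(x: Matrix, size: int = 2, stride: int = 2) -> Matrix:
    """Keep the strongest activation in each window."""
    return pool2d(x, max, size, stride)


def avg_pool2d(x: Matrix, size: int = 2, stride: int = 2) -> Matrix:
    """Keep the mean activation in each window."""
    return pool2d(x, lambda w: sum(w) / len(w), size, stride)


def fused_pool2d(x: Matrix, alpha: float = 0.5,
                 size: int = 2, stride: int = 2) -> Matrix:
    """ASSUMPTION: fuse the two poolings as alpha * max + (1 - alpha) * avg.

    This is an illustrative stand-in, not the fusion rule from the paper.
    """
    mx = max_pool2d(x, size, stride)
    av = avg_pool2d(x, size, stride)
    return [[alpha * m + (1 - alpha) * a for m, a in zip(m_row, a_row)]
            for m_row, a_row in zip(mx, av)]


feature_map = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 8.0],
    [0.0, 0.0, 5.0, 5.0],
    [0.0, 4.0, 5.0, 5.0],
]

print(max_pool2d(feature_map))    # [[4.0, 8.0], [4.0, 5.0]]
print(avg_pool2d(feature_map))    # [[2.5, 2.0], [1.0, 5.0]]
print(fused_pool2d(feature_map))  # [[3.25, 5.0], [2.5, 5.0]]
```

In the top-right window, max pooling keeps only the single strong activation (8.0) while average pooling dilutes it to 2.0. Each summary discards information that the other retains, which is the kind of loss the abstract says combining the two poolings is meant to reduce.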

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9748832
DOI: http://dx.doi.org/10.7717/peerj-cs.1161


