Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency.

Comput Intell Neurosci

School of Information and Electronic Engineering, Zhejiang University of Science & Technology, Hangzhou 310023, China.

Published: July 2021

Visual saliency prediction for RGB-D images is more challenging than for RGB images alone, and comparatively few studies have addressed it. This study presents a hierarchical multimodal adaptive fusion (HMAF) network for end-to-end RGB-D saliency prediction. First, hierarchical (multilevel) multimodal features are extracted from the RGB image and the depth map with a VGG-16-based two-stream network. Next, three two-input attention modules select the most significant hierarchical features from the RGB and depth streams and fuse them at each level. Finally, a three-input attention module adaptively fuses these level-wise fusion saliency features to produce a high-accuracy RGB-D saliency map. Comparisons with state-of-the-art techniques on two challenging RGB-D datasets show that the proposed HMAF method consistently outperforms competing approaches by a considerable margin.
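The core idea of attention-based adaptive fusion can be illustrated with a minimal NumPy sketch. This is not the paper's learned attention module: the per-modality score used here (global average activation) is a hypothetical stand-in for the network's trained attention branch, and the function names are illustrative only. It shows only the mechanism of converting per-modality scores into softmax weights and computing a weighted sum of feature maps.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fusion(features):
    """Fuse N modality feature maps (e.g., RGB and depth) adaptively.

    features: list of N arrays, each of shape (H, W).
    A scalar score per map -- here the global average activation,
    a hypothetical stand-in for a learned attention branch -- is
    converted into softmax weights, and the maps are combined as a
    weighted sum. Returns the fused (H, W) map and the N weights.
    """
    stack = np.stack(features)               # (N, H, W)
    scores = stack.mean(axis=(1, 2))         # (N,) crude per-modality score
    weights = softmax(scores)                # (N,) attention weights, sum to 1
    fused = (weights[:, None, None] * stack).sum(axis=0)  # (H, W)
    return fused, weights
```

In the HMAF network the scores would instead come from trained layers, and the same pattern extends from the two-input (RGB + depth) modules to the three-input module that fuses the hierarchical fusion saliency features.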


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7700038
DOI: http://dx.doi.org/10.1155/2020/8841681

Publication Analysis

Top Keywords

adaptive fusion — 12
prediction rgb-d — 12
hierarchical multimodal — 8
multimodal adaptive — 8
fusion hmaf — 8
hmaf network — 8
rgb-d saliency — 8
visual saliency — 8
saliency prediction — 8
proposed method — 8
