RGB-D saliency detection aims to accurately localize salient regions using the complementary information of a depth map. Global contexts carried by deeper layers are key to salient object detection, but they are progressively diluted when transferred to shallower layers. In addition, depth maps may contain misleading information owing to the limitations of depth sensors. To tackle these issues, in this paper, we propose a new cross-modal cross-scale network for RGB-D salient object detection, in which global context information provides global guidance to boost performance in complex scenarios. First, we introduce a global-guided cross-modal and cross-scale module, named GCMCSM, to realize globally guided cross-modal, cross-scale feature fusion. Then, we employ feature refinement modules for progressive refinement in a coarse-to-fine manner. In addition, we adopt a hybrid loss function to supervise the training of GCMCSNet across multiple scales. With all these modules working together, GCMCSNet effectively enhances both salient object details and salient object localization. Extensive experiments on challenging benchmark datasets demonstrate that our GCMCSNet outperforms existing state-of-the-art methods.
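The abstract does not spell out the hybrid loss or the multi-scale supervision scheme, but salient object detection networks of this kind commonly combine a pixel-wise term with a region-level term, applied to side outputs at several scales. The PyTorch sketch below is a minimal illustration under that assumption (a BCE + IoU combination with hypothetical side outputs); the actual GCMCSNet loss components and per-scale weighting may differ.

```python
import torch
import torch.nn.functional as F

def iou_loss(pred_logits, target, eps=1e-6):
    """Soft IoU loss on sigmoid probabilities (region-level term)."""
    pred = torch.sigmoid(pred_logits)
    inter = (pred * target).sum(dim=(2, 3))
    union = (pred + target - pred * target).sum(dim=(2, 3))
    return (1.0 - (inter + eps) / (union + eps)).mean()

def hybrid_multiscale_loss(side_outputs, target):
    """Hybrid (BCE + IoU) loss summed over side outputs at different scales.

    side_outputs: list of logit tensors, each of shape (N, 1, h_i, w_i)
    target: ground-truth saliency map of shape (N, 1, H, W), values in [0, 1]
    """
    total = 0.0
    for logits in side_outputs:
        # Resize the ground truth to match this scale's prediction.
        gt = F.interpolate(target, size=logits.shape[2:],
                           mode="bilinear", align_corners=False)
        total = total + F.binary_cross_entropy_with_logits(logits, gt) \
                      + iou_loss(logits, gt)
    return total
```

Summing a pixel-wise term and a region-level term at every scale is one plausible way to supervise both salient object details (fine scales) and localization (coarse scales), matching the coarse-to-fine refinement the abstract describes.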
DOI: http://dx.doi.org/10.3390/s23167221
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10459329