Depth information has been demonstrated to be useful for saliency detection. However, existing methods for RGBD saliency detection mainly focus on designing straightforward and comprehensive models, while ignoring the transferability of existing RGB saliency detection models. In this article, we propose a novel depth-guided transformation model (DTM) that goes from RGB saliency to RGBD saliency. The proposed model comprises three components: 1) multilevel RGBD saliency initialization; 2) depth-guided saliency refinement; and 3) saliency optimization with depth constraints. The explicit depth feature is first utilized in the multilevel RGBD saliency model to initialize the RGBD saliency by combining the global compactness saliency cue and the local geodesic saliency cue. Depth-guided saliency refinement then further highlights the salient objects and suppresses background regions by introducing prior depth domain knowledge and the refined depth shape prior. Benefiting from the consistency of the entire object in the depth map, we formulate an optimization model that attains more consistent and accurate saliency results via an energy function integrating a unary data term, a color smoothness term, and a depth consistency term. Experiments on three public RGBD saliency detection benchmarks demonstrate the effectiveness of the proposed DTM and its performance improvement from RGB to RGBD saliency.
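The abstract does not give the exact form of this energy function. Below is a minimal sketch of one plausible formulation, assuming superpixel-level saliency values $s_i$, initial saliencies $s_i^{0}$ from the first stage, a neighborhood system $\mathcal{N}$, and Gaussian color/depth affinities; all notation here is an assumption for illustration, not taken from the paper.

```latex
% unary data term        : stay close to the initial RGBD saliency s^0
% color smoothness term  : neighbors with similar color get similar saliency
% depth consistency term : neighbors at similar depth get similar saliency
E(s) = \sum_{i} \bigl(s_i - s_i^{0}\bigr)^2
     + \lambda_c \sum_{(i,j) \in \mathcal{N}} w^{c}_{ij}\, \bigl(s_i - s_j\bigr)^2
     + \lambda_d \sum_{(i,j) \in \mathcal{N}} w^{d}_{ij}\, \bigl(s_i - s_j\bigr)^2,
\qquad
w^{c}_{ij} = e^{-\lVert c_i - c_j \rVert^2 / 2\sigma_c^2}, \quad
w^{d}_{ij} = e^{-(d_i - d_j)^2 / 2\sigma_d^2}.
```

Because every term in such a formulation is quadratic in $s$, the minimizer can be obtained in closed form by solving a sparse linear system, which keeps the optimization stage inexpensive.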
DOI: http://dx.doi.org/10.1109/TCYB.2019.2932005
Sensors (Basel)
November 2024
School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China.
To address the low recognition rates of traditional depth-information-based action recognition algorithms, an action recognition approach was developed based on skeleton spatial-temporal and dynamic features combined with a two-stream convolutional neural network (TS-CNN). First, the skeleton's three-dimensional coordinate system was transformed to obtain coordinate information about relative joint positions. This relative joint information was then encoded as a color texture map to construct the spatial-temporal feature descriptor of the skeleton.
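As an illustration of this encoding step, the following is a minimal Python/NumPy sketch, assuming a skeleton sequence of shape (frames, joints, 3). The root-joint choice, the per-axis min-max normalization, and the layout (joints as rows, frames as columns, x/y/z as R/G/B) are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def skeleton_to_texture(seq: np.ndarray, root: int = 0) -> np.ndarray:
    """Encode a skeleton sequence (frames, joints, 3) as a color texture map.

    Each joint is expressed relative to a root joint, then each coordinate
    axis is min-max normalized to [0, 255] and stored in one color channel,
    giving an image of shape (joints, frames, 3).
    """
    rel = seq - seq[:, root:root + 1, :]          # relative joint positions
    lo = rel.min(axis=(0, 1), keepdims=True)      # per-axis minimum
    hi = rel.max(axis=(0, 1), keepdims=True)      # per-axis maximum
    norm = (rel - lo) / (hi - lo + 1e-8)          # scale to [0, 1]
    texture = (norm * 255).astype(np.uint8)       # quantize to 8-bit color
    return texture.transpose(1, 0, 2)             # (joints, frames, 3)

# Example: 40-frame sequence of a 25-joint skeleton.
seq = np.random.rand(40, 25, 3).astype(np.float32)
print(skeleton_to_texture(seq).shape)             # (25, 40, 3)
```

A map built this way can be fed to the appearance stream of a two-stream CNN as an ordinary color image.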
Toward desirable saliency prediction, the types and numbers of inputs for a salient object detection (SOD) algorithm may change dynamically in many real-life applications. However, existing SOD algorithms are mainly designed or trained for one particular type of input, failing to generalize to other types. Consequently, multiple types of SOD algorithms need to be prepared in advance to handle different types of inputs, incurring huge hardware and research costs.
Sensors (Basel)
February 2024
School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China.
Salient Object Detection (SOD) in RGB-D images plays a crucial role in computer vision; its central aim is to identify and segment the most visually striking objects within a scene. However, optimizing the fusion of multi-modal and multi-scale features to enhance detection performance remains a challenge. To address this issue, we propose a network model based on semantic localization and multi-scale fusion (SLMSF-Net), designed specifically for RGB-D SOD.
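The abstract does not spell out the fusion mechanism. The PyTorch sketch below shows one generic way to fuse RGB and depth features at several scales: channel attention gates the depth branch, the gated features are added to the RGB features, and a top-down pass merges scales. The module names and the fusion scheme are illustrative assumptions, not the actual SLMSF-Net design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseBlock(nn.Module):
    """Fuse same-scale RGB and depth features (illustrative, not SLMSF-Net)."""
    def __init__(self, channels: int):
        super().__init__()
        # Channel attention re-weights the depth feature before fusion.
        self.att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        fused = rgb + depth * self.att(depth)   # attention-gated addition
        return F.relu(self.conv(fused))

def top_down_decode(feats):
    """Merge coarse-to-fine fused features by upsampling and adding."""
    out = feats[-1]                             # coarsest scale
    for f in reversed(feats[:-1]):
        out = f + F.interpolate(out, size=f.shape[2:], mode="bilinear",
                                align_corners=False)
    return out

# Example: three scales with matching channel widths.
fuse = FuseBlock(64)
rgbs = [torch.randn(1, 64, s, s) for s in (64, 32, 16)]
deps = [torch.randn(1, 64, s, s) for s in (64, 32, 16)]
fused = [fuse(r, d) for r, d in zip(rgbs, deps)]
saliency = torch.sigmoid(top_down_decode(fused).mean(1, keepdim=True))
print(saliency.shape)                           # torch.Size([1, 1, 64, 64])
```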
IEEE Trans Neural Netw Learn Syst
February 2024
RGB-D salient object detection (SOD) has gained tremendous attention in recent years. In particular, transformers have been employed and have shown great potential. However, existing transformer models usually overlook vital edge information, which is a major issue restricting further improvement of SOD accuracy.
Neural Netw
February 2024
Department of Computing, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China.
Image Salient Object Detection (SOD) is a fundamental research topic in computer vision. Recently, multimodal information from the RGB, Depth (D), and Thermal (T) modalities has been proven beneficial to SOD. However, existing methods are designed only for RGB-D or RGB-T SOD, which may limit their use across modalities, or are merely fine-tuned on specific datasets, which may bring extra computation overhead.