Although 3D visual saliency, which aims to predict the regional importance of 3D surfaces in agreement with human visual perception, has been well researched in computer vision and graphics, recent eye-tracking experiments show that state-of-the-art 3D visual saliency methods remain poor at predicting human fixations. Cues emerging prominently from these experiments suggest that 3D visual saliency might be associated with 2D image saliency. This paper proposes a framework that combines a Generative Adversarial Network and a Conditional Random Field to learn the visual saliency of both a single 3D object and a scene composed of multiple 3D objects, using image saliency ground truth, in order to 1) investigate whether 3D visual saliency is an independent perceptual measure or just a derivative of image saliency and 2) provide a weakly supervised method for more accurately predicting 3D visual saliency. Through extensive experiments, we not only demonstrate that our method significantly outperforms state-of-the-art approaches, but also answer the question posed in the title of this paper.
DOI: http://dx.doi.org/10.1109/TPAMI.2023.3287356