We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis) comfort felt, when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames are weighted using models of human foveation and Panum's fusional area yielding a single predictor of 3D saliency.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1109/TIP.2014.2303640 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!