Encoder-Driven Inpainting Strategy in Multiview Video Compression.

Yu Gao, Gene Cheung, Thomas Maugey, Pascal Frossard, Jie Liang

IEEE Trans Image Process

Published: January 2016

In free viewpoint video systems, a user has the freedom to select a virtual view from which an image of the 3D scene is rendered, and the scene is commonly represented by color and depth images of multiple nearby viewpoints. In such a representation, there exists data redundancy across multiple dimensions: 1) a 3D voxel may be represented by pixels in multiple viewpoint images (inter-view redundancy); 2) a pixel patch may recur in a distant spatial region of the same image due to self-similarity (inter-patch redundancy); and 3) pixels in a local spatial region tend to be similar (inter-pixel redundancy). It is important to exploit these redundancies during inter-view prediction toward effective multiview video compression. In this paper, we propose an encoder-driven inpainting strategy for inter-view predictive coding, where explicit instructions are transmitted minimally, and the decoder is left to independently recover remaining missing data via inpainting, resulting in lower coding overhead. In particular, after pixels in a reference view are projected to a target view via depth-image-based rendering at the decoder, the remaining holes in the target view are filled via an inpainting process in a block-by-block manner. First, blocks are ordered in terms of difficulty-to-inpaint by the decoder. Then, explicit instructions are sent only for the reconstruction of the most difficult blocks. Specifically, the missing pixels are explicitly coded via a graph Fourier transform or a sparsification procedure using the discrete cosine transform, leading to low coding cost. For blocks that are easy to inpaint, the decoder independently completes missing pixels via template-based inpainting. We apply our proposed scheme to frames in a prediction structure defined by JCT-3V where inter-view prediction is dominant, and experimentally we show that our scheme achieves up to 3 dB gain in peak signal-to-noise ratio (PSNR) in reconstructed image quality over a comparable 3D-High Efficiency Video Coding implementation using a fixed 16 × 16 block size.
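
The abstract mentions coding the missing pixels of hard-to-inpaint blocks with a graph Fourier transform. As a rough illustration only, the sketch below computes such a transform over an arbitrarily shaped set of pixels inside a block. It assumes a 4-connected pixel graph with unit edge weights and the combinatorial Laplacian; the abstract does not describe the graph construction actually used, and the names `masked_grid_laplacian`, `gft`, and `igft` are hypothetical.

```python
import numpy as np

def masked_grid_laplacian(mask):
    """Laplacian L = D - W of a 4-connected graph over the True pixels of mask.

    Assumption: unit edge weights between horizontally or vertically
    adjacent pixels that both belong to the mask.
    """
    h, w = mask.shape
    n = int(mask.sum())
    # Map each masked pixel to a node index in row-major order.
    idx = -np.ones(mask.shape, dtype=int)
    idx[mask] = np.arange(n)
    W = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            if not mask[r, c]:
                continue
            # Connect to the right and bottom neighbours if they are in the mask.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < h and cc < w and mask[rr, cc]:
                    i, j = idx[r, c], idx[rr, cc]
                    W[i, j] = W[j, i] = 1.0
    return np.diag(W.sum(axis=1)) - W

def gft(block, mask):
    """Project the masked pixels of block onto the Laplacian eigenvectors."""
    L = masked_grid_laplacian(mask)
    # eigh returns ascending eigenvalues and orthonormal eigenvectors (columns).
    _, U = np.linalg.eigh(L)
    coeffs = U.T @ block[mask]
    return coeffs, U

def igft(coeffs, U, mask):
    """Inverse transform; pixels outside the mask are left at zero."""
    out = np.zeros(mask.shape)
    out[mask] = U @ coeffs
    return out

if __name__ == "__main__":
    # L-shaped region of missing pixels inside a 4 x 4 block.
    mask = np.ones((4, 4), dtype=bool)
    mask[:2, :2] = False
    block = np.full((4, 4), 5.0)
    coeffs, U = gft(block, mask)
    # For a constant signal on a connected graph, only the first
    # (zero-frequency) coefficient is non-negligible.
    print(np.round(np.abs(coeffs), 6))
```

For a smooth signal on the graph, most of the energy falls into the low-frequency coefficients, which is why such a transform can keep the cost of explicitly coded pixels low; a non-rectangular region is handled simply by changing the mask.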

DOI: http://dx.doi.org/10.1109/TIP.2015.2498400

Top Keywords: encoder-driven inpainting; inpainting strategy; multiview video; video compression; spatial region; inter-view prediction; explicit instructions; target view; missing pixels; pixels
