Image matting is a fundamental and challenging problem in computer vision and graphics. Most existing matting methods leverage a user-supplied trimap as an auxiliary input to produce good alpha matte. However, obtaining high-quality trimap itself is arduous. Recently, some hint-free methods have emerged, however, the matting quality is still far behind the trimap-based methods. The main reason is that, some hints for removing semantic ambiguity and improving matting quality are essential. Apparently, there is a trade-off between interaction cost and matting quality. To balance performance and user-friendliness, we propose an improved deep image matting framework which is trimap-free and only needs sparse user click or scribble interaction to minimize the needed auxiliary constraints while still allowing interactivity. Moreover, we introduce uncertainty estimation that predicts which parts need polishing and conduct uncertainty-guided refinement. To trade off runtime against refinement quality, users can also choose different refinement modes. Experimental results show that our method performs better than existing trimap-free methods and comparably to state-of-the-art trimap-based methods with minimal user effort. Finally, we demonstrate the extensibility of our framework to video human matting without any structure modification, by adding optical flow-based sparse hint propagation and temporal consistency regularization imposed on the single frame.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1109/TPAMI.2023.3326693 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!