IEEE Trans Pattern Anal Mach Intell
August 2023
Segmenting highly overlapping image objects is challenging, because there is typically no distinction between real object contours and occlusion boundaries in images. Unlike previous instance segmentation methods, we model image formation as a composition of two overlapping layers, and propose the Bilayer Convolutional Network (BCNet), where the top layer detects occluding objects (occluders) and the bottom layer infers partially occluded instances (occludees). Explicitly modeling the occlusion relationship with a bilayer structure naturally decouples the boundaries of the occluding and occluded instances, and considers the interaction between them during mask regression.
IEEE Trans Pattern Anal Mach Intell
September 2023
In this paper, we present a novel end-to-end group collaborative learning network, termed GCoNet+, which can effectively and efficiently (250 fps) identify co-salient objects in natural scenes. The proposed GCoNet+ achieves the new state-of-the-art performance for co-salient object detection (CoSOD) through mining consensus representations based on the following two essential criteria: 1) intra-group compactness to better formulate the consistency among co-salient objects by capturing their inherent shared attributes using our novel group affinity module (GAM); 2) inter-group separability to effectively suppress the influence of noisy objects on the output by introducing our new group collaborating module (GCM) conditioning on the inconsistent consensus. To further improve the accuracy, we design a series of simple yet effective components as follows: i) a recurrent auxiliary classification module (RACM) promoting model learning at the semantic level; ii) a confidence enhancement module (CEM) assisting the model in improving the quality of the final predictions; and iii) a group-based symmetric triplet (GST) loss guiding the model to learn more discriminative features.
IEEE Trans Pattern Anal Mach Intell
December 2022
Point cloud analysis without pose priors is very challenging in real applications, as the orientations of point clouds are often unknown. In this paper, we propose a brand new point-set learning framework, PRIN (Point-wise Rotation Invariant Network), focusing on rotation-invariant feature extraction in point cloud analysis. We construct spherical signals by Density Aware Adaptive Sampling to deal with distorted point distributions in spherical space.
IEEE Trans Image Process
December 2021
Human pose transfer has become an emerging research topic in recent years. However, state-of-the-art results are still far from satisfactory. One main reason is that these end-to-end methods are often trained blindly, without a semantic understanding of the content.
IEEE Trans Pattern Anal Mach Intell
September 2022
We propose HyFRIS-Net to jointly estimate the hybrid reflectance and illumination models, as well as the refined face shape, from a single unconstrained face image in a pre-defined texture space. The proposed hybrid reflectance and illumination representation ensures photometric face appearance modeling in both parametric and non-parametric spaces for efficient learning. By enforcing a reflectance consistency constraint for the same person and a face identity constraint for different persons, our approach recovers an occlusion-free face albedo whose color is disambiguated from the illumination color.
IEEE Trans Image Process
July 2021
In this paper, we propose a new method to super-resolve low-resolution human body images by learning efficient multi-scale features and exploiting a useful human body prior. Specifically, we propose a lightweight multi-scale block (LMSB) as the basic module of a coherent framework, which contains an image reconstruction branch and a prior estimation branch. In the image reconstruction branch, the LMSB aggregates features of multiple receptive fields so as to gather rich context information for low-to-high resolution mapping.
IEEE Trans Image Process
December 2020
Person re-identification aims to identify whether pairs of images belong to the same person or not. This problem is challenging due to large differences in camera views, lighting and background. One mainstream approach to learning CNN features is to design loss functions that reinforce both class separation and intra-class compactness.
IEEE Trans Pattern Anal Mach Intell
July 2021
We present an algorithm to directly solve numerous image restoration problems (e.g., image deblurring, image dehazing, and image deraining).
IEEE Trans Image Process
March 2019
We propose a deep convolutional neural network (CNN) method for natural image matting. Our method takes multiple initial alpha mattes of the previous methods and normalized RGB color images as inputs, and directly learns an end-to-end mapping between the inputs and reconstructed alpha mattes. Among the various existing methods, we focus on using two simple methods as initial alpha mattes: the closed-form matting and KNN matting.
While conventional calibrated photometric stereo methods assume that light intensities and sensor exposures are known, or unknown but identical across observed images, this assumption easily breaks down in practical settings due to the characteristics of individual light bulbs and limited control over sensors. This paper studies the effect of unknown and possibly non-uniform light intensities and sensor exposures among observed images on shape recovery based on photometric stereo. This leads to the development of a "semi-calibrated" photometric stereo method, where the light directions are known but light intensities (and sensor exposures) are unknown.
IEEE Trans Pattern Anal Mach Intell
February 2019
One of the core applications of light field imaging is depth estimation. To acquire a depth map, existing approaches apply a single photo-consistency measure to an entire light field. However, this is not an optimal choice because of the non-uniform light field degradations produced by limitations in the hardware design.
IEEE Trans Pattern Anal Mach Intell
July 2018
Recent advances in saliency detection have utilized deep learning to obtain high-level features to detect salient regions in scenes. These advances have yielded results superior to those reported in past work, which involved the use of hand-crafted low-level features for saliency detection. In this paper, we propose ELD-Net, a unified deep learning framework for accurate and efficient saliency detection.
IEEE Trans Pattern Anal Mach Intell
February 2018
Rank minimization can be converted into tractable surrogate problems, such as Nuclear Norm Minimization (NNM) and Weighted NNM (WNNM). The problems related to NNM, or WNNM, can be solved iteratively by applying a closed-form proximal operator, called Singular Value Thresholding (SVT), or Weighted SVT, but they suffer from high computational cost of Singular Value Decomposition (SVD) at each iteration. We propose a fast and accurate approximation method for SVT, that we call fast randomized SVT (FRSVT), with which we avoid direct computation of SVD.
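For orientation, the exact SVT operator that the abstract refers to can be sketched in a few lines of NumPy. This is the standard full-SVD version (the costly baseline), not the FRSVT approximation proposed in the paper; the function name `svt`, the threshold `tau`, and the synthetic test matrix are assumptions made for illustration.

```python
import numpy as np

def svt(matrix, tau):
    """Exact singular value thresholding: soft-threshold the singular values by tau.

    This is the proximal operator of tau * (nuclear norm). It needs a full SVD,
    which is the per-iteration cost the abstract describes as expensive.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # shrink and clip at zero
    return (u * s_shrunk) @ vt            # rebuild from the shrunk spectrum

# Hypothetical data: a rank-3 matrix plus small dense noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
noisy = low_rank + 0.01 * rng.standard_normal((50, 40))

denoised = svt(noisy, tau=1.0)
recovered_rank = np.linalg.matrix_rank(denoised, tol=1e-8)
print("rank before:", np.linalg.matrix_rank(noisy), "rank after:", recovered_rank)
```

With a threshold larger than the noise-level singular values, the small singular values are zeroed and only the dominant low-rank part survives.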
IEEE Trans Pattern Anal Mach Intell
August 2017
We propose a robust uncalibrated multiview photometric stereo method for high quality 3D shape reconstruction. In our method, a coarse initial 3D mesh obtained using a multiview stereo method is projected onto a 2D planar domain using a planar mesh parameterization technique. We describe methods for surface normal estimation that work in the parameterized 2D space that jointly incorporates all geometric and photometric cues from multiple viewpoints.
This paper presents an automatic method to extract a multi-view object in a natural environment. We assume that the target object is bounded by the convex volume of interest defined by the overlapping space of camera viewing frustums. There are two key contributions of our approach.
IEEE Trans Pattern Anal Mach Intell
August 2017
In this paper, we introduce an automatic approach to generate trimaps and consistent alpha mattes of foreground objects in a light-field image. Our method first performs binary segmentation to roughly segment a light-field image into foreground and background based on depth and color. Next, we estimate accurate trimaps by analyzing the color distribution along the boundary of the segmentation using a guided image filter and KL-divergence.
IEEE Trans Image Process
April 2016
We present an unconventional image super-resolution algorithm targeting focal stack images. Contrary to previous works, which align multiple images with sub-pixel accuracy for image super-resolution, we analyze the correlation among the differently focused narrow depth-of-field images in a focal stack to infer high-resolution details for image super-resolution. In order to accurately model the defocus kernels at different depths, we use cubic interpolation to parameterize the projection of defocus kernels, and apply the Radon transform to accurately reconstruct the defocus kernels at arbitrary depth.
IEEE Trans Image Process
January 2016
In this paper, we introduce a novel approach to automatically detect salient regions in an image. Our approach consists of global and local features, which complement each other to compute a saliency map. The first key idea of our work is to create a saliency map of an image by using a linear combination of colors in a high-dimensional color space.
IEEE Trans Pattern Anal Mach Intell
June 2015
This paper introduces a new high dynamic range (HDR) imaging algorithm which utilizes rank minimization. Assuming a camera responds linearly to scene radiance, the input low dynamic range (LDR) images captured with different exposure times exhibit a linear dependency and form a rank-1 matrix when the intensities of corresponding pixels are stacked together. In practice, misalignments caused by camera motion, the presence of moving objects, saturation, and image noise break the rank-1 structure of the LDR images.
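The rank-1 observation can be checked numerically with a small sketch. It assumes an ideal linear camera, a synthetic radiance vector, and made-up exposure times, and it models saturation as simple clipping at 1.0; none of these values come from the paper, and the paper's recovery algorithm is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scene: one radiance value per pixel, and four exposure times.
radiance = rng.uniform(0.1, 1.0, size=1000)
exposures = np.array([0.25, 0.5, 1.0, 2.0])

# Linear camera: intensity = radiance * exposure. Each column is one LDR image
# flattened into a vector, so the stack is an outer product.
stack = np.outer(radiance, exposures)
rank_ideal = np.linalg.matrix_rank(stack)

# Saturation (modeled as clipping at 1.0) breaks the exact proportionality
# between columns, so the stack is no longer rank-1.
clipped = np.minimum(stack, 1.0)
rank_clipped = np.linalg.matrix_rank(clipped)

print("rank of ideal stack:", rank_ideal)
print("rank of clipped stack:", rank_clipped)
```

The same effect appears with any corruption that is not a per-image scale factor, which is why the problem is posed as recovering a low-rank matrix from corrupted observations.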
IEEE Trans Pattern Anal Mach Intell
April 2016
Robust Principal Component Analysis (RPCA) via rank minimization is a powerful tool for recovering the underlying low-rank structure of clean data corrupted with sparse noise/outliers. In many low-level vision problems, not only is it known that the underlying structure of the clean data is low-rank, but the exact rank of the clean data is also known. Yet, when applying conventional rank minimization to those problems, the objective function is formulated in a way that does not fully utilize a priori target rank information about the problems.
IEEE Trans Image Process
December 2014
This paper describes an application framework to perform high-quality upsampling and completion on noisy depth maps. Our framework targets a complementary system setup, which consists of a depth camera coupled with an RGB camera. Inspired by a recent work that uses a nonlocal structure regularization, we regularize depth maps in order to maintain fine details and structures.
IEEE Trans Pattern Anal Mach Intell
February 2014
We propose a physically-based approach to separate reflection using multiple polarized images with a background scene captured behind glass. The input consists of three polarized images, each captured from the same view point but with a different polarizer angle separated by 45 degrees. The output is the high-quality separation of the reflection and background layers from each of the input images.
IEEE Trans Pattern Anal Mach Intell
October 2013
This paper investigates the role that nonlinear camera response functions (CRFs) have on image deblurring. We present a comprehensive study to analyze the effects of CRFs on motion deblurring. In particular, we show how nonlinear CRFs can cause a spatially invariant blur to behave as a spatially varying blur.
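A one-dimensional toy example may help show why a nonlinear CRF interferes with the usual convolution model of blur. The sketch assumes a gamma curve as the CRF (a common simplification, not the paper's model), a box blur kernel, and a synthetic step edge; all names and values are illustrative.

```python
import numpy as np

def crf(irradiance, gamma=2.2):
    """Hypothetical camera response: a simple gamma curve."""
    return irradiance ** (1.0 / gamma)

# Synthetic scene: a dark region next to a bright region (a step edge).
signal = np.concatenate([np.full(20, 0.05), np.full(20, 0.9)])
kernel = np.ones(5) / 5.0   # spatially invariant box blur

# What the camera records: blur happens in irradiance, then the CRF is applied.
observed = crf(np.convolve(signal, kernel, mode="valid"))

# What a deblurring model assumes if it ignores the CRF:
# the same kernel applied directly to the CRF-mapped sharp image.
assumed = np.convolve(crf(signal), kernel, mode="valid")

# The mismatch vanishes in flat regions and is concentrated at the edge,
# so a single kernel cannot explain the observed image everywhere.
mismatch = np.abs(observed - assumed)
print("max mismatch in flat region:", mismatch[:15].max())
print("max mismatch across the edge:", mismatch[16:20].max())
```

Because the error depends on local contrast, the effective blur in the recorded intensities differs from place to place even though the physical kernel is the same everywhere.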
In this paper, we deal with the problem of separating the effect of reflection from images captured behind glass. The input consists of multiple polarized images captured from the same viewpoint but with different polarizer angles. The output is a high-quality separation of the reflection layer and the background layer from the images.
IEEE Trans Pattern Anal Mach Intell
November 2011
This paper addresses the problem of matting motion blurred objects from a single image. Existing single image matting methods are designed to extract static objects that have fractional pixel occupancy. This arises because the physical scene object has a finer resolution than the discrete image pixel and therefore only occupies a fraction of the pixel.