Publications by authors named "Bumsub Ham"

We address the problem of person re-identification (reID), that is, retrieving person images from a large dataset, given a query image of the person of interest. A key challenge is to learn person representations robust to intra-class variations, as different persons could have the same attribute, and persons' appearances look different, e.g.

We address the problem of semantic correspondence, that is, establishing a dense flow field between images depicting different instances of the same object or scene category. We propose to use images annotated with binary foreground masks and subjected to synthetic geometric deformations to train a convolutional neural network (CNN) for this task. Using these masks as part of the supervisory signal provides an object-level prior for the semantic correspondence task and offers a good compromise between semantic flow methods, where the amount of training data is limited by the cost of manually selecting point correspondences, and semantic alignment ones, where the regression of a single global geometric transformation between images may be sensitive to image-specific details such as background clutter.

Most variational formulations for structure-texture image decomposition force structure images to have small norm in some functional spaces, and share a common notion of edges, i.e., large gradients or intensity differences.

We present a descriptor, called fully convolutional self-similarity (FCSS), for dense semantic correspondence. Unlike traditional dense correspondence approaches for estimating depth or optical flow, semantic correspondence estimation poses additional challenges due to intra-class appearance and shape variations among different instances within the same object or scene category. To robustly match points across semantically similar images, we formulate FCSS using local self-similarity (LSS), which is inherently insensitive to intra-class appearance variations.

Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category. We introduce a novel approach to semantic flow, dubbed proposal flow, that establishes reliable correspondences using object proposals.

Filtering images using a guidance signal, a process called guided or joint image filtering, has been used in various tasks in computer vision and computational photography, particularly for noise reduction and joint upsampling. This uses an additional guidance signal as a structure prior, and transfers the structure of the guidance signal to an input image, restoring noisy or altered image structure. The main drawbacks of such a data-dependent framework are that it does not consider structural differences between guidance and input images, and that it is not robust to outliers.
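
As a rough illustration of how a guidance signal can transfer its structure to an input, the sketch below implements a plain 1D joint bilateral filter. It is a generic textbook filter and not the method this abstract goes on to propose; the function name `joint_bilateral_1d` and the parameters `radius`, `sigma_s`, and `sigma_r` are assumptions made for this example.

```python
import numpy as np

def joint_bilateral_1d(signal, guide, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Smooth `signal` with weights taken from `guide` (hypothetical helper)."""
    signal = np.asarray(signal, dtype=float)
    guide = np.asarray(guide, dtype=float)
    n = signal.size
    out = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        idx = np.arange(lo, hi)
        # Spatial closeness of each neighbour to pixel i
        w_space = np.exp(-((idx - i) ** 2) / (2.0 * sigma_s ** 2))
        # Similarity measured on the guidance signal, not on the input
        w_range = np.exp(-((guide[idx] - guide[i]) ** 2) / (2.0 * sigma_r ** 2))
        w = w_space * w_range
        # Normalized weighted average; the centre weight is 1, so the sum is > 0
        out[i] = np.sum(w * signal[idx]) / np.sum(w)
    return out
```

Because the weights depend only on the guidance signal, an edge that exists in the guide but not in the input is copied into the output, which is the kind of structural mismatch the abstract names as a drawback.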

Establishing dense correspondences between multiple images is a fundamental task in many applications. However, finding a reliable correspondence between multi-modal or multi-spectral images remains unsolved due to their challenging photometric and geometric variations. In this paper, we propose a novel dense descriptor, called dense adaptive self-correlation (DASC), to estimate dense multi-modal and multi-spectral correspondences.

This paper describes a method for high-quality depth superresolution. The standard formulations of image-guided depth upsampling, using simple joint filtering or quadratic optimization, lead to texture copying and depth bleeding artifacts. These artifacts are caused by the inherent discrepancy between structures in data from different sensors.

Inferring scene depth from a single monocular image is a highly ill-posed problem in computer vision. This paper presents a new gradient-domain approach, called depth analogy, that makes use of analogy as a means for synthesizing a target depth field, when a collection of RGB-D image pairs is given as training data. Specifically, the proposed method employs a non-parametric learning process that creates an analogous depth field by sampling reliable depth gradients using visual correspondence established on training image pairs.

This paper presents a texture flow estimation method that uses appearance-space clustering and a correspondence search in the space of deformed exemplars. To estimate the underlying texture flow, such as scale, orientation, and texture label, most existing approaches require a certain amount of user interaction. Strict assumptions on the geometric model further limit flow estimation to near-regular textures such as gradient-like patterns.

This paper presents a depth superresolution (SR) method that uses both a low-resolution (LR) depth image and a high-resolution (HR) intensity image. We formulate depth SR as a graph-based transduction problem. In particular, the HR intensity image is represented as an undirected graph, in which pixels are characterized as vertices, and their relations are encoded as an affinity function.
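
To make the graph view concrete, the following sketch propagates sparse depth samples along a 1D chain graph whose edge affinities come from intensity differences. It is a generic regularized interpolation, assumed here for illustration only, and is not claimed to be the paper's formulation; `propagate_depth`, `lam`, `sigma`, and the Gaussian affinity are all hypothetical choices.

```python
import numpy as np

def propagate_depth(intensity, depth_samples, known, lam=1.0, sigma=0.1):
    """Fill in depth on a chain graph built from an intensity signal.

    Minimizes sum_i m_i (u_i - y_i)^2 + lam * sum_i w_i (u_i - u_{i+1})^2,
    where m marks pixels with a depth sample (assumed model).
    """
    intensity = np.asarray(intensity, dtype=float)
    y = np.asarray(depth_samples, dtype=float)
    m = np.asarray(known, dtype=float)
    n = intensity.size
    # Affinity between neighbouring pixels: high when intensities agree
    w = np.exp(-(np.diff(intensity) / sigma) ** 2)
    # Graph Laplacian of the weighted chain
    L = np.zeros((n, n))
    for i, wi in enumerate(w):
        L[i, i] += wi
        L[i + 1, i + 1] += wi
        L[i, i + 1] -= wi
        L[i + 1, i] -= wi
    # Normal equations of the quadratic energy: (M + lam * L) u = M y
    A = np.diag(m) + lam * L
    return np.linalg.solve(A, m * y)
```

Each connected region of the graph needs at least one depth sample, otherwise the linear system in this sketch is singular.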

This paper presents an efficient technique for performing spatially inhomogeneous edge-preserving image smoothing, called fast global smoother. Focusing on sparse Laplacian matrices consisting of a data term and a prior term (typically defined using four or eight neighbors for a 2D image), our approach efficiently solves such global objective functions. In particular, we approximate the solution of the memory- and computation-intensive large linear system, defined over a d-dimensional spatial domain, by solving a sequence of 1D subsystems.
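
A single 1D subsystem of this kind is tridiagonal and can be solved in linear time. The sketch below shows one such solve with the Thomas algorithm, assuming a quadratic data term plus a weighted smoothness term between neighbors; the function name `smooth_1d`, the exponential edge weight, and the parameters `lam` and `sigma` are assumptions for this example, and the 2D scheme of alternating row and column passes is not reproduced here.

```python
import numpy as np

def smooth_1d(f, guide, lam=10.0, sigma=0.1):
    """Solve (I + lam * L) u = f on a 1D chain, L built from `guide` (sketch)."""
    f = np.asarray(f, dtype=float)
    guide = np.asarray(guide, dtype=float)
    n = f.size
    # Edge-aware weight between pixel i and i+1 (assumed exponential form)
    w = np.exp(-np.abs(np.diff(guide)) / sigma)
    # Three diagonals of the system matrix
    lower = np.zeros(n)
    upper = np.zeros(n)
    lower[1:] = -lam * w
    upper[:-1] = -lam * w
    diag = 1.0 - lower - upper
    # Thomas algorithm: forward elimination
    c = np.zeros(n)
    d = np.zeros(n)
    c[0] = upper[0] / diag[0]
    d[0] = f[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (f[i] - lower[i] * d[i - 1]) / denom
    # Back substitution
    u = np.empty(n)
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u
```

The matrix is strictly diagonally dominant, so the elimination needs no pivoting, and the cost per 1D signal is linear in its length.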

In this paper, a probability-based rendering (PBR) method is described for reconstructing an intermediate view with a steady-state matching probability (SSMP) density function. Conventionally, given multiple reference images, the intermediate view is synthesized via the depth image-based rendering technique in which geometric information (e.g.

In this paper, the origin of random walk with restart (RWR) and its generalization are described. It is well known that the random walk (RW) and the anisotropic diffusion models share the same energy functional, i.e.

In this paper, a space-time hole filling approach is presented to deal with disocclusion when a view is synthesized for 3D video. The problem becomes even more complicated when the view is extrapolated from a single view, since the hole is large and has no stereo depth cues. Although many techniques have been developed to address this problem, most of them focus only on view interpolation.

Anisotropic diffusion has been known to be closely related to adaptive smoothing and discretized in a similar manner. This paper revisits the fundamental relationship between the two approaches. It is shown that adaptive smoothing and anisotropic diffusion have different theoretical backgrounds by exploring their characteristics from the perspective of normalization, evolution step size, and energy flow.
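
The contrast in normalization and step size can be seen in commonly used 1D discretizations of the two schemes. The sketch below is one such pair, written under assumptions: the Gaussian-shaped conductance, the parameter `k`, the step size `tau`, and both function names are illustrative and are not taken from the paper.

```python
import numpy as np

def diffusion_step(u, k=0.1, tau=0.25):
    """One explicit anisotropic diffusion step with zero-flux boundaries."""
    u = np.asarray(u, dtype=float)
    d = np.diff(u)                     # difference across each edge
    flux = np.exp(-(d / k) ** 2) * d   # conductance times difference
    out = u.copy()
    out[:-1] += tau * flux             # inflow from the right neighbour
    out[1:] -= tau * flux              # equal and opposite outflow
    return out

def adaptive_smoothing_step(u, k=0.1):
    """One adaptive smoothing step: a normalized weighted average."""
    u = np.asarray(u, dtype=float)
    w = np.exp(-(np.gradient(u) / k) ** 2)  # per-pixel continuity weight
    up = np.pad(u, 1, mode="edge")
    wp = np.pad(w, 1, mode="edge")
    num = wp[:-2] * up[:-2] + wp[1:-1] * up[1:-1] + wp[2:] * up[2:]
    den = wp[:-2] + wp[1:-1] + wp[2:]
    return num / den                        # no explicit step size appears
```

In this form the diffusion update exchanges equal and opposite flux across each edge and is scaled by an explicit step size, so the signal's sum is conserved, whereas the adaptive smoothing update divides by the sum of its weights and carries no step-size parameter.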

This paper describes a robust scale-space filter that adaptively changes the amount of flux according to the local topology of the neighborhood. In a manner similar to modeling heat or temperature flow in physics, the robust scale-space filter is derived by coupling Fick's law with a generalized continuity equation in which the source or sink is modeled via a specific heat capacity. The filter plays an essential part in two aspects.
