Publications by authors named "Kai-Kuang Ma"

Ellipse detection is a challenging low-level task that is indispensable to many image analysis applications. Existing ellipse detection methods commonly encounter two fundamental issues. First, detection accuracy tends to be lower on a small ellipse than on a large one; this is the scale issue.

In this paper, a full-reference video quality assessment (VQA) model, called the hybrid spatiotemporal feature-based model (HSFM), is designed for the perceptual quality assessment of screen content videos (SCVs). SCVs have a hybrid structure that includes both screen and natural scenes, which the human visual system (HVS) perceives with different visual effects. With this consideration, the three-dimensional Laplacian of Gaussian (3D-LOG) filter and three-dimensional natural scene statistics (3D-NSS) are exploited to extract the screen and natural spatiotemporal features from the reference and distorted SCV sequences separately.

This paper proposes a new full-reference image quality assessment (IQA) model for performing perceptual quality evaluation on light field (LF) images, called the spatial and geometry feature-based model (SGFM). Considering that an LF image describes both the spatial and the geometry information of the scene, the spatial features are extracted over the sub-aperture images (SAIs) by using the contourlet transform and then exploited to reflect the spatial quality degradation of the LF images, while the geometry features are extracted across adjacent SAIs based on a 3D-Gabor filter and then used to describe the viewing consistency loss of the LF images. These schemes are motivated by the fact that the human eye is more interested in scale, direction, and contour from the spatial perspective, and in viewing angle variations from the geometry perspective.

In this paper, a new multi-scale deep learning (MDL) framework is proposed and exploited for image interpolation. The core of the framework is a seeding network that needs to be designed for the targeted task. For image interpolation, a novel attention-aware inception network (AIN) is developed as the seeding network; it has two key stages: 1) feature extraction based on the low-resolution input image; and 2) feature-to-image mapping to enlarge the image's size or resolution.

Lossy compression introduces artifacts into the compressed image and degrades its visual quality. In recent years, many compression artifact removal methods based on convolutional neural networks (CNNs) have been developed with great success. However, these methods usually train a model for one specific quality factor or a small range of quality factors.

Existing neural networks proposed for low-level image processing tasks are usually implemented by stacking convolution layers with limited kernel size. Each convolution layer involves context information only from a small local neighborhood. More contextual features can be explored as more convolution layers are adopted.

In this paper, we make the first attempt to study the subjective and objective quality assessment of screen content videos (SCVs). To that end, we construct the first large-scale video quality assessment (VQA) database specifically for SCVs, called the screen content video database (SCVD). The SCVD provides 16 reference SCVs, 800 distorted SCVs, and their corresponding subjective scores, and it is made publicly available for research use.

In this paper, a progressive collaborative representation (PCR) framework is proposed that can incorporate any existing color image demosaicing method to further boost its demosaicing performance. Our PCR consists of two phases: (i) offline training and (ii) online refinement. In phase (i), multiple training-and-refining stages are performed.

3D point clouds associated with attributes are considered a promising paradigm for immersive communication. However, the corresponding compression schemes for this medium are still in their infancy. Moreover, in contrast to conventional image/video compression, compressing 3D point cloud data is a more challenging task because of its irregular structure.

In this paper, an accurate and efficient full-reference image quality assessment (IQA) model using extracted Gabor features, called the Gabor feature-based model (GFM), is proposed for conducting objective evaluation of screen content images (SCIs). It is well known that Gabor filters are highly consistent with the response of the human visual system (HVS), and that the HVS is highly sensitive to edge information. Based on these facts, the imaginary part of the Gabor filter, which has odd symmetry and acts as an edge detector, is applied to the luminance of the reference and distorted SCIs to extract their respective Gabor features.

Blurriness-Guided Unsharp Masking.

IEEE Trans Image Process

September 2018

In this paper, a highly adaptive unsharp masking (UM) method is proposed, called the blurriness-guided UM, or BUM for short. The proposed BUM exploits the estimated local blurriness as guidance information to perform pixel-wise enhancement. The consideration of local blurriness is motivated by the fact that enhancing a highly sharp or a highly blurred image region is undesirable, since this could easily yield unpleasant image artifacts due to over-enhancement or noise enhancement, respectively.
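
The idea of a per-pixel enhancement strength can be illustrated with a minimal sketch. It uses the classic unsharp masking form (output = input + gain × detail) and is not the published BUM algorithm: the box blur, the function names, the `max_gain` parameter, and the gain curve `4 * b * (1 - b)` are all assumptions chosen only so that the gain vanishes for very sharp (b = 0) and very blurred (b = 1) pixels. The blurriness map is taken as given rather than estimated.

```python
def box_blur(img, radius=1):
    """Mean filter with edge replication; img is a list of rows of floats."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)  # replicate border rows
                    xx = min(max(x + dx, 0), w - 1)  # replicate border columns
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def guided_unsharp_mask(img, blurriness, max_gain=1.5):
    """Unsharp masking with a gain that depends on a per-pixel blurriness map.

    blurriness[y][x] is assumed to lie in [0, 1] (0 = very sharp, 1 = very blurred).
    The gain curve below is a hypothetical choice, not taken from the paper.
    """
    blurred = box_blur(img)
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            b = blurriness[y][x]
            gain = max_gain * 4.0 * b * (1.0 - b)  # zero at b = 0 and b = 1
            detail = img[y][x] - blurred[y][x]     # high-frequency component
            out[y][x] = img[y][x] + gain * detail
    return out
```

With a blurriness map of all zeros or all ones the image is returned unchanged, while mid-range values steepen edges; that is the qualitative behavior the abstract motivates.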

In this paper, an accurate full-reference image quality assessment (IQA) model developed for assessing screen content images (SCIs), called the edge similarity (ESIM), is proposed. It is inspired by the fact that the human visual system (HVS) is highly sensitive to edges that are often encountered in SCIs; therefore, essential edge features are extracted and exploited for conducting IQA for the SCIs. The key novelty of the proposed ESIM lies in the extraction and use of three salient edge features-i.

A recently developed demosaicing methodology, called residual interpolation (RI), has demonstrated superior performance over conventional color-component difference interpolation. However, it has been observed that existing RI-based methods fail to fully exploit the potential of the RI strategy in reconstructing the most important G channel, as only the R and B channels are restored through the RI strategy. Since any reconstruction error introduced in the G channel is carried over into the demosaicing of the other two channels, the restoration of the G channel is highly instrumental to the quality of the final demosaiced image.

A directed graph (or digraph) approach is proposed in this paper for identifying all the visual objects common to the two images under comparison. As a model, the directed graph is superior to the undirected graph, since two link weights with opposite orientations are associated with each link of the graph. However, it inevitably raises two main challenges: 1) how to compute the two link weights for each link and 2) how to extract the subgraph from the digraph.

Contrast-guided image interpolation.

IEEE Trans Image Process

November 2013

In this paper, a contrast-guided image interpolation method is proposed that incorporates contrast information into the image interpolation process. Given the image under interpolation, four binary contrast-guided decision maps (CDMs) are generated and used to guide the interpolation filtering through two sequential stages: 1) the 45° and 135° CDMs for interpolating the diagonal pixels and 2) the 0° and 90° CDMs for interpolating the row and column pixels. After applying edge detection to the input image, the generation of a CDM lies in evaluating those nearby non-edge pixels of each detected edge for re-classifying them possibly as edge pixels.

Curve smoothing has two important applications in computer vision and image processing: 1) the curvature scale-space (CSS) technique for shape analysis, and 2) the Gaussian filter for noise suppression. In this paper, we study how planar curves converge as they are smoothed with increasing scales. First, two types of convergence behavior are clarified.

The curvature scale-space (CSS) technique is suitable for extracting curvature features from objects with noisy boundaries. To detect corner points in a multiscale framework, Rattarangsi and Chin investigated the scale-space behavior of planar-curve corners. Unfortunately, their investigation was based on an incorrect assumption, viz.

In this paper, we propose a novel and simple fast block-matching algorithm (BMA), called adaptive rood pattern search (ARPS), which consists of two sequential search stages: 1) initial search and 2) refined local search. For each macroblock (MB), the initial search is performed only once at the beginning in order to find a good starting point for the follow-up refined local search. By doing so, unnecessary intermediate searches and the risk of being trapped at local-minimum matching-error points can be greatly reduced in the long-search case.
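
The two-stage structure described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the pattern geometry, so the rood arm length taken from a predicted motion vector, the `default_arm` fallback, the unit-step refinement, the sum-of-absolute-differences cost, and all function and parameter names are assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref; infinite if out of bounds."""
    h, w = len(ref), len(ref[0])
    if bx + dx < 0 or by + dy < 0 or bx + dx + n > w or by + dy + n > h:
        return float("inf")
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def two_stage_search(cur, ref, bx, by, n, predicted=(0, 0), default_arm=2):
    """Return (motion_vector, cost) for one block using a two-stage search."""
    px, py = predicted
    # Assumed rule: arm length follows the predicted vector, with a fallback.
    arm = max(abs(px), abs(py)) or default_arm

    # Stage 1 (performed once): rood-shaped pattern plus the predicted vector.
    stage1 = [(0, 0), (arm, 0), (-arm, 0), (0, arm), (0, -arm), (px, py)]
    best, best_cost = (0, 0), float("inf")
    for dx, dy in stage1:
        cost = sad(cur, ref, bx, by, dx, dy, n)
        if cost < best_cost:
            best, best_cost = (dx, dy), cost

    # Stage 2 (repeated): unit-step rood around the current best point,
    # stopping when the center is not beaten by any of its four neighbors.
    while True:
        cx, cy = best
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cost = sad(cur, ref, bx, by, cx + dx, cy + dy, n)
            if cost < best_cost:
                best, best_cost = (cx + dx, cy + dy), cost
        if best == (cx, cy):
            return best, best_cost
```

Because the cost strictly decreases each time the center moves, the refinement loop always terminates. A good prediction lets stage 1 land on or near the true displacement, so stage 2 needs few iterations.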

A conventional color histogram (CCH) considers neither the color similarity across different bins nor the color dissimilarity within the same bin. Therefore, it is sensitive to noisy interference such as illumination changes and quantization errors. Furthermore, a CCH's large dimension (number of histogram bins) requires heavy computation for histogram comparison.
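
The sensitivity described above can be shown with a small sketch. It simplifies to a single 8-bit channel with uniform bins and uses histogram intersection as the comparison measure; the function names, the bin count, and the choice of intersection are assumptions for illustration, not details from the paper.

```python
def gray_histogram(pixels, bins=4, levels=256):
    """Normalized histogram of integer intensities with uniform bin width."""
    width = levels // bins
    hist = [0] * bins
    for p in pixels:
        hist[min(p // width, bins - 1)] += 1
    n = len(pixels)
    return [count / n for count in hist]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Nearly identical intensities that straddle a bin boundary (63 | 64) look unrelated.
dark = gray_histogram([62, 63, 63, 62])
slightly_brighter = gray_histogram([64, 65, 65, 64])
print(intersection(dark, slightly_brighter))  # 0.0

# Very different intensities that share a bin look identical.
print(intersection(gray_histogram([0, 0, 0, 0]), gray_histogram([63, 63, 63, 63])))  # 1.0
```

A brightness shift of two levels drops the similarity from 1.0 to 0.0, while a difference of 63 levels inside one bin goes unnoticed.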

A novel error concealment algorithm based on a stochastic modeling approach is proposed as a post-processing tool at the decoder side for recovering information lost during the transmission of encoded digital video bitstreams. In our proposed scheme, the spatial and the temporal contextual features in video signals are modeled separately using the multiscale Markov random field (MMRF). The lost information is then estimated using a maximum a posteriori (MAP) probabilistic approach based on the spatial and temporal MMRF models; hence, a unified MMRF-MAP framework.

In this paper, the design of the error resilient time-domain lapped transform is formulated as a linear minimal mean-squared error problem. The optimal Wiener solution and several simplifications with different tradeoffs between complexity and performance are developed. We also prove the persymmetric structure of these Wiener filters.

It has been well established that critically sampled boundary pre-/postfiltering operators can improve the coding efficiency and mitigate blocking artifacts in traditional discrete cosine transform-based block coders at low bit rates. In these systems, both the prefilter and the postfilter are square matrices. This paper proposes to use undersampled boundary pre- and postfiltering modules, where the pre-/postfilters are rectangular matrices.

A novel switching median filter incorporating a powerful impulse noise detection method, called boundary discriminative noise detection (BDND), is proposed in this paper for effectively denoising extremely corrupted images. To determine whether the current pixel is corrupted, the proposed BDND algorithm first classifies the pixels of a localized window, centered on the current pixel, into three groups: lower intensity impulse noise, uncorrupted pixels, and higher intensity impulse noise. The center pixel will then be considered as "uncorrupted," provided that it belongs to the "uncorrupted" pixel group, or "corrupted.

Forward error correction based multiple description (MD-FEC) transcoding for transmitting embedded bitstreams over packet erasure networks has been extensively studied in the past. In the existing work, a single embedded source bitstream, e.g.

In this paper, we investigate the problem of transmitting embedded encoded object-oriented images over packet-erasure networks. After reviewing the existing combined unequal error protection (CUEP) and individual unequal error protection (IUEP) schemes, we propose a novel weighted unequal error protection (WUEP) packetization scheme as an alternative to the existing methods. In our proposed framework, the embedded bitstreams of all concerned image objects are packetized into multiple description packet streams before transmission.
