Publications by authors named "Xiongkuo Min"

The analysis and prediction of visual attention have long been crucial tasks in the fields of computer vision and image processing. In practical applications, images are generally accompanied by various text descriptions; however, few studies have explored the influence of text descriptions on visual attention, let alone developed visual saliency prediction models that consider text guidance. In this paper, we conduct a comprehensive study on text-guided image saliency (TIS) from both subjective and objective perspectives.


Aim: To map the commonly used quantitative blood loss measurement methods in clinical practice and provide a solid foundation for future studies.

Design And Method: This study adhered to the JBI methodology for scoping reviews and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR). We conducted a literature search using five databases to retrieve articles published between January 2012 and September 2022.


Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving the end-users' viewing experience in various real-world video-enabled media applications. As BVQA is an experimental field, the improvements of BVQA models have been measured primarily on a few human-rated VQA datasets. Thus, it is crucial to gain a better understanding of existing VQA datasets in order to properly evaluate the current progress in BVQA.


In recent years, User Generated Content (UGC) has grown dramatically in video sharing applications. Service providers therefore need video quality assessment (VQA) to monitor and control users' Quality of Experience when they watch UGC videos. However, most existing UGC VQA studies focus only on the visual distortions of videos, ignoring that the perceptual quality also depends on the accompanying audio signals.


High-quality pathological microscopic images are essential for physicians and pathologists to make a correct diagnosis. Image quality assessment (IQA) can quantify the degree of visual distortion in images and guide the imaging system to improve image quality, thus raising the quality of pathological microscopic images. Current IQA methods are not well suited to pathological microscopy images because of the specific characteristics of such images.


With the popularity of the mobile Internet, audio and video (A/V) have become the main medium of daily entertainment and socializing. However, to reduce the cost of media storage and transmission, A/V signals are compressed by service providers before being transmitted to end-users, which inevitably distorts the A/V signals and degrades the end-users' Quality of Experience (QoE). This motivates us to study objective audio-visual quality assessment (AVQA).


With the development of multimedia technology, Augmented Reality (AR) has become a promising next-generation mobile platform. The primary value of AR is to promote the fusion of digital content and real-world environments; however, studies on how this fusion influences the Quality of Experience (QoE) of these two components are lacking. Because the two layers of AR influence each other, evaluating its perceptual quality is an important first step toward achieving better QoE.


Existing no-reference (NR) image quality assessment (IQA) metrics are still not convincing for evaluating the quality of camera-captured images. To tackle this issue, in this article we establish a novel NR quality metric that reliably quantifies the quality of camera-captured images. Since the human brain perceives image quality hierarchically, from low-level preliminary visual perception to high-level semantic comprehension, our proposed metric characterizes image quality by exploiting both low-level image properties and high-level image semantics.
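As a rough illustration of this two-level idea only (not the authors' specific metric), the sketch below concatenates simple hand-crafted low-level statistics with semantic features from a pretrained recognition backbone and maps them to a scalar quality score with a generic linear regressor. The function names, the placeholder `semantic_features`, and the weights are all assumptions for the sake of a self-contained example.

```python
# Illustrative sketch only: combine low-level statistics with high-level
# semantic features for no-reference IQA. Not the authors' metric.
import numpy as np
from scipy import ndimage


def low_level_features(gray):
    """Simple low-level descriptors of a grayscale image in [0, 1]."""
    gx = ndimage.sobel(gray, axis=1)
    gy = ndimage.sobel(gray, axis=0)
    grad_mag = np.hypot(gx, gy)
    lap = ndimage.laplace(gray)
    local_mean = ndimage.uniform_filter(gray, size=7)
    local_var = ndimage.uniform_filter(gray ** 2, size=7) - local_mean ** 2
    return np.array([
        grad_mag.mean(), grad_mag.std(),              # sharpness / edge strength
        np.abs(lap).mean(),                           # fine-detail energy
        np.sqrt(np.clip(local_var, 0, None)).mean(),  # local contrast
    ])


def semantic_features(gray):
    """Placeholder for high-level semantics.

    In practice these would come from a pretrained recognition backbone
    (e.g. the penultimate layer of a CNN); a dummy vector keeps the
    sketch self-contained.
    """
    return np.zeros(8)


def quality_score(gray, weights, bias=0.0):
    """Map concatenated features to a scalar quality score.

    `weights` would normally be learned by regressing against human
    opinion scores (MOS); here it is simply a given vector.
    """
    feats = np.concatenate([low_level_features(gray), semantic_features(gray)])
    return float(feats @ weights + bias)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((128, 128))    # stand-in for a captured photo
    w = rng.normal(size=4 + 8)      # stand-in for learned weights
    print("predicted quality:", quality_score(img, w))
```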


In the era of multimedia and the Internet, the quick response (QR) code helps people move quickly from offline content to online information. However, the QR code is limited in many scenarios because of its random and dull appearance. Therefore, this article proposes a novel approach to embed hyperlinks into common images, making the hyperlinks invisible to human eyes but detectable by mobile devices equipped with a camera.


Deep neural networks are vulnerable to adversarial attacks. More importantly, some adversarial examples crafted against an ensemble of source models transfer to other target models and, thus, pose a security threat to black-box applications (when attackers have no access to the target models). Current transfer-based ensemble attacks, however, only consider a limited number of source models to craft an adversarial example and, thus, obtain poor transferability.
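For background (this is the generic baseline, not the method proposed in the paper), a basic transfer-based ensemble attack averages the classification loss over several white-box source models and takes a single signed-gradient step, hoping the perturbation transfers to unseen target models. The sketch below uses toy source models; the model definitions, batch, and epsilon are placeholders.

```python
# Minimal sketch of a transfer-based ensemble attack (FGSM-style).
# The toy source models and epsilon are placeholders, not the paper's setup.
import torch
import torch.nn as nn


def make_toy_model(num_classes=10):
    """Stand-in for a pretrained source model."""
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))


def ensemble_fgsm(x, y, source_models, epsilon=8 / 255):
    """Craft an adversarial example by averaging the loss over source models."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn = nn.CrossEntropyLoss()
    loss = torch.zeros((), dtype=x.dtype)
    for model in source_models:
        model.eval()
        loss = loss + loss_fn(model(x_adv), y)
    loss = loss / len(source_models)
    loss.backward()
    # One signed-gradient step, clipped back to the valid pixel range.
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()


if __name__ == "__main__":
    sources = [make_toy_model() for _ in range(3)]  # ensemble of source models
    x = torch.rand(4, 3, 32, 32)                    # toy input batch
    y = torch.randint(0, 10, (4,))                  # toy labels
    x_adv = ensemble_fgsm(x, y, sources)
    print("max perturbation:", (x_adv - x).abs().max().item())
```

Transferability generally improves as more (and more diverse) source models contribute to the averaged loss, which is the limitation the paragraph above points out for attacks that use only a few source models.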


Virtual viewpoint synthesis is an essential process for many immersive applications, including Free-viewpoint TV (FTV). A widely used technique for viewpoint synthesis is Depth-Image-Based Rendering (DIBR). However, this technique may introduce challenging, non-uniform spatio-temporal structure-related distortions.


Video frame interpolation aims to improve users' viewing experiences by generating high-frame-rate videos from low-frame-rate ones. Existing approaches typically focus on synthesizing intermediate frames from high-quality reference images. However, the captured reference frames may suffer from inevitable spatial degradations such as motion blur and sensor noise.
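For context only, the crudest interpolation baseline blends the two neighboring reference frames linearly; real interpolators instead estimate motion, but the sketch (with assumed toy data) shows how degradation in a reference frame propagates directly into the synthesized frame, which is the problem highlighted above.

```python
# Naive frame-interpolation baseline: blend two reference frames linearly.
# Real methods estimate motion; this only illustrates where reference-frame
# quality enters the pipeline.
import numpy as np


def blend_frames(frame0, frame1, t=0.5):
    """Approximate the frame at time t in (0, 1) between frame0 and frame1."""
    return (1.0 - t) * frame0 + t * frame1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f0 = rng.random((64, 64, 3))                      # reference frame at t=0
    f1 = rng.random((64, 64, 3))                      # reference frame at t=1
    noisy_f0 = f0 + 0.05 * rng.normal(size=f0.shape)  # degraded reference
    clean_mid = blend_frames(f0, f1)
    noisy_mid = blend_frames(noisy_f0, f1)
    # Degradation in a reference frame propagates into the synthesized frame.
    print("error introduced:", np.abs(clean_mid - noisy_mid).mean())
```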


The topics of visual and audio quality assessment (QA) have been widely researched for decades, yet nearly all of this prior work has focused only on single-mode visual or audio signals. However, visual signals are rarely presented without accompanying audio, including in bandwidth-heavy video streaming applications. Moreover, the distortions that may separately (or conjointly) afflict the visual and audio signals collectively shape the user-perceived quality of experience (QoE).


Audio information has been overlooked by most current visual attention prediction studies. However, sound can influence visual attention, and this influence has been widely investigated and proven in many psychological studies. In this paper, we propose a novel multi-modal saliency (MMS) model for videos containing scenes with high audio-visual correspondence.
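As a toy illustration of audio-visual saliency fusion only (not the proposed MMS model), a visual saliency map can be modulated by a spatial map derived from the sound-source location and weighted by the audio energy. The localization map and the fusion rule below are assumptions made for the example.

```python
# Toy audio-visual saliency fusion: modulate a visual saliency map by a
# Gaussian map centered on an (assumed known) sound-source location,
# weighted by the audio energy. Not the MMS model itself.
import numpy as np


def audio_map(shape, source_xy, sigma=20.0):
    """Gaussian spatial map centered on the sound source."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    x0, y0 = source_xy
    d2 = (xs - x0) ** 2 + (ys - y0) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))


def fuse(visual_sal, audio_sal, audio_energy):
    """Blend visual and audio saliency; the audio term grows with energy."""
    alpha = np.clip(audio_energy, 0.0, 1.0)
    fused = (1.0 - alpha) * visual_sal + alpha * visual_sal * audio_sal
    return fused / (fused.max() + 1e-8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    visual = rng.random((90, 160))  # stand-in visual saliency map
    audio = audio_map(visual.shape, source_xy=(120, 45))
    print(fuse(visual, audio, audio_energy=0.8).shape)
```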


Owing to the recorded light ray distributions, the light field contains much richer information and enables many promising applications, and it has become increasingly popular. To facilitate these applications, many light field processing techniques have been proposed recently. These operations also introduce visual quality loss, and thus a light field quality metric is needed to quantify this loss.


Purpose: The aim of the present study was to redetermine the position of the key points (skeletal marker points) in damaged female and male jaws to improve the accuracy of jaw reconstruction.

Materials And Methods: To develop a personalized jaw reconstruction guidance program for each patient, we first performed 3 statistical analyses to compare gender differences in the jaw. Next, we proposed and compared 3 methods to restore the key skeletal marker points of the damaged jaw according to our statistics.


Background: Head-mounted displays (HMDs) and virtual reality (VR) have been used frequently in recent years, and a user's experience and computational efficiency can be assessed with mounted eye-trackers. However, in addition to visually induced motion sickness (VIMS), eye fatigue has increasingly emerged during and after the viewing experience, highlighting the need to quantitatively assess these detrimental effects. As no measurement method for the eye fatigue caused by HMDs has been widely accepted, we measured parameters related to optometric tests.


Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time-consuming and expensive. Most current studies on human attention and saliency modeling have used high-quality, stereotyped stimuli. In the real world, however, captured images undergo various types of transformations.


Purpose: For severe mandibular or maxillary defects across the midline, doctors often lack data on the shape of the jaws when designing virtual surgery. This study sought to restore the personalized 3-dimensional shape of the jaw, particularly when the jaw is severely damaged.

Materials And Methods: Two linear regression methods, denoted method I and method II, were used to reconstruct key points of the severely damaged maxilla or mandible based on the remaining jaw.
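Schematically (this is a generic illustration, not the paper's specific methods I and II), regressing the coordinates of missing key points from the coordinates of the remaining landmarks can be set up as a multi-output linear regression. The synthetic landmark data below stands in for real jaw measurements.

```python
# Schematic only: regress coordinates of missing jaw key points from the
# remaining landmarks. Synthetic data stands in for real measurements.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Each row: flattened 3-D coordinates of landmarks on the intact part of the jaw.
n_cases, n_intact, n_missing = 200, 8, 4
X = rng.normal(size=(n_cases, n_intact * 3))

# Assume the missing key points depend roughly linearly on the intact ones
# (e.g. via symmetry); build synthetic targets accordingly.
true_map = rng.normal(size=(n_intact * 3, n_missing * 3))
Y = X @ true_map + 0.1 * rng.normal(size=(n_cases, n_missing * 3))

model = LinearRegression().fit(X[:150], Y[:150])  # fit on "training" jaws
pred = model.predict(X[150:])                     # reconstruct held-out key points
rmse = np.sqrt(np.mean((pred - Y[150:]) ** 2))
print(f"held-out RMSE (arbitrary units): {rmse:.3f}")
```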


With the rapid development and popularity of computers, computer-generated signals have pervaded our daily lives. The screen content image is a typical example: unlike natural scene images, which have been deeply explored, it also includes graphic and textual components, and thus it poses novel challenges to current research in compression, transmission, display, quality assessment, and more. In this paper, we focus on evaluating the quality of screen content images based on an analysis of the structural variation caused by compression, transmission, and other processes.
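The general idea of quantifying structural variation can be illustrated with a classic structural-similarity computation between a pristine and a distorted image; the sketch below follows the standard SSIM formulation rather than the specific screen-content metric proposed in the paper, and the test images are synthetic placeholders.

```python
# Structural-similarity sketch (classic SSIM), illustrating how structural
# variation between a reference and a distorted image can be quantified.
# This is not the specific screen-content metric proposed in the paper.
import numpy as np
from scipy.ndimage import gaussian_filter


def ssim_map(ref, dist, sigma=1.5, data_range=1.0):
    """Per-pixel SSIM between two grayscale images in [0, data_range]."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x = gaussian_filter(ref, sigma)
    mu_y = gaussian_filter(dist, sigma)
    var_x = gaussian_filter(ref * ref, sigma) - mu_x ** 2
    var_y = gaussian_filter(dist * dist, sigma) - mu_y ** 2
    cov_xy = gaussian_filter(ref * dist, sigma) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.random((128, 128))  # stand-in for a screen content image
    dist = np.clip(ref + 0.05 * rng.normal(size=ref.shape), 0, 1)
    print("mean SSIM:", float(ssim_map(ref, dist).mean()))
```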


Digital images in the real world are created by a variety of means and have diverse properties. A photographic natural scene image (NSI) may exhibit substantially different characteristics from a computer graphic image (CGI) or a screen content image (SCI). This poses major challenges to objective image quality assessment, for which existing approaches lack effective mechanisms to capture such content-type variations and thus are difficult to generalize from one type to another.
