IEEE Trans Image Process
October 2024
We study the visual quality judgments of human subjects on digital human avatars (sometimes referred to as "holograms" in the parlance of virtual reality [VR] and augmented reality [AR] systems) that have been subjected to distortions. We also study the ability of video quality models to predict human judgments. As streaming human avatar videos in VR or AR becomes increasingly common, more advanced human avatar video compression protocols will be required to address the tradeoff between faithfully transmitting high-quality visual representations and adapting to changeable bandwidth scenarios.
The Video Multimethod Assessment Fusion (VMAF) algorithm has recently emerged as a state-of-the-art approach to video quality prediction that now pervades the streaming and social media industries. However, since VMAF requires the evaluation of a heterogeneous set of quality models, it is computationally expensive. Given other advances in hardware-accelerated encoding, quality assessment is emerging as a significant bottleneck in video compression pipelines.
IEEE Trans Image Process
June 2023
We present the outcomes of a recent large-scale subjective study of Mobile Cloud Gaming Video Quality Assessment (MCG-VQA) on a diverse set of gaming videos. Rapid advancements in cloud services, faster video encoding technologies, and increased access to high-speed, low-latency wireless internet have all contributed to the exponential growth of the Mobile Cloud Gaming industry. Consequently, the development of methods to assess the quality of real-time video feeds to end-users of cloud gaming platforms has become increasingly important.
Measuring Quality of Experience (QoE) and integrating these measurements into video streaming algorithms is a multi-faceted problem that fundamentally requires the design of comprehensive subjective QoE databases and objective QoE prediction models. To achieve this goal, we have recently designed the LIVE-NFLX-II database, a highly-realistic database which contains subjective QoE responses to various design dimensions, such as bitrate adaptation algorithms, network conditions and video content. Our database builds on recent advancements in content-adaptive encoding and incorporates actual network traces to capture realistic network variations on the client device.
IEEE Trans Image Process
September 2020
Image compression has remained an important topic over recent decades due to the explosive growth in the number of images. Popular image compression formats are based on various transforms that convert images from the spatial domain into a compact frequency domain to remove spatial correlation. In this paper, we focus on the exploration of a data-driven transform, the Karhunen-Loève transform (KLT), whose kernels are derived from specific images via Principal Component Analysis (PCA), and design a highly efficient KLT-based image compression algorithm with variable transform sizes.
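The core idea of a KLT-based codec, learning the transform kernels from the image data itself via PCA, can be sketched as follows. This is a minimal illustration (the function name `klt_compress_blocks` and the fixed block shape are assumptions, not the paper's actual pipeline); a real codec would also quantize and entropy-code the retained coefficients.

```python
import numpy as np

def klt_compress_blocks(blocks, keep):
    """Transform-code flattened image blocks with a data-driven KLT.

    blocks: (n, d) array, each row a flattened block (e.g. 8x8 -> d=64).
    keep:   number of principal components (KLT kernels) to retain.
    Returns the reconstructed blocks and the learned (d, keep) basis.
    """
    mean = blocks.mean(axis=0)
    centered = blocks - mean
    # The KLT kernels are eigenvectors of the block covariance (i.e. PCA).
    cov = centered.T @ centered / len(blocks)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]       # sort by decreasing variance
    basis = eigvecs[:, order[:keep]]        # (d, keep) KLT basis
    coeffs = centered @ basis               # forward transform
    recon = coeffs @ basis.T + mean         # inverse transform
    return recon, basis
```

With `keep` equal to the block dimension the transform is orthogonal and reconstruction is exact; shrinking `keep` trades distortion for fewer coefficients to code, which is where the compression comes from.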
IEEE Trans Image Process
July 2018
Streaming video services represent a very large fraction of global bandwidth consumption. Due to the exploding demands of mobile video streaming services, coupled with limited bandwidth availability, video streams are often transmitted through unreliable, low-bandwidth networks. This unavoidably leads to two types of major streaming-related impairments: compression artifacts and/or rebuffering events.
IEEE Trans Image Process
November 2017
HTTP adaptive streaming is being increasingly deployed by network content providers, such as Netflix and YouTube. By dividing video content into data chunks encoded at different bitrates, a client is able to request the appropriate bitrate for the segment to be played next based on the estimated network conditions. However, this can introduce a number of impairments, including compression artifacts and rebuffering events, which can severely impact an end-user's quality of experience (QoE).
IEEE Trans Cybern
September 2016
In this paper, we propose a complete gesture recognition framework based on maximum cosine similarity and fast nearest neighbor (NN) techniques, which offers high recognition accuracy and great computational advantages for three fundamental problems of gesture recognition: 1) isolated recognition; 2) gesture verification; and 3) gesture spotting on continuous data streams. To support our arguments, we provide a thorough evaluation on three large publicly available databases, examining various scenarios, such as noisy environments, a limited number of training examples, and time delay in the system's response. Our experimental results suggest that this simple NN-based approach is quite accurate for trajectory classification of digits and letters and could become a promising approach for implementations on low-power embedded systems.
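The maximum-cosine-similarity classifier at the heart of such a framework can be sketched in a few lines. This is a rough illustration only: the feature vectors and gesture labels below are hypothetical placeholders, and the paper's actual trajectory features and fast-NN acceleration are not reproduced here.

```python
import numpy as np

def classify_max_cosine(query, templates, labels):
    """Classify a gesture feature vector by maximum cosine similarity.

    query:     (d,) feature vector of the unknown gesture.
    templates: (n, d) matrix of training gesture features (one per row).
    labels:    length-n sequence of class labels.
    Returns the label of the most similar template.
    """
    q = query / np.linalg.norm(query)
    t = templates / np.linalg.norm(templates, axis=1, keepdims=True)
    sims = t @ q                  # cosine similarity to every template
    return labels[int(np.argmax(sims))]
```

Because cosine similarity depends only on direction, this 1-NN rule is insensitive to uniform scaling of the trajectory features, which is one reason it suits gesture data.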
Multitoning is the representation of digital pictures using a given set of available color intensities, also known as tones or quantization levels. It can be viewed as a generalization of halftoning, in which only two quantization levels are available. Its main application is printing and, as with halftoning, it can be applied to both color and grayscale images.
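A minimal sketch of multitoning, assuming a standard Floyd-Steinberg error-diffusion scheme generalized from two tones to an arbitrary tone set (the specific algorithm studied in the paper may differ):

```python
import numpy as np

def multitone(image, tones):
    """Quantize a grayscale image to a given tone set with error diffusion.

    Each pixel is snapped to the nearest available tone and the resulting
    quantization error is diffused to unprocessed neighbors using the
    classic Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16). With
    tones = [0, 255] this reduces to ordinary binary halftoning.
    """
    img = image.astype(float).copy()
    tones = np.asarray(sorted(tones), dtype=float)
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = tones[np.argmin(np.abs(tones - old))]  # nearest tone
            out[y, x] = new
            err = old - new
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return out
```

Adding intermediate tones shrinks the per-pixel quantization error, so less error needs to be diffused and the output shows finer gradations than a binary halftone.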
IEEE Trans Image Process
October 2012
A new digital halftoning technique based on multiscale error diffusion is examined. We use an image quadtree to represent the difference image between the input gray-level image and the output halftone image. An iterative algorithm is developed that searches for the brightest region of a given image via "maximum intensity guidance" to assign dots and diffuses the quantization error noncausally at each iteration.
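The "maximum intensity guidance" search can be pictured as a greedy descent of an intensity quadtree: at each level, follow the quadrant with the largest summed intensity until a single pixel remains. The sketch below shows only that search step under the simplifying assumption of a square, power-of-two-sized image; the paper's full algorithm also places a dot there and diffuses the error noncausally before repeating.

```python
import numpy as np

def brightest_pixel_via_quadtree(img):
    """Descend a quadtree of summed intensities to the brightest pixel.

    At each level the current square is split into four quadrants and the
    one with the largest intensity sum is chosen ("maximum intensity
    guidance"). Assumes img is square with a power-of-two side length.
    Returns the (row, col) of the pixel the descent reaches.
    """
    y0, x0, size = 0, 0, img.shape[0]
    while size > 1:
        half = size // 2
        quads = [(y0, x0), (y0, x0 + half),
                 (y0 + half, x0), (y0 + half, x0 + half)]
        sums = [img[y:y + half, x:x + half].sum() for y, x in quads]
        y0, x0 = quads[int(np.argmax(sums))]
        size = half
    return y0, x0
```

The quadtree makes each dot-placement search logarithmic in image size rather than a full scan, which is what makes the multiscale formulation practical.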
IEEE Trans Image Process
October 2012
This work examines the nearest neighbor encoding problem with an unstructured codebook of arbitrary size and vector dimension. We propose a new tree-structured nearest neighbor encoding method that significantly reduces the complexity of the full-search method without any performance degradation in terms of distortion. Our method consists of efficient algorithms for constructing a binary tree for the codebook and nearest neighbor encoding by using this tree.
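For reference, the full-search baseline that such tree-structured methods aim to match in distortion, at a fraction of the cost, can be written directly. This is a sketch of the baseline only, not the paper's tree construction; the function name is an assumption.

```python
import numpy as np

def full_search_encode(vectors, codebook):
    """Exact nearest neighbor encoding by exhaustive search.

    vectors:  (n, d) input vectors to encode.
    codebook: (k, d) unstructured codebook of arbitrary size.
    Returns, for each input vector, the index of the codeword with the
    smallest squared Euclidean distortion. Cost is O(n * k * d), which
    is what tree-structured encoders reduce without losing exactness.
    """
    # squared distance from every vector to every codeword, (n, k)
    d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)
```

Any exact tree-structured encoder must return the same indices as this function on every input; only the search cost differs.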