Comics are a bimodal form of art involving a mixture of text and images. Since comics require a combination of various cognitive processes to comprehend their contents, the analysis of human comic reading behavior sheds light on how humans process such bimodal forms of media. In this paper, we particularly focus on the viewing times of each comic panel as a quantitative measure of attention, and analyze the statistical characteristics of the distributions of comic panel viewing times.
360° cameras have gained popularity over the last few years. In this paper, we propose two fundamental techniques, Field-of-View IoU (FoV-IoU) and 360Augmentation, for object detection in 360° images. Although most object detection neural networks designed for perspective images are applicable to 360° images in equirectangular projection (ERP) format, their performance deteriorates owing to the distortion in ERP images.
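To make the ERP distortion mentioned above concrete, the following minimal sketch computes how much an image row is stretched horizontally relative to the equator. It relies only on the standard geometry of equirectangular projection; the function name `erp_horizontal_stretch` and the row-to-latitude convention (row 0 at the top) are assumptions made for illustration, and the sketch does not reproduce FoV-IoU or 360Augmentation.

```python
import math


def erp_horizontal_stretch(row: int, height: int) -> float:
    """Horizontal stretch of an ERP image row relative to the equator.

    Hypothetical helper for illustration; not part of the paper's method.
    """
    # Map the row centre to a latitude in (-pi/2, pi/2); row 0 is the top.
    latitude = math.pi * (0.5 - (row + 0.5) / height)
    # A circle of latitude shrinks with cos(latitude), but every ERP row has
    # the same pixel width, so content is stretched by 1 / cos(latitude).
    return 1.0 / math.cos(latitude)


if __name__ == "__main__":
    height = 512
    for row in (255, 128, 16):
        print(row, erp_horizontal_stretch(row, height))
```

The stretch is close to 1 near the equator and grows rapidly towards the poles, which is one reason box overlap measured in ERP pixel coordinates can be misleading for objects at high latitudes.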
We propose DistSurf-OF, a novel optical flow method for neuromorphic cameras. Neuromorphic cameras (or event detection cameras) are an emerging sensor modality that makes use of dynamic vision sensors (DVS) to asynchronously report the log-intensity changes (called "events") exceeding a predefined threshold at each pixel. In the absence of the intensity value at each pixel location, we introduce a notion of "distance surface" (the distance transform computed from the detected events) as a proxy for object texture.
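As a rough illustration of the "distance surface" idea, the sketch below computes a brute-force distance transform over a small grid: each pixel stores the distance to the nearest pixel at which an event was detected. The function name `distance_surface`, the use of the Euclidean metric, and the representation of events as a plain list of `(row, column)` coordinates are assumptions made for this example; the abstract does not specify how events are accumulated or which metric is used.

```python
import math


def distance_surface(event_pixels, height, width):
    """Return a height x width grid of distances to the nearest event pixel.

    Hypothetical helper: brute force, Euclidean metric assumed.
    """
    if not event_pixels:
        raise ValueError("at least one event pixel is required")
    surface = []
    for y in range(height):
        row = []
        for x in range(width):
            # Distance from (y, x) to the closest detected event.
            row.append(min(math.hypot(y - ey, x - ex) for ey, ex in event_pixels))
        surface.append(row)
    return surface


if __name__ == "__main__":
    # Two events on a 3 x 4 sensor, given as (row, column) pairs.
    for line in distance_surface([(0, 0), (2, 3)], height=3, width=4):
        print([round(d, 2) for d in line])
```

The brute-force loop costs time proportional to the number of pixels times the number of events, so it is only suitable for small examples; for real sensor resolutions a library routine such as `scipy.ndimage.distance_transform_edt` would be the more practical choice.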
IEEE Trans Pattern Anal Mach Intell
May 2020
End-to-end distance metric learning (DML) has been applied to obtain features useful in many computer vision tasks. However, these DML studies have not provided equitable comparisons between features extracted from DML-based networks and softmax-based networks. In this paper, we present objective comparisons between these two approaches under the same network architecture.
Face hallucination is a technique that reconstructs high-resolution (HR) faces from low-resolution (LR) faces by using the prior knowledge learned from HR/LR face pairs. Most state-of-the-art methods leverage position-patch prior knowledge of the human face to estimate the optimal representation coefficients for each image patch. However, they focus only on the position information and usually ignore the context information of the image patch.
IEEE Trans Neural Netw Learn Syst
April 2019
In this work, travel destinations and business locations are taken as venues. Discovering a venue from a photograph is very important for visual context-aware applications. Unfortunately, few efforts have paid attention to complicated real images such as venue photographs generated by users.
Knowledge of the human visual system helps to develop better computational models of visual attention. State-of-the-art models have been developed to mimic the visual attention system of young adults; however, they largely ignore the variations that occur with age. In this paper, we investigate how visual scene processing changes with age and propose an age-adapted framework that helps to develop a computational model that can predict saliency across different age groups.
IEEE Trans Vis Comput Graph
July 2017
We present DrawFromDrawings, an interactive drawing system that provides users with visual feedback for assistance in 2D drawing using a database of sketch images. Following the traditional imitation and emulation training from art education, DrawFromDrawings enables users to retrieve and refer to a sketch image stored in a database and provides them with various novel strokes as suggestive or deformation feedback. Given regions of interest (ROIs) in the user and reference sketches, DrawFromDrawings detects as-long-as-possible (ALAP) stroke segments and the correspondences between user and reference sketches that are the key to computing seamless interpolations.
IEEE Trans Pattern Anal Mach Intell
September 2014
Most conventional algorithms for non-Lambertian photometric stereo can be partitioned into two categories. The first category is built upon stable outlier rejection techniques while assuming a dense Lambertian structure for the inliers, and thus performance degrades when general diffuse regions are present. The second utilizes complex reflectance representations and non-linear optimization over pixels to handle non-Lambertian surfaces, but does not explicitly account for shadows or other forms of corrupting outliers.
Background: Diabetes self-management education is an essential element of diabetes care. Systems based on information and communication technology (ICT) for supporting lifestyle modification and self-management of diabetes are promising tools for helping patients better cope with diabetes. An earlier study had determined that diet improved and HbA1c declined for the patients who had used DialBetics during a 3-month randomized clinical trial.
The health care field is focusing considerable attention on dietary control, which requires that individuals record what they eat. We have developed a novel smartphone application called FoodLog, a multimedia food recording tool that allows users to take photos of their meals and to produce textual food records. Unlike conventional smartphone-based food recording tools, FoodLog allows users to employ meal photos to help them to input textual descriptions based on image retrieval.
IEEE Trans Image Process
January 2007
This paper presents a novel method for synthesizing a novel view from two sets of differently focused images taken by an aperture camera array for a scene consisting of two approximately constant depths. The proposed method consists of two steps. The first step is a view interpolation to reconstruct an all-in-focus dense light field of the scene.
IEEE Trans Image Process
November 2005
We present a novel filtering method for reconstructing an all-in-focus image or an arbitrarily focused image from two images that are focused differently. The method can arbitrarily manipulate the degree of blur of the objects using linear filters without segmentation. The filters are uniquely determined from a linear imaging model in the Fourier domain.