Background: Lugol chromoendoscopy has been shown to increase the sensitivity of detection of esophageal squamous cell carcinoma (ESCC). We aimed to develop a deep learning-based virtual Lugol chromoendoscopy (V-LCE) method.
Methods: We developed still V-LCE images for superficial ESCC using a cycle-consistent generative adversarial network (CycleGAN).
Removing raindrops in images has been addressed as a significant task for various computer vision applications. In this paper, we propose the first method using a dual-pixel (DP) sensor to better address raindrop removal. Our key observation is that raindrops attached to a glass window yield noticeable disparities in DP's left-half and right-half images, while almost no disparity exists for in-focus backgrounds.
Background: The cycle-consistent generative adversarial network (CycleGAN) is a deep neural network model that performs image-to-image translation. We generated virtual indigo carmine (IC) chromoendoscopy images of gastric neoplasms using CycleGAN and compared their diagnostic performance with that of white light endoscopy (WLE).
Methods: WLE and IC images of 176 patients with gastric neoplasms who underwent endoscopic resection were obtained.
A camera captures multidimensional information of the real world by convolving it into two dimensions using a sensing matrix. The original multidimensional information is then reconstructed from captured images. Traditionally, multidimensional information has been captured by uniform sampling, but by optimizing the sensing matrix, we can capture images more efficiently and reconstruct multidimensional information with high quality.
A polarization camera has great potential for 3D reconstruction since the angle of polarization (AoP) and the degree of polarization (DoP) of reflected light are related to an object's surface normal. In this paper, we propose a novel 3D reconstruction method called Polarimetric Multi-View Inverse Rendering (Polarimetric MVIR) that effectively exploits geometric, photometric, and polarimetric cues extracted from input multi-view color-polarization images. We first estimate camera poses and an initial 3D model by geometric reconstruction with a standard structure-from-motion and multi-view stereo pipeline.
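As a rough illustration of the polarimetric cues named in this abstract, the sketch below computes AoP and DoP per pixel from four intensity images via the linear Stokes parameters. It assumes polarizer orientations of 0°, 45°, 90°, and 135°, which the abstract does not state; the function name `aop_dop` is hypothetical and this is not the authors' pipeline.

```python
import numpy as np

def aop_dop(i0, i45, i90, i135):
    """Return (AoP in radians, DoP in [0, 1]) from four polarizer-angle images.

    Assumption: the inputs are intensities behind linear polarizers at
    0, 45, 90, and 135 degrees (a common sensor layout, not from the source).
    """
    i0, i45, i90, i135 = (np.asarray(a, dtype=float) for a in (i0, i45, i90, i135))
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # diagonal components
    dop = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-12)  # guard against s0 == 0
    aop = 0.5 * np.arctan2(s2, s1)
    return aop, dop

# Fully polarized light oriented at 45 degrees, and unpolarized light.
aop_45, dop_45 = aop_dop(0.5, 1.0, 0.5, 0.0)
aop_un, dop_un = aop_dop(0.5, 0.5, 0.5, 0.5)
```

Because the inputs are converted with `np.asarray`, the same function works on scalars or on whole image arrays.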
Annu Int Conf IEEE Eng Med Biol Soc
November 2021
Gastroendoscopy has been a clinical standard for diagnosing and treating conditions that affect a part of a patient's digestive system, such as the stomach. Although gastroendoscopy has many advantages for patients, practitioners face challenges such as the lack of 3D perception, including depth and endoscope pose information. Such challenges make it difficult to navigate the endoscope and to localize any lesion found in the digestive tract.
IEEE J Transl Eng Health Med
December 2021
Gastric endoscopy is a gold standard in the clinical process that enables medical practitioners to diagnose various lesions inside a patient's stomach. If a lesion is found, identifying its location relative to the global view of the stomach leads to better decision making for subsequent clinical treatment. Our previous research showed that lesion localization could be achieved by reconstructing the whole stomach shape from chromoendoscopic indigo carmine (IC) dye-sprayed images using a structure-from-motion (SfM) pipeline.
Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual to real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing conditions, including day-night changes, as well as weather and seasonal variations, while providing highly accurate six degree-of-freedom (6DOF) camera pose estimates. In this paper, we extend three publicly available datasets containing images captured under a wide variety of viewing conditions, but lacking camera pose information, with ground truth pose information, making evaluation of the impact of various factors on 6DOF camera pose estimation accuracy possible.
Annu Int Conf IEEE Eng Med Biol Soc
July 2020
In this paper, we propose a novel video-based remote heart rate (HR) estimation method based on 3D facial landmarks. The key contributions of our method are twofold: (i) we introduce 3D facial landmark detection to video-based HR estimation, and (ii) we propose a novel face patch visibility check based on the face patch normal in 3D space. We experimentally demonstrate that, compared with baseline methods using 2D facial landmarks, our proposed method using 3D facial landmarks improves the robustness of HR estimation to head rotations and partial face occlusion.
Annu Int Conf IEEE Eng Med Biol Soc
July 2020
Gastric endoscopy is a standard clinical process that enables medical practitioners to diagnose various lesions inside a patient's stomach. If a lesion is found, it is very important to determine its location relative to the global view of the stomach. Our previous research showed that this could be addressed by reconstructing the whole stomach shape from chromoendoscopic images using a structure-from-motion (SfM) pipeline, in which indigo carmine (IC) blue dye-sprayed images were used to increase feature matches for SfM by enhancing the textures of the stomach surface.
IEEE J Transl Eng Health Med
October 2019
Gastric endoscopy is a common clinical practice that enables medical doctors to diagnose various lesions inside the stomach. To identify the location of a gastric lesion, such as early cancer or a peptic ulcer, within the stomach, this work addresses the reconstruction of a color-textured 3D model of the whole stomach from a standard monocular endoscope video and the localization of any selected video frame on the 3D model. We examine how to enable structure-from-motion (SfM) to reconstruct the whole shape of a stomach from endoscope images, which is a challenging task due to the texture-less nature of the stomach surface.
Annu Int Conf IEEE Eng Med Biol Soc
July 2019
Inter-beat interval (IBI) and heart rate variability (HRV) are important cardiac parameters that indicate the physiological and emotional state of a person. In this paper, we present a framework for accurate IBI and HRV estimation from a facial video based on the reliability of extracted blood volume pulse (BVP) signals. Our framework first extracts candidate BVP signals from multiple randomly sampled face patches.
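To make the two cardiac parameters concrete, the sketch below derives IBIs from pulse peak times and summarizes HRV with two widely used time-domain statistics, SDNN and RMSSD. The choice of these statistics and the function names are assumptions for illustration; the abstract does not say which HRV measures the framework reports, and peak detection on the BVP signal is taken as already done.

```python
import numpy as np

def ibi_from_peaks(peak_times_s):
    """Inter-beat intervals in milliseconds from pulse peak times in seconds."""
    return np.diff(np.asarray(peak_times_s, dtype=float)) * 1000.0

def hrv_metrics(ibi_ms):
    """Mean heart rate plus two standard time-domain HRV statistics."""
    ibi = np.asarray(ibi_ms, dtype=float)
    return {
        "mean_hr_bpm": 60000.0 / ibi.mean(),                 # beats per minute
        "sdnn_ms": float(np.std(ibi, ddof=1)),               # std. dev. of IBIs
        "rmssd_ms": float(np.sqrt(np.mean(np.diff(ibi) ** 2))),  # successive differences
    }

# A perfectly regular pulse at one beat per second.
ibi = ibi_from_peaks([0.0, 1.0, 2.0, 3.0])
metrics = hrv_metrics(ibi)
```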
Annu Int Conf IEEE Eng Med Biol Soc
July 2019
Gastric endoscopy is a common clinical practice that enables medical doctors to examine the inside of the stomach. To identify the location of a gastric lesion, such as early gastric cancer, within the stomach, this work addresses the reconstruction of the 3D shape of the whole stomach, with color texture information, from a standard monocular endoscope video. Previous works have tried to reconstruct the 3D structures of various organs from endoscope images.
IEEE Trans Pattern Anal Mach Intell
April 2021
We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph with respect to a large indoor 3D map. The contributions of this work are three-fold. First, we develop a new large-scale visual localization method targeted for indoor spaces.
IEEE Trans Pattern Anal Mach Intell
March 2021
Accurate visual localization is a key technology for autonomous navigation. 3D structure-based methods employ 3D models of the scene to estimate the full 6 degree-of-freedom (DOF) pose of a camera very accurately. However, constructing (and extending) large-scale 3D models is still a significant challenge.
Annu Int Conf IEEE Eng Med Biol Soc
July 2018
In this paper, we propose a novel heart rate (HR) estimation method using simultaneously recorded RGB and near-infrared (NIR) face videos. The key idea of our method is to automatically select suitable face patches for HR estimation in both spatial and spectral domains. The spatial and spectral face patch selection enables us to robustly estimate HR under various situations, including scenes under which existing RGB camera-based methods fail to accurately estimate HR.
We propose a coupled convolution layer comprising multiple parallel convolutions with mutually constrained filters. Inspired by the biological human vision mechanism, we constrain the convolution filters such that one set of filter weights is a geometrically rotated, mirrored, or negated version of the other. Our analysis suggests that the coupled convolution layer is more effective for lower layers, where feature maps preserve geometric properties.
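The filter constraints described above can be illustrated with a small NumPy sketch in which a single base filter generates its rotated, mirrored, and negated partners, so only the base weights would need to be learned. The names `coupled_filters` and `correlate2d_valid`, the 90° rotation angle, and the left-right mirror axis are assumptions; the abstract does not specify them.

```python
import numpy as np

def coupled_filters(base):
    """Derive constrained partner filters from one base filter."""
    base = np.asarray(base, dtype=float)
    return {
        "base": base,
        "rot90": np.rot90(base),     # rotated 90 degrees counterclockwise
        "mirror": np.fliplr(base),   # mirrored left-right
        "negative": -base,           # sign-flipped weights
    }

def correlate2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation, written out for clarity."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for y in range(oh):
        for x in range(ow):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

filters = coupled_filters([[1.0, 2.0], [3.0, 4.0]])
image = np.arange(16, dtype=float).reshape(4, 4)
responses = {name: correlate2d_valid(image, f) for name, f in filters.items()}
```

In this toy setup the four parallel responses share one set of parameters, which is the parameter-sharing effect that such constraints provide.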
Color image demosaicking for the Bayer color filter array is an essential image processing operation for acquiring high-quality color images. Recently, residual interpolation (RI)-based algorithms have demonstrated superior demosaicking performance over conventional color difference interpolation-based algorithms. In this paper, we propose adaptive residual interpolation (ARI) that improves existing RI-based algorithms by adaptively combining two RI-based algorithms and selecting a suitable iteration number at each pixel.
We address the problem of large-scale visual place recognition for situations where the scene undergoes a major change in appearance, for example, due to illumination (day/night), change of seasons, aging, or structural modifications over time such as buildings being built or destroyed. Such situations represent a major challenge for current large-scale place recognition methods. This work has the following three principal contributions.
IEEE Trans Image Process
March 2016
In this paper, we propose residual interpolation (RI) as an alternative to color difference interpolation, which is a widely accepted technique for color image demosaicking. Our proposed RI performs the interpolation in a residual domain, where the residuals are differences between observed and tentatively estimated pixel values. Our hypothesis for the RI is that if image interpolation is performed in a domain with a smaller Laplacian energy, its accuracy is improved.
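The residual-domain idea can be sketched on a 1D signal: form a tentative estimate of the sparsely observed channel, interpolate only the residuals at the observed positions, and add them back. In this sketch the tentative estimate is a least-squares linear map from a fully sampled guide signal, used here as a simple stand-in; it is an assumption and not the estimator used in the paper, and the function name `residual_interpolate` is hypothetical.

```python
import numpy as np

def residual_interpolate(guide, observed_idx, observed_vals):
    """Interpolate a sparsely sampled signal in the residual domain.

    guide: fully sampled signal correlated with the target (assumed available).
    observed_idx: increasing indices where the target was observed.
    observed_vals: target values at those indices.
    """
    guide = np.asarray(guide, dtype=float)
    idx = np.asarray(observed_idx)
    vals = np.asarray(observed_vals, dtype=float)

    # Tentative estimate: linear fit of the target against the guide.
    slope, intercept = np.polyfit(guide[idx], vals, 1)
    tentative = slope * guide + intercept

    # Residuals = observed minus tentatively estimated values.
    residual = vals - tentative[idx]

    # Interpolate the (smoother) residuals, then add the tentative estimate back.
    residual_full = np.interp(np.arange(guide.size), idx, residual)
    return tentative + residual_full

guide = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0])
idx = np.array([0, 2, 4, 6])
vals = np.array([1.0, 9.5, 33.0, 74.0])
estimate = residual_interpolate(guide, idx, vals)
```

Because the interpolated residual equals the true residual at every observed index, the result reproduces the observed samples there and differs from the tentative estimate only through the interpolated correction elsewhere.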
IEEE Trans Pattern Anal Mach Intell
November 2015
Repeated structures such as building facades, fences, or road markings often represent a significant challenge for place recognition. Establishing correspondences for repeated structures using multi-view geometry is notoriously hard. They also violate the feature independence assumed in the bag-of-visual-words representation, which often leads to over-counting of evidence and significant degradation of retrieval performance.
IEEE Trans Image Process
October 2015
Single-sensor imaging using the Bayer color filter array (CFA) and demosaicking is well established for current compact and low-cost color digital cameras. An extension from the CFA to a multispectral filter array (MSFA) enables us to acquire a multispectral image in one shot without increased size or cost. However, multispectral demosaicking for the MSFA has been a challenging problem because of very sparse sampling of each spectral band in the MSFA.
IEEE Trans Image Process
October 2014
Additive white Gaussian noise is widely assumed in many image processing algorithms. However, in the real world, the noise from actual cameras is better modeled as signal-dependent noise (SDN). In this paper, we focus on the SDN model and propose an algorithm to automatically estimate its parameters from a single noisy image.
Noise level is an important parameter in many image processing applications. For example, the performance of an image denoising algorithm can be severely degraded by poor noise level estimation. Most existing denoising algorithms simply assume that the noise level is known, which largely prevents their practical use.