Publications by authors named "Yong Man Ro"

Visual Speech Recognition (VSR) aims to transcribe speech into text from lip movements alone. Because it relies on visual information to model speech, its performance is inherently sensitive to personal lip appearance and movement, which degrades VSR models when they are applied to unseen speakers. In this paper, to remedy this performance degradation on unseen speakers, we propose prompt tuning methods of Deep Neural Networks (DNNs) for speaker-adaptive VSR.

When humans hear the sound of an object, they recall associated visual information and integrate the sound with recalled visual modality to detect the object. In this article, we present a novel sound-based object detector that mimics this process. We design a visual modality recalling (VMR) memory to recall information of a visual modality based on an audio modal input (i.

Introduction: Infectious keratitis is a vision-threatening disease. Bacterial and fungal keratitis are often confused in the early stages, so correct diagnosis and optimized treatment for the causative organism are crucial. Antibacterial and antifungal medications are completely different, and the prognosis for fungal keratitis is much worse.

Monocular 3D object detection has drawn increasing attention in various human-related applications, such as autonomous vehicles, due to its cost-effectiveness. However, a monocular image alone inherently contains insufficient information to infer 3D information. In this paper, we propose a new monocular 3D object detector that can recall the stereoscopic visual information about an object, given a left-view monocular image.

Recent works have demonstrated that deep neural networks (DNNs) are highly vulnerable to adversarial attacks. To defend against adversarial attacks, many defense strategies have been proposed, among which adversarial training (AT) has been demonstrated to be the most effective strategy. However, it has been known that AT sometimes hurts natural accuracy.

Person detection has attracted great attention in the computer vision area and is an imperative element in human-centric computer vision. Although the predictive performances of person detection networks have been improved dramatically, they are vulnerable to adversarial patch attacks. Changing the pixels in a restricted region can easily fool the person detection network in safety-critical applications such as autonomous driving and security systems.

Along with the outstanding performance of deep neural networks (DNNs), considerable research effort has been devoted to finding ways to understand the decisions of DNN structures. In the computer vision domain, visualizing the attribution map is one of the most intuitive and understandable ways to achieve human-level interpretation. Among such methods, perturbation-based visualization can explain the "black box" property of the given network by optimizing perturbation masks that alter the network prediction of the target class the most.

Abnormal event detection is an important task in video surveillance systems. In this paper, we propose novel bidirectional multi-scale aggregation networks (BMAN) for abnormal event detection. The proposed BMAN learns spatiotemporal patterns of normal events to detect deviations from the learned normal patterns as abnormalities.

Purpose: Transvaginal ultrasound imaging provides useful information for diagnosing endometrial pathologies and reproductive health. Endometrium segmentation in transvaginal ultrasound (TVUS) images is very challenging due to ambiguous boundaries and heterogeneous textures. In this study, we developed a new segmentation framework which provides robust segmentation against ambiguous boundaries and heterogeneous textures of TVUS images.

Recently, deep learning technology has achieved various successes in medical image analysis studies including computer-aided diagnosis (CADx). However, current CADx approaches based on deep learning have a limitation in interpreting diagnostic decisions. The limited interpretability is a major challenge for practical use of current deep learning approaches.

Viewing safety is one of the main issues in viewing virtual reality (VR) content. In particular, VR sickness can occur when watching immersive VR content. To address viewing safety for VR content, objective assessment of VR sickness is of great importance.

In this paper, we propose a new ultrafast layer-based computer-generated hologram (CGH) calculation that exploits the sparsity of the hologram fringe pattern in a 3-D object layer. Specifically, we devise a sparse template holographic fringe pattern. The holographic fringe pattern on a depth layer can be rapidly calculated by adding the sparse template holographic fringe patterns at each object point position.
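
The accumulation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `make_template` and `layer_fringe`, the cosine Fresnel zone pattern used as the template, and the numeric parameters are all assumptions made for illustration, and the paper's specific sparsity construction is not reproduced because the abstract does not describe it.

```python
import math


def make_template(size, wavelength, pixel_pitch, depth):
    """Build a small real-valued fringe template for one object point.

    Assumption: a cosine Fresnel zone pattern centred in a size x size
    patch stands in for the paper's sparse template.
    """
    half = size // 2
    template = []
    for row in range(size):
        line = []
        for col in range(size):
            dx = (col - half) * pixel_pitch
            dy = (row - half) * pixel_pitch
            phase = math.pi * (dx * dx + dy * dy) / (wavelength * depth)
            line.append(math.cos(phase))
        template.append(line)
    return template


def layer_fringe(points, template, height, width):
    """Add the template at every object point of one depth layer.

    points is a list of (row, col, amplitude) tuples; parts of the
    template that fall outside the hologram are clipped.
    """
    size = len(template)
    half = size // 2
    hologram = [[0.0] * width for _ in range(height)]
    for row, col, amplitude in points:
        for t_row in range(size):
            y = row + t_row - half
            if not 0 <= y < height:
                continue
            for t_col in range(size):
                x = col + t_col - half
                if 0 <= x < width:
                    hologram[y][x] += amplitude * template[t_row][t_col]
    return hologram


# Hypothetical parameters: 532 nm light, 8 micrometre pixels, 10 cm depth.
template = make_template(5, 532e-9, 8e-6, 0.1)
hologram = layer_fringe([(4, 4, 1.0), (2, 6, 0.5)], template, 9, 9)
```

Because the template is computed once per depth layer and only shifted and added per object point, the cost in this sketch grows with the number of points times the template size rather than with the full hologram size.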

In this paper, we propose a novel medical image segmentation method using an iterative deep learning framework. We combine an iterative learning approach with an encoder-decoder network to improve segmentation results, which enables precise localization of the regions of interest (ROIs), including complex shapes or detailed textures of medical images, in an iterative manner. The proposed iterative deep convolutional encoder-decoder network consists of two main paths: a convolutional encoder path and a convolutional decoder path with iterative learning.

Characterization of masses in computer-aided detection systems for digital breast tomosynthesis (DBT) is an important step to reduce false positive (FP) rates. To effectively differentiate masses from FPs in DBT, discriminative mass feature representation is required. In this paper, we propose a new latent feature representation boosted by depth directional long-term recurrent learning for characterizing malignant masses.

In computer-generated hologram (CGH) calculations, a diffraction pattern needs to be calculated from all points of a 3-D object, which requires a heavy computational cost. In this paper, we propose a novel fast computer-generated hologram calculation method using sparse fast Fourier transform. The proposed method consists of two steps.

Stereoscopic images can have asymmetric distortions caused by image processing during their capture, synthesis, and compression. For 3D perception on stereoscopic displays, the visibility threshold of asymmetric distortions in the left and right images, i.e., the level that is tolerable to the human visual system, is important. In this paper, we investigate the effect of binocular disparity on the visibility threshold of asymmetric noise in stereoscopic images via subjective assessments.

Article Synopsis
  • This study aims to create a computer-aided detection system that enhances the accuracy of finding masses in 3D digital breast tomosynthesis (DBT) by using data from both 3D DBT images and simulated 2D projections.
  • The method involves generating a clearer simulated projection by measuring blurriness in the DBT volume and applying a detection algorithm to find mass candidates, which are then analyzed using a Bayesian network to distinguish true masses from false positives.
  • The researchers tested their system with a dataset of 320 DBT volumes, assessing mass detection accuracy through various image quality metrics and a free-response receiver operating characteristic (FROC) analysis.

In digital breast tomosynthesis (DBT), the image characteristics of projection views and the reconstructed volume differ, and each has advantages for detecting breast masses; e.g., the reconstructed volume mitigates tissue overlap, while projection views have fewer reconstruction blur artifacts.

In this paper, a new method is developed for extracting so-called region-based stellate features to correctly differentiate spiculated malignant masses from normal tissues on mammograms. In the proposed method, a given region of interest (ROI) for feature extraction is divided into three individual subregions, namely core, inner, and outer parts. The proposed region-based stellate features are then extracted to encode the different and complementary stellate pattern information by computing the statistical characteristics for each of the three different subregions.
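
The three-subregion layout described in this abstract can be sketched as follows. This is an illustrative sketch only: the function name `subregion_statistics`, the concentric radial split, the `core_frac` and `inner_frac` boundaries, and the use of intensity mean and standard deviation are assumptions, since the abstract does not specify how the subregions are delimited or which statistics make up the stellate features.

```python
import math
import statistics


def subregion_statistics(roi, core_frac=1 / 3, inner_frac=2 / 3):
    """Split a 2-D ROI into core, inner, and outer parts and summarise each.

    Assumption: subregions are concentric, defined by the distance to the
    ROI centre normalised by the centre-to-corner distance.
    Returns {name: (mean, population standard deviation)}.
    """
    height, width = len(roi), len(roi[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    max_r = math.hypot(cy, cx) or 1.0  # avoid dividing by zero for a 1x1 ROI
    groups = {"core": [], "inner": [], "outer": []}
    for y, row in enumerate(roi):
        for x, value in enumerate(row):
            r = math.hypot(y - cy, x - cx) / max_r
            if r <= core_frac:
                groups["core"].append(value)
            elif r <= inner_frac:
                groups["inner"].append(value)
            else:
                groups["outer"].append(value)
    features = {}
    for name, values in groups.items():
        if values:
            features[name] = (statistics.fmean(values), statistics.pstdev(values))
        else:
            features[name] = (0.0, 0.0)  # empty subregion in a very small ROI
    return features


# Toy ROI: bright centre, mid-intensity ring, dark border.
roi = [
    [1, 1, 1, 1, 1],
    [1, 5, 5, 5, 1],
    [1, 5, 9, 5, 1],
    [1, 5, 5, 5, 1],
    [1, 1, 1, 1, 1],
]
features = subregion_statistics(roi)
```

Concatenating the per-subregion statistics yields one feature vector per ROI, which is the general shape of input a mass-versus-normal-tissue classifier would consume.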

In digital breast tomosynthesis, the three-dimensional (3D) reconstructed volumes only provide quasi-3D structure information with limited resolution along the depth direction, due to insufficient sampling in the depth direction and the limited angular range. This limitation could seriously hamper conventional 3D image analysis techniques for detecting masses because the limited number of projection views causes blurring in the out-of-focus planes. In this paper, we propose a novel mass detection approach using slice conspicuity in the 3D reconstructed digital breast volumes to overcome the above limitation.

We propose a novel computer-aided detection (CAD) framework of breast masses in mammography. To increase detection sensitivity for various types of mammographic masses, we propose the combined use of different detection algorithms. In particular, we develop a region-of-interest combination mechanism that integrates detection information gained from unsupervised and supervised detection algorithms.

Stereoscopic displays provide viewers with a truly fascinating viewing experience. However, current stereoscopic displays suffer from crosstalk that is detrimental to image quality, depth quality, and visual comfort. In order to reduce the perceived crosstalk in stereoscopic displays, this paper proposes a crosstalk reduction method that combines disparity adjustment and crosstalk cancellation.

Background: Breast cancer is the leading cancer in both incidence and mortality among women. For this reason, much research effort has been devoted to developing Computer-Aided Detection (CAD) systems for early detection of breast cancers on mammograms. In this paper, we propose a novel dictionary configuration underpinning sparse representation based classification (SRC).

In this paper, a new 3D ultrasound (US) denoising technique that adopts sparse representation is proposed for effective noise reduction in 3D US volumes. The purpose of the proposed method is to reduce image noise while preserving 3D object edges, hence improving human interpretation for clinical diagnosis and the 3D segmentation accuracy for further automatic malignancy detection. For denoising 3D US volumes, sparse representation was employed, which has shown excellent performance in reducing Gaussian noise.

One of the drawbacks of current Computer-aided Detection (CADe) systems is a high number of false-positive (FP) detections, especially for detecting mass abnormalities. In a typical CADe system, classifier design is one of the key steps for determining FP detection rates. This paper presents an effective classifier ensemble system for tackling the FP reduction problem in CADe.
