IEEE Trans Med Imaging
July 2023
Federated Learning (FL) is a machine learning paradigm in which many local nodes collaboratively train a central model while keeping the training data decentralized. This is particularly relevant for clinical applications, since patient data usually may not be transferred out of medical facilities. Existing FL methods typically share model parameters or employ co-distillation to address the issue of unbalanced data distribution.
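As a rough illustration of what "sharing model parameters" can look like, the sketch below averages the parameters of several local models, weighting each node by its number of training samples (the common FedAvg-style rule). It is not the method proposed in the article; the function name `federated_average` and the dictionary-of-lists parameter layout are assumptions made for this example only.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def federated_average(local_params: Sequence[Params],
                      sample_counts: Sequence[int]) -> Params:
    """Average local parameters, weighting each node by its sample count."""
    if not local_params or len(local_params) != len(sample_counts):
        raise ValueError("need exactly one sample count per local model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name in local_params[0]:
        length = len(local_params[0][name])
        # Weighted mean of the i-th weight across all nodes.
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(local_params, sample_counts)) / total
            for i in range(length)
        ]
    return averaged


# Two nodes; the second holds three times as much data as the first.
node_a = {"w": [1.0, 2.0]}
node_b = {"w": [3.0, 6.0]}
central = federated_average([node_a, node_b], [1, 3])
```

Only the parameters and the sample counts leave each node here; the raw training data stays local, which is the property the abstract emphasizes.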
IEEE Trans Pattern Anal Mach Intell
April 2023
Weakly-supervised temporal action localization (W-TAL) aims to classify and localize all action instances in untrimmed videos under only video-level supervision. Without frame-level annotations, it is challenging for W-TAL methods to clearly distinguish actions and background, which severely degrades the action boundary localization and action proposal scoring. In this paper, we present an adaptive two-stream consensus network (A-TSCN) to address this problem.
The use of face masks is an important way to fight the COVID-19 pandemic. In this paper, we envision the Smart Mask, an IoT supported platform and ecosystem aiming to prevent and control the spreading of COVID-19 and other respiratory viruses. The integration of sensing, materials, AI, wireless, IoT, and software will help the gathering of health data and health-related event detection in real time from the user as well as from their environment.
IEEE Trans Neural Netw Learn Syst
November 2022
Modern convolutional neural network (CNN)-based object detectors focus on feature configuration during training but often ignore feature optimization during inference. In this article, we propose a new feature optimization approach to enhance features and suppress background noise in both the training and inference stages. We introduce a generic inference-aware feature filtering (IFF) module that can be easily combined with existing detectors, resulting in our iffDetector.
IEEE Trans Pattern Anal Mach Intell
November 2021
Charts are useful communication tools for the presentation of data in a visually appealing format that facilitates comprehension. There have been many studies dedicated to chart mining, which refers to the process of automatic detection, extraction and analysis of charts to reproduce the tabular data that was originally used to create them. By allowing access to data which might not be available in other formats, chart mining facilitates the creation of many downstream applications.
IEEE Trans Image Process
October 2018
Document image binarization classifies each pixel in an input document image as either foreground or background under the assumption that the document is pseudo binary in nature. However, noise introduced during acquisition or due to aging or handling of the document can make binarization a challenging task. This paper presents a novel game theory inspired binarization technique for degraded document images.
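To make the pixel-classification framing concrete, here is a minimal sketch of a classical baseline: a single global threshold chosen with Otsu's criterion (maximizing between-class variance). This is a swapped-in textbook technique, not the game theory inspired method the paper presents, and it assumes 8-bit grayscale input with dark ink on a light page. The function names are hypothetical.

```python
from typing import List


def otsu_threshold(pixels: List[int]) -> int:
    """Return the gray level t (0..255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(image: List[List[int]]) -> List[List[int]]:
    """Label each pixel 1 (foreground, dark) or 0 (background, light)."""
    t = otsu_threshold([p for row in image for p in row])
    return [[1 if p <= t else 0 for p in row] for row in image]


page = [[20, 30, 200],
        [210, 25, 220]]
mask = binarize(page)
```

A single global threshold like this tends to break down on exactly the degradations the abstract mentions (acquisition noise, aging, handling), which is the motivation for more elaborate techniques.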
IEEE Trans Image Process
September 2016
Blind image quality assessment (BIQA) research aims to develop a perceptual model to evaluate the quality of distorted images automatically and accurately without access to the non-distorted reference images. The state-of-the-art general purpose BIQA methods can be classified into two categories according to the types of features used. The first includes handcrafted features which rely on the statistical regularities of natural images.
This paper analyzes, compares, and contrasts technical challenges, methods, and the performance of text detection and recognition research in color imagery. It summarizes the fundamental problems and enumerates factors that should be considered when addressing these problems. Existing techniques are categorized as either stepwise or integrated and sub-problems are highlighted including text localization, verification, segmentation and recognition.
The goal of no-reference objective image quality assessment (NR-IQA) is to develop a computational model that can predict the human-perceived quality of distorted images accurately and automatically without any prior knowledge of reference images. Most existing NR-IQA approaches are distortion specific and are typically limited to one or two specific types of distortions. In most practical applications, however, information about the distortion type is not really available.
IEEE Trans Pattern Anal Mach Intell
November 2009
As one of the most pervasive methods of individual identification and document authentication, signatures present convincing evidence and provide an important form of indexing for effective document image processing and retrieval in a broad range of applications. However, detection and segmentation of free-form objects such as signatures from clustered background is currently an open document analysis problem. In this paper, we focus on two fundamental problems in signature-based document image retrieval.
IEEE Trans Pattern Anal Mach Intell
February 2009
Resolution of different types of loops in handwritten script presents a difficult task and is an important step in many classic word recognition systems, writer modeling, and signature verification. When processing a handwritten script, a great deal of ambiguity occurs when strokes overlap, merge, or intersect. This paper presents a novel loop modeling and contour-based handwriting analysis that improves loop investigation.
IEEE Trans Pattern Anal Mach Intell
August 2008
Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method.
IEEE Trans Pattern Anal Mach Intell
April 2008
Compared to typical scanners, handheld cameras offer convenient, flexible, portable, and non-contact image capture, which enables many new applications and breathes new life into existing ones. However, camera-captured documents may suffer from distortions caused by non-planar document shape and perspective projection, which lead to failure of current OCR technologies. We present a geometric rectification framework for restoring the frontal-flat view of a document from a single camera-captured image.
Symbolic document image compression relies on the detection of similar patterns in a document image and construction of a prototype library. Compression is achieved by referencing multiple pattern instances ("components") through a single representative prototype. To provide a lossless compression, however, the residual difference between each component and its assigned prototype must be coded.
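The prototype-plus-residual idea can be sketched in a few lines. The example below assumes binary bitmaps of identical size and uses a pixelwise XOR as the residual, so that the prototype index together with the residual reconstructs the component exactly. The function names and the nearest-prototype assignment rule are assumptions for illustration; the paper's actual residual coding scheme is not described in this excerpt.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # rows of 0/1 pixels


def xor_bitmaps(a: Bitmap, b: Bitmap) -> Bitmap:
    """Pixelwise XOR of two equally sized binary bitmaps."""
    return [[x ^ y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def encode(component: Bitmap, prototypes: List[Bitmap]) -> Tuple[int, Bitmap]:
    """Assign the prototype with the fewest differing pixels; return (index, residual)."""
    def mismatch(i: int) -> int:
        return sum(sum(row) for row in xor_bitmaps(component, prototypes[i]))

    index = min(range(len(prototypes)), key=mismatch)
    return index, xor_bitmaps(component, prototypes[index])


def decode(index: int, residual: Bitmap, prototypes: List[Bitmap]) -> Bitmap:
    """Recover the original component exactly from its prototype and residual."""
    return xor_bitmaps(residual, prototypes[index])


library = [[[1, 1], [1, 0]],
           [[0, 0], [0, 1]]]
component = [[1, 1], [0, 0]]
index, residual = encode(component, library)
restored = decode(index, residual, library)
```

When a component closely resembles its prototype, the residual is mostly zeros and therefore cheap to code, while still guaranteeing lossless reconstruction.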
Text that appears in a scene or is graphically added to video can provide an important supplemental source of index information as well as clues for decoding the video's structure and for classification. In this work, we present algorithms for detecting and tracking text in digital video. Our system implements a scale-space feature extractor that feeds an artificial neural processor to detect text blocks.
IEEE Trans Pattern Anal Mach Intell
April 2006
In previous work on point matching, a set of points is often treated as an instance of a joint distribution to exploit global relationships in the point set. For nonrigid shapes, however, the local relationship among neighboring points is stronger and more stable than the global one. In this paper, we introduce the notion of a neighborhood structure for the general point matching problem.
IEEE Trans Pattern Anal Mach Intell
May 2005
The detection of groups of parallel lines is important in applications such as form processing and text (handwriting) extraction from rule lined paper. These tasks can be very challenging in degraded documents where the lines are severely broken. In this paper, we propose a novel model-based method which incorporates high-level context to detect these lines.
IEEE Trans Pattern Anal Mach Intell
March 2004
In this paper, we address the problem of the identification of text in noisy document images. We are especially focused on segmenting and distinguishing between handwriting and machine printed text because: 1) handwriting in a document often indicates corrections, additions, or other supplemental information that should be treated differently from the main content and 2) the segmentation and recognition techniques required for machine printed and handwritten text are significantly different. A novel aspect of our approach is that we treat noise as a separate class and model noise based on selected features.