Adaptive dewarping of severely warped camera-captured document images based on document map generation.

Int J Doc Anal Recognit

Department of Sciences, Amrita School of Physical Sciences, Mysuru, Amrita Vishwa Vidyapeetham, Mysuru, India.

Published: January 2023

Automated dewarping of camera-captured handwritten documents is a challenging research problem in Computer Vision and Pattern Recognition. Most available systems assume the shape of the camera-captured image boundaries to be anywhere between trapezoidal and octahedral, with linear distortion in the areas between the boundaries. The majority of state-of-the-art applications successfully dewarp simple-to-medium geometrical distortions, but only with partial selection of control points by a user. The proposed work implements a fully automated technique for control point detection that handles simple-to-complex geometrical distortions in camera-captured document images. The input image is subjected to preprocessing, corner point detection, document map generation, and rendering of the dewarped document image. The proposed algorithm has been tested on five camera-captured document datasets (one internal and four external, publicly available) comprising 958 images. Both quantitative and qualitative evaluations have been performed to test the efficacy of the proposed system. On the quantitative front, the system achieves Intersection over Union (IoU) scores of 0.92, 0.88, and 0.80 for document map generation on the low-, medium-, and high-complexity datasets, respectively. Additionally, the accuracy of the text recognized by a market-leading OCR engine is compared on document images before and after the proposed enhancement. Finally, the qualitative analysis visually establishes the system's reliability by demonstrating improved readability even for severely distorted image samples.
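The IoU score reported for document map generation can be illustrated with a minimal sketch. It assumes the predicted document map and the ground-truth map are binary masks of equal size, stored as lists of rows. The function name `mask_iou` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def mask_iou(predicted, reference):
    """Intersection over Union of two equally sized binary masks.

    Each mask is a list of rows; any truthy cell counts as "document".
    Hypothetical helper for illustration, not the authors' implementation.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")

    intersection = 0
    union = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("mask rows must have the same length")
        for p, r in zip(pred_row, ref_row):
            p, r = bool(p), bool(r)
            intersection += int(p and r)  # cell marked in both masks
            union += int(p or r)          # cell marked in at least one mask

    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


# Toy 3x3 example: the prediction misses the bottom row of the document.
predicted_map = [[0, 1, 1],
                 [0, 1, 1],
                 [0, 0, 0]]
reference_map = [[0, 1, 1],
                 [0, 1, 1],
                 [0, 1, 1]]
score = mask_iou(predicted_map, reference_map)  # 4 shared cells / 6 in the union
```

A score of 1.0 means the two maps coincide exactly, and a score near 0 means they barely overlap, so the reported drop from 0.92 to 0.80 reflects less accurate maps on the more severely distorted datasets.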


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9838515
DOI: http://dx.doi.org/10.1007/s10032-022-00425-4

Publication Analysis

Top Keywords

camera-captured document: 12
document images: 12
document map: 12
map generation: 12
document: 8
geometrical distortions: 8
point detection: 8
camera-captured: 5
adaptive dewarping: 4
dewarping severely: 4

Similar Publications

Layout analysis is the main component of a typical Document Image Analysis (DIA) system and plays an important role in pre-processing. However, document images in the Pashto language have not been explored so far. This research examines, for the first time, Pashto text along with graphics and proposes a deep learning-based classifier that can detect Pashto text and graphics in each document.


Automatic interpretation of chest X-ray (CXR) photos taken by smartphones at the same performance level as with digital CXRs is challenging because of the projective transformation caused by the non-ideal camera position. Existing rectification methods for other camera-captured photos (document photos, license plate photos, etc.) cannot precisely rectify CXR photos because their projective transformation is of a specific type.



Since distortions in camera-captured document images significantly affect the accuracy of optical character recognition (OCR), distortion removal plays a critical role for document digitalization systems using a camera for image capturing. This paper proposes a novel framework that performs three-dimensional (3D) reconstruction and rectification of camera-captured document images. While most existing methods rely on additional calibrated hardware or multiple images to recover the 3D shape of a document page, or make a simple but not always valid assumption on the corresponding 3D shape, our framework is more flexible and practical since it only requires a single input image and is able to handle a general locally smooth document surface.


Camera-based text processing has attracted considerable attention and numerous methods have been proposed. However, most of these methods have focused on the scene text detection problem and relatively little work has been performed on camera-captured document images. In this paper, we present a text-line detection algorithm for camera-captured document images, which is an essential step toward document understanding.

