Automated dewarping of camera-captured handwritten documents is a challenging research problem in Computer Vision and Pattern Recognition. Most available systems assume the shape of the camera-captured image boundaries to be anywhere between trapezoidal and octahedral, with linear distortion in areas between the boundaries for dewarping. The majority of the state-of-the-art applications successfully dewarp the simple-to-medium range geometrical distortions with partial selection of control points by a user. The proposed work implements a fully automated technique for control point detection from simple-to-complex geometrical distortions in camera-captured document images. The input image is subject to preprocessing, corner point detection, document map generation, and rendering of the de-warped document image. The proposed algorithm has been tested on five different camera-captured document datasets (one internal and four external publicly available) consisting of 958 images. Both quantitative and qualitative evaluations have been performed to test the efficacy of the proposed system. On the quantitative front, an Intersection Over Union (IoU) score of 0.92, 0.88, and 0.80 for document map generation for low-, medium-, and high-complexity datasets, respectively. Additionally, accuracies of the recognized texts, obtained from a market leading OCR engine, are utilized for quantitative comparative analysis on document images before and after the proposed enhancement. Finally, the qualitative analysis visually establishes the system's reliability by demonstrating improved readability even for severely distorted image samples.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9838515 | PMC |
http://dx.doi.org/10.1007/s10032-022-00425-4 | DOI Listing |
PeerJ Comput Sci
July 2024
Department of AI and Software, Gachon University, Seongnam-si, Republic of South Korea.
Layout analysis is the main component of a typical Document Image Analysis (DIA) system and plays an important role in pre-processing. However, regarding the Pashto language, the document images have not been explored so far. This research, for the first time, examines Pashto text along with graphics and proposes a deep learning-based classifier that can detect Pashto text and graphics per document.
View Article and Find Full Text PDFComput Biol Med
September 2023
Macao Polytechnic University, Macao Special Administrative Region of China. Electronic address:
Automatic interpretation of chest X-ray (CXR) photos taken by smartphones at the same performance level as with digital CXRs is challenging, due to the projective transformation caused by the non-ideal camera position. Existing rectification methods for other camera-captured photos (document photos, license plate photos, etc.) cannot precisely rectify the projective transformation of CXR photos, due to its specific projective transformation type.
View Article and Find Full Text PDFInt J Doc Anal Recognit
January 2023
Mysuru, India Department of Sciences, Amrita School of Physical Sciences, Mysuru, Amrita Vishwa Vidyapeetham.
Automated dewarping of camera-captured handwritten documents is a challenging research problem in Computer Vision and Pattern Recognition. Most available systems assume the shape of the camera-captured image boundaries to be anywhere between trapezoidal and octahedral, with linear distortion in areas between the boundaries for dewarping. The majority of the state-of-the-art applications successfully dewarp the simple-to-medium range geometrical distortions with partial selection of control points by a user.
View Article and Find Full Text PDFJ Opt Soc Am A Opt Image Sci Vis
October 2016
Since distortions in camera-captured document images significantly affect the accuracy of optical character recognition (OCR), distortion removal plays a critical role for document digitalization systems using a camera for image capturing. This paper proposes a novel framework that performs three-dimensional (3D) reconstruction and rectification of camera-captured document images. While most existing methods rely on additional calibrated hardware or multiple images to recover the 3D shape of a document page, or make a simple but not always valid assumption on the corresponding 3D shape, our framework is more flexible and practical since it only requires a single input image and is able to handle a general locally smooth document surface.
View Article and Find Full Text PDFIEEE Trans Image Process
November 2016
Camera-based text processing has attracted considerable attention and numerous methods have been proposed. However, most of these methods have focused on the scene text detection problem and relatively little work has been performed on camera-captured document images. In this paper, we present a text-line detection algorithm for camera-captured document images, which is an essential step toward document understanding.
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!