Text-line extraction in unconstrained handwritten documents remains a challenging problem because of nonuniform character scale, spatially varying text orientation, and interference between text lines. To address these problems, we propose a new cost function that accounts for both the interactions between text lines and the curvilinearity of each text line. Specifically, we introduce normalized measures for both, based on an estimated line spacing. We also present an optimization method that exploits the properties of our cost function. Experimental results on a database of 853 handwritten Chinese document images show that our method achieves a detection rate of 99.52% and an error rate of 0.32%, outperforming conventional methods.
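
The abstract does not give the form of the cost function, so the following is only an illustrative sketch of the general idea: a cost with a per-line curvilinearity term and a pairwise interaction term, both normalized by an estimated line spacing. The function names, the least-squares line fit, the specific penalty shapes, and the `weight` parameter are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean


def fit_line(points):
    """Least-squares fit y = a*x + b through (x, y) component centers."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # all centers share one x; fall back to a flat line
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in points) / sxx
    return a, my - a * mx


def curvilinearity(points, spacing):
    """Mean squared residual from the fitted line, in units of line spacing.

    Assumed stand-in for the paper's curvilinearity measure.
    """
    a, b = fit_line(points)
    return mean(((y - (a * x + b)) / spacing) ** 2 for x, y in points)


def interaction(line_a, line_b, spacing):
    """Penalty when two lines' mean heights are closer than one spacing.

    Assumed stand-in for the paper's line-interaction measure.
    """
    gap = abs(mean(y for _, y in line_a) - mean(y for _, y in line_b)) / spacing
    return max(0.0, 1.0 - gap) ** 2


def cost(lines, spacing, weight=1.0):
    """Total cost of a candidate grouping of components into text lines."""
    fit = sum(curvilinearity(line, spacing) for line in lines)
    inter = sum(
        interaction(lines[i], lines[j], spacing)
        for i in range(len(lines))
        for j in range(i + 1, len(lines))
    )
    return fit + weight * inter


# Two candidate groupings of the same six component centers.
good = [[(0, 0), (10, 0), (20, 0)], [(0, 10), (10, 10), (20, 10)]]
bad = [[(0, 0), (10, 10), (20, 0)], [(0, 10), (10, 0), (20, 10)]]
```

Under these assumptions, `cost(good, spacing=10)` is lower than `cost(bad, spacing=10)`, because the second grouping zigzags between two rows and its two "lines" sit closer together than one line spacing. Dividing by the spacing is what makes the two terms comparable across documents written at different scales.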

Source
http://dx.doi.org/10.1109/TIP.2011.2166972


