In offline handwritten text recognition, numerous normalization algorithms have been developed over the years as preprocessing steps applied to scanned images of handwritten text before automatic recognition. These algorithms have proven effective at improving the overall performance of recognition architectures. However, many of them rely heavily on heuristic strategies that are not integrated with the recognition architecture itself. This paper introduces the use of a trainable Pix2Pix model, a specific type of conditional generative adversarial network, to normalize handwritten text images. The model can be integrated as the initial stage of any deep learning architecture designed for handwritten recognition tasks, which makes it possible to train the normalization and recognition components as a unified whole while still maintaining some interpretability of each module. The proposed normalization approach learns from a blend of heuristic transformations applied to text images, aiming to mitigate the impact of intra-personal handwriting variability among different writers. As a result, it achieves slope and slant normalization, alongside other conventional preprocessing objectives such as normalizing the size of text ascenders and descenders. We demonstrate that the proposed architecture replicates, and in certain cases surpasses, the results of a widely used heuristic algorithm, both across two metrics and when integrated as the first step of a deep recognition architecture.
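To make the idea of a heuristic slant normalization concrete, the following is a minimal sketch of slant correction by horizontal shear. It is not the heuristic algorithm used in the paper, which the abstract does not specify. The function name `deslant` and its parameters are hypothetical, and the slant angle is assumed to be known rather than estimated from the image.

```python
import math


def deslant(image, slant_degrees):
    """Shear a binary image (list of rows) so slanted strokes become upright.

    Illustrative assumption: `slant_degrees` is the stroke angle measured
    from the vertical, positive when the top of a stroke leans to the right.
    A real pipeline would estimate this angle from the image.
    """
    height = len(image)
    width = len(image[0])
    tangent = math.tan(math.radians(slant_degrees))

    # Horizontal displacement of each row relative to the bottom row.
    shifts = [round((height - 1 - y) * tangent) for y in range(height)]
    max_shift, min_shift = max(shifts), min(shifts)

    # Widen the canvas so that no pixel is pushed outside the image.
    out_width = width + (max_shift - min_shift)
    output = [[0] * out_width for _ in range(height)]

    for y, row in enumerate(image):
        # Move the row left by its shift, re-anchored to stay non-negative.
        offset = max_shift - shifts[y]
        for x, value in enumerate(row):
            output[y][x + offset] = value
    return output


# A three-pixel stroke leaning 45 degrees to the right.
slanted = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]
upright = deslant(slanted, 45)
```

After the shear, the three stroke pixels share a single column. A trainable normalizer such as the one described in the abstract would be expected to learn transformations of this kind from examples rather than apply a fixed rule.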
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11207351 | PMC |
| http://dx.doi.org/10.3390/s24123892 | DOI Listing |
Sci Rep
January 2025
Applied Electromagnetic Research Center, National Institute of Information and Communications Technology, Nukui-Kitamachi, Koganei, Tokyo, 184-8795, Japan.
As the demand for computational performance in artificial intelligence (AI) continues to increase, diffractive deep neural networks (DNNs), which can perform AI computing at the speed of light through repeated optical modulation with diffractive optical elements (DOEs), are attracting attention. DOEs vary in fabrication methods and materials; among them, volume holographic optical elements (vHOEs) have unique features such as high selectivity and multiplex recordability for wavelength and angle. However, when vHOEs are used for DNNs, they suffer from unknown wavefront aberrations compounded by multiple fabrication errors.
PeerJ Comput Sci
August 2024
School of Computer and Communication, Lanzhou University of Technology, Lanzhou, Gansu, China.
The generator, which combines a convolutional neural network (CNN) and a Transformer as its core modules, serves as the primary model for the handwriting font generation network and demonstrates effective performance. However, feature extraction remains insufficient for the overall structure of the font, the thickness of strokes, and the curvature of strokes, resulting in subpar detail in the generated fonts. To solve these problems, we propose a method for constructing a handwritten font generation model based on Pyramid Squeeze Attention, called PSA-HWT.
Sci Data
October 2024
School of Artificial Intelligence, Chongqing University of Technology, Chongqing, 400054, China.
The ancient Yi script has been used for over 8000 years and ranks alongside the Oracle, Sumerian, Egyptian, Mayan, and Harappan scripts as one of the six ancient scripts of the world. In this article, we collected 2922 handwritten single word samples of commonly used ancient Yi characters. Each character was written by 310 people, for a total of 427,939 valid characters.
BMC Health Serv Res
October 2024
School of Medicine and Health, Department of Dermatology and Allergy, Technical University of Munich, Biedersteiner Str. 29, Munich, 80802, Germany.
Background: Current digital medical databases record systematically coded diagnoses, but many legacy databases are full of hand-written, free text diagnoses, which can only be meaningfully analysed after mapping them to a coding system. While diagnoses can be extracted from full medical notes with good accuracy, no algorithm using only an unstructured free text diagnosis with no additional data has been published to date.
Objectives/methods: Therefore, we sought to create an algorithm which maps hand-written German diagnoses from our clinical photography database to ICD-10 diagnosis codes, validate its output manually by dermatologists and analyse diagnosis counts over time as a proof-of-concept of its application.
Sensors (Basel)
September 2024
Applied Interactive Multimedia Lab, Engineering Division, New York University Abu Dhabi, Abu Dhabi P.O. Box 129188, United Arab Emirates.
Handwriting style is an important aspect affecting the quality of handwriting. Adhering to one style is crucial for languages that follow cursive orthography and possess multiple handwriting styles, such as Arabic. The majority of available studies analyze Arabic handwriting style from static documents, focusing only on pure styles.