Before the 19th century, all communication and official records relied on handwritten documents, cherished as valuable artefacts by different ethnic groups. While significant efforts have been made to automate the transcription of major languages like English, French, Arabic, and Chinese, there has been less research on regional and minor languages, despite their importance from geographical and historical perspectives. This research focuses on detecting and recognizing Pashto handwritten characters and ligatures, which is essential for preserving this regional cursive language in Pakistan and its status as the national language of Afghanistan. Deep learning techniques were employed to detect and recognize Pashto characters and ligatures, utilizing a newly developed dataset specific to Pashto. A further enhancement was done on the dataset by implementing data augmentation, i.e., scaling and rotation on Pashto handwritten characters and ligatures, which gave us many variations of a single trajectory. Different morphological operations for minimizing gaps in the trajectories were also performed. The median filter was used for the removal of different noises. This dataset will be combined with the existing PHWD-V2 dataset. Various deep-learning techniques were evaluated, including VGG19, MobileNetV2, MobileNetV3, and a customized CNN. The customized CNN demonstrated the highest accuracy and minimal loss, achieving a training accuracy of 93.98%, validation accuracy of 92.08% and testing accuracy of 92.99%.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346912PMC
http://dx.doi.org/10.3390/s23136060DOI Listing

Publication Analysis

Top Keywords

pashto handwritten
12
characters ligatures
12
deep learning
8
handwritten characters
8
customized cnn
8
pashto
5
handwritten invariant
4
invariant character
4
character trajectory
4
trajectory prediction
4

Similar Publications

This article introduces a recognition system for handwritten text in the Pashto language, representing the first attempt to establish a baseline system using the Pashto Handwritten Text Imagebase (PHTI) dataset. Initially, the PHTI dataset underwent pre-processed to eliminate unwanted characters, subsequently, the dataset was divided into training 70%, validation 15%, and test sets 15%. The proposed recognition system is based on multi-dimensional long short-term memory (MD-LSTM) networks.

View Article and Find Full Text PDF

Before the 19th century, all communication and official records relied on handwritten documents, cherished as valuable artefacts by different ethnic groups. While significant efforts have been made to automate the transcription of major languages like English, French, Arabic, and Chinese, there has been less research on regional and minor languages, despite their importance from geographical and historical perspectives. This research focuses on detecting and recognizing Pashto handwritten characters and ligatures, which is essential for preserving this regional cursive language in Pakistan and its status as the national language of Afghanistan.

View Article and Find Full Text PDF

Recognition of Pashto Handwritten Characters Based on Deep Learning.

Sensors (Basel)

October 2020

Department of Robot System Engineering, Tongmyong University, Busan 48520, Korea.

Handwritten character recognition is increasingly important in a variety of automation fields, for example, authentication of bank signatures, identification of ZIP codes on letter addresses, and forensic evidence. Despite improved object recognition technologies, Pashto's hand-written character recognition (PHCR) remains largely unsolved due to the presence of many enigmatic hand-written characters, enormously cursive Pashto characters, and lack of research attention. We propose a convolutional neural network (CNN) model for recognition of Pashto hand-written characters for the first time in an unrestricted environment.

View Article and Find Full Text PDF

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!