Handwritten Arabic character recognition has received increasing research interest in recent years. To date, however, most existing handwriting recognition systems have focused on adult handwriting; child handwriting has received comparatively little study and has not yet been treated as a major research problem. Children's handwriting is more challenging to recognize than adults' because it typically exhibits lower quality, higher variation, and larger distortions. Moreover, most systems designed for and deployed on adult data have not been trained or tested for recognizing children's handwriting. This paper presents a new convolutional neural network (CNN) model for recognizing children's handwritten isolated Arabic letters. Several experiments investigate how training the model on different datasets (children's, adults', and combined) affects performance in recognizing children's handwritten characters and in discriminating children's handwriting from adults'. In addition, a number of supplementary features, proposed on the basis of empirical study and observation, are combined with the CNN-extracted features to improve child/adult writer-group classification. Finally, the extracted deep and supplementary features are evaluated and compared using different classifiers (Softmax, support vector machine (SVM), k-nearest neighbor (KNN), and random forest (RF)) and different dataset combinations drawn from Hijja (child data) and AHCD (adult data). Our findings show that the training strategy is crucial and that including adult data is influential, yielding an accuracy of up to around 93% in child handwritten character recognition. Moreover, fusing the proposed supplementary features with the deep features improves child handwriting discrimination to around 94%.
Source:
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422599
DOI: http://dx.doi.org/10.3390/s23156774
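As a rough illustration of the pipeline the abstract describes (deep CNN features fused with handcrafted supplementary features, then compared across SVM, KNN, and RF classifiers), the sketch below shows the general idea in Python. It is not the authors' published architecture: the layer sizes, training schedule, 28-class assumption, and the caller-supplied image and supplementary-feature arrays (Hijja/AHCD loading is omitted) are all assumptions.

```python
# Minimal sketch: train a small CNN on grayscale character images, reuse its
# penultimate layer as a deep-feature extractor, concatenate handcrafted
# "supplementary" features, and compare SVM / KNN / RF on the fused vectors.
import numpy as np
from tensorflow.keras import layers, models
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

NUM_CLASSES = 28  # assumed number of isolated Arabic letter classes

def build_cnn(input_shape=(32, 32, 1)):
    """Small CNN; the 128-unit dense layer doubles as the deep-feature layer."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu", name="deep_features"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

def compare_classifiers(x_img_tr, x_sup_tr, y_tr, x_img_te, x_sup_te, y_te):
    """Train the CNN, fuse deep + supplementary features, compare classifiers."""
    cnn = build_cnn()
    cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])
    cnn.fit(x_img_tr, y_tr, epochs=10, batch_size=64, verbose=0)

    # Reuse the trained network up to the penultimate layer as a feature extractor.
    extractor = models.Model(cnn.inputs, cnn.get_layer("deep_features").output)
    fused_tr = np.hstack([extractor.predict(x_img_tr, verbose=0), x_sup_tr])
    fused_te = np.hstack([extractor.predict(x_img_te, verbose=0), x_sup_te])

    for name, clf in [("SVM", SVC()), ("KNN", KNeighborsClassifier()),
                      ("RF", RandomForestClassifier())]:
        clf.fit(fused_tr, y_tr)
        print(f"{name}: {accuracy_score(y_te, clf.predict(fused_te)):.3f}")
```

A Softmax baseline, as mentioned in the abstract, corresponds to simply evaluating the trained CNN itself on the test images rather than fitting a separate classifier on the fused features.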
Sensors (Basel)
January 2025
Department of Architectural Engineering, Dankook University, 152 Jukjeon-ro, Yongin-si 16890, Republic of Korea.
In the construction industry, ensuring the proper installation, retention, and dismantling of temporary structures, such as jack supports, is critical to maintaining safety and project timelines. However, inconsistencies between on-site data and construction documentation remain a significant challenge. To address this, the study proposes an integrated monitoring framework that combines computer vision-based object detection with document recognition techniques.
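A minimal sketch of the on-site half of such a framework, under my own assumptions rather than the study's implementation: a custom-trained detector counts jack supports in a site photograph, and the result is reconciled with the count recorded in the documentation. The weights file `jack_support_yolo.pt` and the assumption that the documented count has already been extracted (e.g., by document recognition) are hypothetical.

```python
from ultralytics import YOLO

def count_jack_supports(image_path: str, weights: str = "jack_support_yolo.pt") -> int:
    """Count detected jack supports in a single site photograph."""
    model = YOLO(weights)                  # hypothetical custom-trained detector
    result = model(image_path)[0]          # one image -> one Results object
    return len(result.boxes)               # every detected box is assumed to be a jack support

def check_consistency(image_path: str, documented_count: int) -> dict:
    """Flag mismatches between detected and documented jack-support counts."""
    detected = count_jack_supports(image_path)
    return {
        "detected": detected,
        "documented": documented_count,
        "consistent": detected == documented_count,
    }

print(check_consistency("zone_b_slab.jpg", documented_count=42))
```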
Sci Rep
January 2025
Nanfang College Guangzhou, Guangzhou, 510970, China.
Named Entity Recognition (NER) is an essential component of numerous Natural Language Processing (NLP) systems; it aims to identify and classify entities with specific meanings in raw text, such as persons (PER), locations (LOC), and organizations (ORG). Recently, Deep Neural Networks (DNNs) have been applied extensively to NER tasks owing to the rapid development of deep learning. However, despite these advances, such models fail to take full advantage of multi-level features (e.g., …).
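For readers unfamiliar with the task, a minimal, generic NER example using a pretrained transformer through the Hugging Face `transformers` pipeline is shown below; it is unrelated to the paper's own model, and the checkpoint name is simply an assumed public NER model.

```python
from transformers import pipeline

# Token-classification pipeline that merges word pieces into entity spans
# (PER / ORG / LOC / MISC for this assumed checkpoint).
ner = pipeline("token-classification",
               model="dslim/bert-base-NER",
               aggregation_strategy="simple")

text = "Tim Cook announced that Apple will open a new campus in Austin."
for entity in ner(text):
    print(f'{entity["word"]:<12} {entity["entity_group"]:<5} {entity["score"]:.2f}')
```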
J Imaging
January 2025
School of Information Technology, Sripatum University, Bangkok 10900, Thailand.
This study introduces a novel AI-driven approach to support elderly patients in Thailand with medication management, focusing on accurate drug-label interpretation. Two model architectures were explored: a Two-Stage Optical Character Recognition (OCR) and Large Language Model (LLM) pipeline combining EasyOCR with Qwen2-72b-instruct, and a Uni-Stage Visual Question Answering (VQA) model using Qwen2-72b-VL. Both models operated in a zero-shot capacity, using Retrieval-Augmented Generation (RAG) with DrugBank references to ensure contextual relevance and accuracy.
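A minimal sketch of the two-stage OCR-then-LLM idea, with heavy simplifications: EasyOCR extracts the label text, and the interpretation request is sent to an instruction-tuned model through an OpenAI-compatible endpoint. The endpoint URL, served model name, and the omission of the RAG/DrugBank step are assumptions, not the study's setup.

```python
import easyocr
from openai import OpenAI

def read_drug_label(image_path: str) -> str:
    """Stage 1: extract Thai/English text lines from a drug-label photo."""
    reader = easyocr.Reader(["th", "en"])
    lines = reader.readtext(image_path, detail=0)   # detail=0 -> plain strings
    return "\n".join(lines)

def interpret_label(label_text: str) -> str:
    """Stage 2: ask the LLM to explain dosage and usage in plain language."""
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed endpoint
    response = client.chat.completions.create(
        model="Qwen2-72B-Instruct",  # assumed served model name
        messages=[
            {"role": "system", "content": "You explain medication labels simply and safely."},
            {"role": "user", "content": f"Explain this drug label:\n{label_text}"},
        ],
    )
    return response.choices[0].message.content

print(interpret_label(read_drug_label("label_photo.jpg")))
```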
Front Artif Intell
January 2025
School of Medicine, Stanford University, Palo Alto, CA, United States.
Given close relationships between ocular structure and ophthalmic disease, ocular biometry measurements (including axial length, lens thickness, anterior chamber depth, and keratometry values) may be leveraged as features in the prediction of eye diseases. However, ocular biometry measurements are often stored as PDFs rather than as structured data in electronic health records. Thus, time-consuming and laborious manual data entry is required for using biometry data as a disease predictor.
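A minimal sketch of automating that manual step: extract the report text with pdfplumber and pull labeled measurements out with regular expressions. The field labels and layout assumed here (`AL:`, `ACD:`, `LT:`, `K1:`, `K2:`) are hypothetical; real biometry reports, and the study's own pipeline, will differ.

```python
import re
import pdfplumber

PATTERNS = {
    "axial_length_mm":           r"AL:\s*([\d.]+)",
    "anterior_chamber_depth_mm": r"ACD:\s*([\d.]+)",
    "lens_thickness_mm":         r"LT:\s*([\d.]+)",
    "keratometry_k1_d":          r"K1:\s*([\d.]+)",
    "keratometry_k2_d":          r"K2:\s*([\d.]+)",
}

def extract_biometry(pdf_path: str) -> dict:
    """Return whichever labeled measurements can be found in the report text."""
    with pdfplumber.open(pdf_path) as pdf:
        text = "\n".join(page.extract_text() or "" for page in pdf.pages)
    values = {}
    for field, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            values[field] = float(match.group(1))
    return values

print(extract_biometry("biometry_report.pdf"))
```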
MethodsX
June 2025
Computer Science Department, Information Technology University of Punjab, Lahore, Pakistan.
Optical character recognition (OCR) is vital for converting printed material into a digital format that can be conveniently used for various purposes. A significant amount of work has been done on OCR for well-resourced languages such as English. However, languages such as Urdu, spoken by a large community, face limitations in OCR owing to a lack of resources and the complexity and diversity of handwritten scripts.