IEEE Trans Syst Man Cybern B Cybern
April 2011
In this paper, we propose a new linguistic-based approach, called the affixal approach, for Arabic word and text image recognition. Most existing work in the field integrates knowledge of the Arabic language into the recognition process in one of two ways: either after recognition, using a language dictionary (a dictionary of words) to validate the word hypotheses suggested by the OCR, or during the recognition process (recognition directed by a lexicon), using a statistical model of the language (hidden Markov model or N-gram). The proposed approach uses the linguistic concepts of the vocabulary to direct and simplify the recognition process.