Elaboration d’un OCR basé sur les modèles de Markov cachés : application au texte Arabe imprimé non voyellé

Benkessirat, Walid; Khenniche, Oussama

dc.contributor.author	Benkessirat, Walid
dc.contributor.author	Khenniche, Oussama
dc.date.accessioned	2020-02-02T08:02:33Z
dc.date.available	2020-02-02T08:02:33Z
dc.date.issued	2019
dc.identifier.uri	http://di.univ-blida.dz:8080/jspui/handle/123456789/5102
dc.description	ill., Bibliogr.	fr_FR
dc.description.abstract	The research community has put considerable effort into developing optical character recognition (OCR) systems. The aim of this project is to implement an Arabic Optical Character Recognition (AOCR) system. Segmentation and classification are the core operations of OCR systems in general. The cursive nature of Arabic script distorts the final recognition results: unsegmented or over-segmented characters lead to poor results. This is why segmentation and classification in AOCR systems remain a serious research problem. Segmenting an Arabic text involves three levels: segmentation into lines, into pseudo-words, and into characters. In our project, we chose contour techniques, vertical projection, and template matching for these three levels, respectively. Classification, in turn, involves two levels: character classification and word classification. For both, we chose classification models based on Hidden Markov Models (HMM). We also discussed some problems specific to the Arabic language and studied the other modules of an OCR system, namely data acquisition, preprocessing, and feature extraction. The implementation results are promising. Keywords: recognition, segmentation, classification, character, Arabic ...	fr_FR
dc.language.iso	fr	fr_FR
dc.publisher	Université Blida 1	fr_FR
dc.subject	reconnaissance	fr_FR
dc.subject	segmentation	fr_FR
dc.subject	classification	fr_FR
dc.subject	caractère	fr_FR
dc.subject	Arabe	fr_FR
dc.subject	recognition	fr_FR
dc.subject	segmentation	fr_FR
dc.subject	classification	fr_FR
dc.subject	character	fr_FR
dc.subject	Arabic	fr_FR
dc.title	Elaboration d’un OCR basé sur les modèles de Markov cachés : application au texte Arabe imprimé non voyellé	fr_FR
dc.type	Thesis	fr_FR