Image and video text recognition using convolutional neural networks

Zohra Saidane

Thesis, Year: 2008

Reconnaissance de texte dans les images et les vidéos en utilisant les réseaux de neurones à convolutions

Laboratoire Traitement et Communication de l'Information

Abstract

Thanks to increasingly powerful storage media, multimedia resources have become essential in fields such as information and broadcasting (news agencies, INA), culture (museums), transport (monitoring), environment (satellite images), and medical imaging (hospital medical records). The challenge is therefore to find relevant information quickly, and multimedia research is increasingly focused on indexing and retrieval techniques. Text embedded in images and videos can be a relevant key for this task. Recognizing such text poses many challenges: poor resolution, characters of different sizes, compression artifacts, anti-aliasing effects, and very complex and variable backgrounds. Text recognition involves four steps: (1) detecting the presence of text, (2) localizing the text, (3) extracting and enhancing the text area, and (4) recognizing the content of the text. This work focuses on the last step and assumes that the text box has been correctly detected, localized, and extracted. The recognition module can itself be divided into several sub-modules, such as a binarization module, a text segmentation module, and a character recognition module. We focus on a particular machine learning model, the convolutional neural network (CNN). CNNs are neural networks whose topology is inspired by the mammalian visual cortex. They were initially used to recognize handwritten digits and have since been applied successfully to many pattern recognition problems. In this thesis we propose a new method for binarizing text images, a new method for segmenting text images, a study of a convolutional neural network for character recognition in images, a discussion of the relevance of the binarization step in machine-learning-based text recognition in images, and a new method for text recognition in images based on graph theory.
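To make the division of the recognition module into sub-modules concrete, here is a minimal sketch in Python. It is not the thesis's method: the thesis proposes CNN-based binarization and segmentation, whereas this sketch swaps in two classical baselines (a global threshold and a vertical projection profile) purely for illustration. All function names, the `threshold` default, and the `classify` callback are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # grayscale rows, pixel values in 0..255


def binarize(image: Image, threshold: int = 128) -> Image:
    """Global threshold (assumed baseline): 1 = dark text pixel, 0 = background."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def segment_columns(binary: Image) -> List[Tuple[int, int]]:
    """Find character spans from the vertical projection profile.

    Returns half-open (start, end) column ranges that contain text pixels.
    """
    if not binary:
        return []
    width = len(binary[0])
    # Number of text pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # a run of text columns begins
        elif count == 0 and start is not None:
            spans.append((start, x))  # the run ended at an empty column
            start = None
    if start is not None:
        spans.append((start, width))  # run reaches the right border
    return spans


def recognize_line(
    image: Image,
    classify: Callable[[Image], str],
    threshold: int = 128,
) -> str:
    """Chain the sub-modules: binarization, segmentation, character recognition.

    `classify` is a placeholder for a character recognizer (for example a CNN).
    """
    binary = binarize(image, threshold)
    glyphs = [[row[a:b] for row in binary] for a, b in segment_columns(binary)]
    return "".join(classify(glyph) for glyph in glyphs)
```

A projection profile only separates characters that are divided by an empty column, and a global threshold assumes dark text on a light, fairly uniform background. Both assumptions break down under the conditions the abstract lists (complex backgrounds, low resolution), which is the kind of limitation learned approaches aim to address.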

Keywords

Text Recognition, Neural Networks

Main file

phd_saidane_final.pdf (5.25 MB)

https://pastel.hal.science/pastel-00004685

Submitted on: Monday, June 22, 2009, 08:00:00

Last modified on: Monday, October 9, 2023, 12:49:40

Long-term archiving on: Sunday, November 27, 2016, 00:37:08

Dates and versions

pastel-00004685 , version 1 (22-06-2009)

Identifiers

HAL Id : pastel-00004685 , version 1

Cite

Zohra Saidane. Image and video text recognition using convolutional neural networks. Télécom ParisTech, 2008. English. ⟨NNT : ⟩. ⟨pastel-00004685⟩

Collections

INSTITUT-TELECOM PASTEL CNRS EURECOM PARISTECH LTCI

562 views

3146 downloads
