Manuscript document digitalization and recognition: a first approach

Handwritten manuscript recognition belongs to a set of initiatives aimed at preserving the cultural heritage gathered in libraries and archives, which contain a great wealth of documents and even handwritten cards that accompany incunabula. This work is the starting...

Bibliographic Details
Main authors: De Giusti, Marisa Raquel; Vila, María Marta; Villarreal, Gonzalo Luján
Format: Article
Language: English
Published: 2005
Subjects:
Online access: http://sedici.unlp.edu.ar/handle/10915/5521
http://journal.info.unlp.edu.ar/wp-content/uploads/JCST-Oct05-7.pdf
Contributed by:
id I19-R120-10915-5521
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Image processing software
digitization
heritage conservation
description Handwritten manuscript recognition belongs to a set of initiatives aimed at preserving the cultural heritage gathered in libraries and archives, which contain a great wealth of documents and even handwritten cards that accompany incunabula. This work is the starting point of a research and development project oriented to the digitization and recognition of manuscript materials. The paper discusses different algorithms used in the first stage, dedicated to "image noise-cleaning", which improves the image before the character recognition process begins. For handwritten-text recognition and image digitization to be efficient, they must be preceded by a preprocessing stage applied to the image, which includes thresholding, noise cleaning, thinning, baseline alignment and image segmentation, among others. Each of these steps reduces the harmful variability encountered when recognizing manuscripts (noise, random gray levels, slanted characters, uneven ink levels in different zones), thereby increasing the probability of obtaining a suitable text recognition. In this paper, two image thinning methods are considered and implemented. Finally, an evaluation is carried out, yielding conclusions about efficiency, speed and requirements, as well as ideas for future implementations.
format Article
author De Giusti, Marisa Raquel
Vila, María Marta
Villarreal, Gonzalo Luján
title Manuscript document digitalization and recognition: a first approach
publishDate 2005
url http://sedici.unlp.edu.ar/handle/10915/5521
http://journal.info.unlp.edu.ar/wp-content/uploads/JCST-Oct05-7.pdf
bdutipo_str Repositories