Evaluation of LSA performance in Spanish using multiple corpus of text

Latent Semantic Analysis is a natural language processing tool that allows estimating the semantic distance between terms. The success of LSA depends mainly on the choice of training corpus, which has been studied principally in English. This study focuses on studying LSA with regional Spanish corpus an...

Bibliographic Details
Main authors: Carrillo, Facundo; Cecchi, Guillermo; Sigman, Mariano; Fernández Slezak, Diego
Format: Conference object
Language: English
Published: 2013
Subjects:
Online access: http://sedici.unlp.edu.ar/handle/10915/76358
http://42jaiio.sadio.org.ar/proceedings/simposios/Trabajos/ASAI/18.pdf
Contributed by:
id I19-R120-10915-76358
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Latent Semantic Analysis
regional Spanish corpus
Natural Language Processing
spellingShingle Computer Science
Latent Semantic Analysis
regional Spanish corpus
Natural Language Processing
Carrillo, Facundo
Cecchi, Guillermo
Sigman, Mariano
Fernández Slezak, Diego
Evaluation of LSA performance in Spanish using multiple corpus of text
topic_facet Computer Science
Latent Semantic Analysis
regional Spanish corpus
Natural Language Processing
description Latent Semantic Analysis is a natural language processing tool that allows estimating the semantic distance between terms. The success of LSA depends mainly on the choice of training corpus, which has been studied principally in English. This study focuses on studying LSA with regional Spanish corpora and evaluates its performance by identifying synonyms. We found that performance was slightly better than chance, in agreement with previous results. The standard LSA method cannot dynamically enlarge the training corpus. By using classifiers, we combined multiple LSA models and showed that the use of automatic classifiers increases performance.
format Conference object
Conference object
author Carrillo, Facundo
Cecchi, Guillermo
Sigman, Mariano
Fernández Slezak, Diego
author_facet Carrillo, Facundo
Cecchi, Guillermo
Sigman, Mariano
Fernández Slezak, Diego
author_sort Carrillo, Facundo
title Evaluation of LSA performance in Spanish using multiple corpus of text
title_short Evaluation of LSA performance in Spanish using multiple corpus of text
title_full Evaluation of LSA performance in Spanish using multiple corpus of text
title_fullStr Evaluation of LSA performance in Spanish using multiple corpus of text
title_full_unstemmed Evaluation of LSA performance in Spanish using multiple corpus of text
title_sort evaluation of lsa performance in spanish using multiple corpus of text
publishDate 2013
url http://sedici.unlp.edu.ar/handle/10915/76358
http://42jaiio.sadio.org.ar/proceedings/simposios/Trabajos/ASAI/18.pdf
work_keys_str_mv AT carrillofacundo evaluationoflsaperformanceinspanishusingmultiplecorpusoftext
AT cecchiguillermo evaluationoflsaperformanceinspanishusingmultiplecorpusoftext
AT sigmanmariano evaluationoflsaperformanceinspanishusingmultiplecorpusoftext
AT fernandezslezakdiego evaluationoflsaperformanceinspanishusingmultiplecorpusoftext
bdutipo_str Repositories
_version_ 1764820488117288961