Evaluation of LSA performance in Spanish using multiple corpus of text
Latent Semantic Analysis is a natural language processing tool that allows estimating the semantic distance between terms. The success of LSA depends mainly on the choice of training corpus, which has been studied principally in English. This study focuses on studying LSA with regional Spanish corpus an...
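Main authors: | Carrillo, Facundo; Cecchi, Guillermo; Sigman, Mariano; Fernández Slezak, Diego |
---|---|
Format: | Conference object |
Language: | English |
Published: | 2013 |
Subjects: | Computer Science; Latent Semantic Analysis; regional Spanish corpus; Natural Language Processing |
Online access: | http://sedici.unlp.edu.ar/handle/10915/76358 http://42jaiio.sadio.org.ar/proceedings/simposios/Trabajos/ASAI/18.pdf |
Contributed by: | |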
id | I19-R120-10915-76358 |
---|---|
record_format | dspace |
institution | Universidad Nacional de La Plata |
institution_str | I-19 |
repository_str | R-120 |
collection | SEDICI (UNLP) |
language | English |
topic | Computer Science; Latent Semantic Analysis; regional Spanish corpus; Natural Language Processing |
description | Latent Semantic Analysis (LSA) is a natural language processing tool that allows estimating the semantic distance between terms. The success of LSA depends mainly on the choice of training corpus, which has been studied principally in English. This study focuses on LSA trained on regional Spanish corpora and evaluates its performance by identifying synonyms. We found that performance was slightly better than chance, consistent with previous results. The standard LSA method cannot dynamically increase the training corpus. By using classifiers, we combined multiple LSA models and showed that the use of automatic classifiers increases performance. |
format | Conference object |
author | Carrillo, Facundo; Cecchi, Guillermo; Sigman, Mariano; Fernández Slezak, Diego |
author_sort | Carrillo, Facundo |
title | Evaluation of LSA performance in Spanish using multiple corpus of text |
title_sort | evaluation of lsa performance in spanish using multiple corpus of text |
publishDate | 2013 |
url | http://sedici.unlp.edu.ar/handle/10915/76358 http://42jaiio.sadio.org.ar/proceedings/simposios/Trabajos/ASAI/18.pdf |
bdutipo_str | Repositories |
_version_ | 1764820488117288961 |