Using parallel pivot vs. clustering-based techniques for web engines

Web Engines are a useful tool for searching information in the Web. But a great part of this information is non-textual and for that case a metric space is used. A metric space is a set where a notion of distance (called a metric) between elements of the set is defined. In this paper we present a...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Gil Costa, Graciela Verónica, Printista, Alicia Marcela
Formato: Objeto de conferencia
Lenguaje:Inglés
Publicado: 2007
Materias:
Acceso en línea:http://sedici.unlp.edu.ar/handle/10915/22974
Aporte de:
Descripción
Sumario:Web Engines are a useful tool for searching information in the Web. But a great part of this information is non-textual and for that case a metric space is used. A metric space is a set where a notion of distance (called a metric) between elements of the set is defined. In this paper we present an efficient parallelization of a pivot-based method devised for this purpose which is called the Sparse Spatial Selection (SSS) strategy and we compare it with a clustering-based method, a parallel implementation of the Spatial Approximation Tree (SAT). We show that SAT compares favourably against the pivot data structures SSS. The experimental results were obtained on a highperformance cluster and using several metric spaces, that shows load balance parallel strategies for the SAT. The implementations are built upon the BSP parallel computing model, which shows efficient performance for this application domain and allows a precise evaluation of algorithms.