Efficient analytical queries on semantic web data cubes

"The amount of multidimensional data published on the semantic web (SW) is constantly increasing, due to initiatives such as Open Data and Open Government Data, among other ones. Models, languages, and tools, that allow obtaining valuable information efficiently, are thus required. Multidimension...


Bibliographic Details
Main authors: Etcheverry, Lorena; Vaisman, Alejandro Ariel
Format: Journal article (accepted version)
Language: English
Published: 2019
Subjects:
RDF
Online access: http://ri.itba.edu.ar/handle/123456789/1743
id I32-R138-123456789-1743
record_format dspace
spelling I32-R138-123456789-1743 2022-12-07T13:06:44Z Efficient analytical queries on semantic web data cubes Etcheverry, Lorena Vaisman, Alejandro Ariel OLAP RDF WEB SEMANTICA ALMACENES DE DATOS LENGUAJES DE CONSULTA 2019-09-06T13:20:52Z 2019-09-06T13:20:52Z 2017-12 Artículos de Publicaciones Periódicas info:eu-repo/semantics/acceptedVersion 1861-2032 http://ri.itba.edu.ar/handle/123456789/1743 en info:eu-repo/semantics/altIdentifier/doi/10.1007/s13740-017-0082-y info:eu-repo/grantAgreement/ANPCyT/PICT/2014-0787/AR. Ciudad Autónoma de Buenos Aires application/pdf
institution Instituto Tecnológico de Buenos Aires (ITBA)
institution_str I-32
repository_str R-138
collection Repositorio Institucional Instituto Tecnológico de Buenos Aires (ITBA)
language Inglés
topic OLAP
RDF
WEB SEMANTICA
ALMACENES DE DATOS
LENGUAJES DE CONSULTA
description "The amount of multidimensional data published on the semantic web (SW) is constantly increasing, due to initiatives such as Open Data and Open Government Data, among other ones. Models, languages, and tools, that allow obtaining valuable information efficiently, are thus required. Multidimensional data are typically represented as data cubes, and exploited using Online Analytical Processing (OLAP) techniques. The RDF Data Cube Vocabulary, also denoted QB, is the current W3C standard to represent statistical data on the SW. Given that QB does not include key features needed for OLAP analysis, in previous work we have proposed an extension, denoted QB4OLAP, to overcome this problem without the need of modifying already published data. Once data cubes are appropriately represented on the SW, we need mechanisms to analyze them. However, in the current state-of-the-art, writing efficient analytical queries over SW data cubes demands a deep knowledge of standards like RDF and SPARQL. These skills are unlikely to be found in typical analytical users. Further, OLAP languages like MDX are far from being easily understood by the final user. The lack of friendly tools to exploit multidimensional data on the SW is a barrier that needs to be broken to promote the publication of such data. This is the problem we address in this paper. Our approach is based on allowing analytical users to write queries using what they know best: OLAP operations over data cubes, without dealing with SW technicalities. For this, we devised CQL (standing for Cube Query Language), a simple, high-level query language that operates over data cubes. Taking advantage of structural metadata provided by QB4OLAP, we translate CQL queries into SPARQL ones. Then, we propose query improvement strategies to produce efficient SPARQL queries, adapting general-purpose SPARQL query optimization techniques. We evaluate our implementation using the Star-Schema benchmark, showing that our proposal outperforms others. The QB4OLAP toolkit, a web application that allows exploring and querying (using CQL) SW data cubes, completes our contributions."
format Artículos de Publicaciones Periódicas
acceptedVersion
author Etcheverry, Lorena
Vaisman, Alejandro Ariel
title Efficient analytical queries on semantic web data cubes
publishDate 2019
url http://ri.itba.edu.ar/handle/123456789/1743