Leveraging Large Language Models for Ontology-Based Data Access: A Preliminary Analysis

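In Ontology-Based Data Access (OBDA), we study how to represent legacy data sources using ontologies. This enables a modern, distributed, uniform data representation format with the ability to perform intelligent querying and processing. This task requires the development of software to interpret the data and express it as ontologies, which takes considerable time. Large language models (LLMs), on the other hand, have lately proven to be effective solution providers thanks to their ability to generate solutions from input that an end user specifies in natural language. In this paper, we explore the potential of LLMs to perform OBDA automatically. Our research hypothesis is that it is possible to use an LLM tool like ChatGPT to perform OBDA. For this purpose, we studied ChatGPT's responses to different problems associated with OBDA. We found that ChatGPT is able to generate ontologies from free text as well as from tables expressed as text or in CSV format. ChatGPT is also able to generate SPARQL queries, and it successfully expresses relational tables as ontologies, correcting violations of integrity constraints when appropriately directed.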

Bibliographic Details
Main authors: Gómez, Sergio Alejandro, Fillottrani, Pablo Rubén
Format: Conference object
Language: English
Published: 2024
Subjects:
CSV
Online access: http://sedici.unlp.edu.ar/handle/10915/176820
id I19-R120-10915-176820
record_format dspace
spelling I19-R120-10915-176820; 2025-02-21T20:03:44Z; http://sedici.unlp.edu.ar/handle/10915/176820; Leveraging Large Language Models for Ontology-Based Data Access: A Preliminary Analysis; Gómez, Sergio Alejandro; Fillottrani, Pablo Rubén; 2024-10; 2024; 2025-02-21T17:58:03Z; en; Ciencias Informáticas; Ontology-Based Data Access; Large Language Models; Ontologies; CSV; Red de Universidades con Carreras en Informática; Conference object; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); application/pdf; 996-1005
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Ciencias Informáticas
Ontology-Based Data Access
Large Language Models
Ontologies
CSV
description In Ontology-Based Data Access (OBDA), we study how to represent legacy data sources using ontologies. This enables a modern, distributed, uniform data representation format with the ability to perform intelligent querying and processing. This task requires the development of software to interpret the data and express it as ontologies, which takes considerable time. Large language models (LLMs), on the other hand, have lately proven to be effective solution providers thanks to their ability to generate solutions from input that an end user specifies in natural language. In this paper, we explore the potential of LLMs to perform OBDA automatically. Our research hypothesis is that it is possible to use an LLM tool like ChatGPT to perform OBDA. For this purpose, we studied ChatGPT's responses to different problems associated with OBDA. We found that ChatGPT is able to generate ontologies from free text as well as from tables expressed as text or in CSV format. ChatGPT is also able to generate SPARQL queries, and it successfully expresses relational tables as ontologies, correcting violations of integrity constraints when appropriately directed.
format Conference object
author Gómez, Sergio Alejandro
Fillottrani, Pablo Rubén
title Leveraging Large Language Models for Ontology-Based Data Access: A Preliminary Analysis
publishDate 2024
url http://sedici.unlp.edu.ar/handle/10915/176820
_version_ 1845116795673903104