Scraping Archaeology: A Methodological Approach from the Web Scraping and Text Mining

As the amount of information available on the web increases, so does the task of locating and analysing it, and performing this task manually can be costly in terms of time and effort. Although search engines and database engines can help to find the required information, in large digital infrastruc...

Bibliographic Details
Main author: Aguilar, Humberto
Format: Journal article
Language: Spanish
Published: Facultad de Filosofía y Humanidades. Museo de Antropología 2023
Subjects:
R
Online access: https://revistas.unc.edu.ar/index.php/antropologia/article/view/41094
id I10-R372-article-41094
record_format ojs
institution Universidad Nacional de Córdoba
institution_str I-10
repository_str R-372
container_title_str Revista del Museo de Antropología
language Spanish
format Journal article
topic R
Web scraping
Text mining
Data analytics
Digital Archaeology
author Aguilar, Humberto
title Scraping Archaeology: A Methodological Approach from the Web Scraping and Text Mining
description As the amount of information available on the web increases, so does the task of locating and analysing it, and performing this task manually can be costly in terms of time and effort. Although search engines and database engines can help to find the required information, in large digital infrastructures where search results are in the thousands - or more - new tools are needed to effectively retrieve the searched content. This paper proposes the application of Web Scraping and Text Mining as methodological inputs to be able to compile and process large volumes of data in digital infrastructures in a more automated way. The automation of both processes provides a great advantage in analysing textual corpora of thousands of records, which significantly simplifies the collection of different types of data, facilitating the work considerably. It is hoped that this contribution will expand the possibilities of the archaeological community in terms of a novel methodology for the collection and handling of structured and unstructured data that can be integrated into the research of the wider archaeological community.
publisher Facultad de Filosofía y Humanidades. Museo de Antropología
publishDate 2023
url https://revistas.unc.edu.ar/index.php/antropologia/article/view/41094
first_indexed 2024-09-03T20:03:39Z
last_indexed 2025-03-27T05:35:17Z
_version_ 1827724294801588224
spelling I10-R372-article-41094 2024-11-07T19:44:23Z
Scraping Archaeology: A Methodological Approach from the Web Scraping and Text Mining
Raspando la Arqueología: Una Aproximación Metodológica desde el Web Scraping y Text Mining
Arqueologia de raspagem: uma abordagem metodológica para raspagem da Web e mineração de texto
Aguilar, Humberto
R
Web scraping
Text mining
Data analytics
Digital Archaeology
Facultad de Filosofía y Humanidades. Museo de Antropología
2023-12-28
info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
application/pdf
https://revistas.unc.edu.ar/index.php/antropologia/article/view/41094
10.31048/1852.4826.v16.n2.41094
Revista del Museo de Antropología; Vol. 16 No. 3 (2023); 439-450
1852-4826
1852-060X
10.31048/1852.4826.v16.n2
spa
https://revistas.unc.edu.ar/index.php/antropologia/article/view/41094/44471
Copyright 2023 Humberto Aguilar
https://creativecommons.org/licenses/by-nc-sa/4.0