TreeSpark: A Distributed Tool for Progeny Analysis based on Spark

Bibliographic Details
Main authors: López, Paula; Hasperué, Waldo; Quiroga, Facundo Manuel; Ronchetti, Franco
Format: Conference object
Language: English
Published: 2021
Subjects: Computer Science; Spark; Big data; Progeny analysis; Genealogy; Analytics
Online access: http://sedici.unlp.edu.ar/handle/10915/130340
id I19-R120-10915-130340
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Spark
Big data
Progeny analysis
Genealogy
Analytics
description Progeny analyses are useful in the biological sciences for various purposes, such as improving individuals in new generations or carrying out molecular analysis of the transmission of genetic characteristics. Analyzing these data by comparing the individuals of a generation with their offspring is not a trivial task, and it becomes more complex as more generations are incorporated. In this article, we present TreeSpark, an open-source tool that carries out progeny analysis and provides simple access to information about individuals and their relations, both as progenitors and as descendants. The tool is developed as a Python module that inherits the distributed processing features of Spark, which allows it to process large volumes of progeny information. In a comparison with similar tools, TreeSpark was found to be much simpler to use.
format Conference object
author López, Paula
Hasperué, Waldo
Quiroga, Facundo Manuel
Ronchetti, Franco
title TreeSpark: A Distributed Tool for Progeny Analysis based on Spark
publishDate 2021
url http://sedici.unlp.edu.ar/handle/10915/130340
bdutipo_str Repositories
_version_ 1764820453298274306