HoSeIn: A Workflow for Integrating Various Homology Search Results from Metagenomic and Metatranscriptomic Sequence Datasets

Data generated by metagenomic and metatranscriptomic experiments is both enormous and inherently noisy. When using taxonomy-dependent alignment-based methods to classify and label reads, the first step consists of performing homology searches against sequence databases. To obtain the most informatio...

Bibliographic Details
Main authors: Rozadilla, Gastón; Moreiras Clemente, Jorgelina; McCarthy, Christina Beryl
Format: Article
Language: English
Published: 2020
Online access: http://sedici.unlp.edu.ar/handle/10915/132750
id I19-R120-10915-132750
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Exact Sciences
Metagenomics
Metatranscriptomics
Next Generation Sequencing
Homology Search
Taxonomic Profile
Functional Profile
spellingShingle Exact Sciences
Metagenomics
Metatranscriptomics
Next Generation Sequencing
Homology Search
Taxonomic Profile
Functional Profile
Rozadilla, Gastón
Moreiras Clemente, Jorgelina
McCarthy, Christina Beryl
HoSeIn: A Workflow for Integrating Various Homology Search Results from Metagenomic and Metatranscriptomic Sequence Datasets
topic_facet Exact Sciences
Metagenomics
Metatranscriptomics
Next Generation Sequencing
Homology Search
Taxonomic Profile
Functional Profile
description Data generated by metagenomic and metatranscriptomic experiments is both enormous and inherently noisy. When using taxonomy-dependent alignment-based methods to classify and label reads, the first step consists of performing homology searches against sequence databases. To obtain the most information from the samples, nucleotide sequences are usually compared to various databases (nucleotide and protein) using local sequence aligners such as BLASTN and BLASTX. Nevertheless, analyzing and integrating these results can be problematic because the outputs of these searches usually show inconsistencies, which can be especially pronounced when working with RNA-seq. Moreover, to the best of our knowledge, existing tools do not cross-reference and integrate information from the different homology searches, but instead provide the results of each analysis separately. We developed the HoSeIn workflow to intersect the information from these homology searches and then determine the taxonomic and functional profile of the sample using this integrated information. The workflow is based on the assumption that the sequences that correspond to a certain taxon are composed of (i) sequences that were assigned to the same taxon by both homology searches and (ii) sequences that were assigned to that taxon by one of the homology searches but returned no hits in the other one.
format Article
Article
author Rozadilla, Gastón
Moreiras Clemente, Jorgelina
McCarthy, Christina Beryl
author_facet Rozadilla, Gastón
Moreiras Clemente, Jorgelina
McCarthy, Christina Beryl
author_sort Rozadilla, Gastón
title HoSeIn: A Workflow for Integrating Various Homology Search Results from Metagenomic and Metatranscriptomic Sequence Datasets
title_short HoSeIn: A Workflow for Integrating Various Homology Search Results from Metagenomic and Metatranscriptomic Sequence Datasets
title_full HoSeIn: A Workflow for Integrating Various Homology Search Results from Metagenomic and Metatranscriptomic Sequence Datasets
title_fullStr HoSeIn: A Workflow for Integrating Various Homology Search Results from Metagenomic and Metatranscriptomic Sequence Datasets
title_full_unstemmed HoSeIn: A Workflow for Integrating Various Homology Search Results from Metagenomic and Metatranscriptomic Sequence Datasets
title_sort hosein: a workflow for integrating various homology search results from metagenomic and metatranscriptomic sequence datasets
publishDate 2020
url http://sedici.unlp.edu.ar/handle/10915/132750
work_keys_str_mv AT rozadillagaston hoseinaworkflowforintegratingvarioushomologysearchresultsfrommetagenomicandmetatranscriptomicsequencedatasets
AT moreirasclementejorgelina hoseinaworkflowforintegratingvarioushomologysearchresultsfrommetagenomicandmetatranscriptomicsequencedatasets
AT mccarthychristinaberyl hoseinaworkflowforintegratingvarioushomologysearchresultsfrommetagenomicandmetatranscriptomicsequencedatasets
bdutipo_str Repositorios
_version_ 1764820454296518656
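
The assumption stated at the end of the description can be illustrated with a short sketch. This is not the HoSeIn implementation: the function name `integrate_assignments`, the dictionary inputs (read ID mapped to a taxon label, or `None` for "no hits"), and the decision to set aside reads on which the two searches disagree are all hypothetical choices made for this example. The record does not say how the workflow treats conflicting reads.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple


def integrate_assignments(
    nucleotide_hits: Dict[str, Optional[str]],
    protein_hits: Dict[str, Optional[str]],
) -> Tuple[Dict[str, Set[str]], Set[str]]:
    """Combine per-read taxon labels from two homology searches.

    Each input maps a read ID to a taxon label, or to None when the
    search returned no hits (hypothetical input format).
    A read is attributed to a taxon when either
      (i)  both searches assigned it to that taxon, or
      (ii) one search assigned it to that taxon and the other had no hits.
    """
    by_taxon: Dict[str, Set[str]] = defaultdict(set)
    unresolved: Set[str] = set()

    for read_id in set(nucleotide_hits) | set(protein_hits):
        nt_taxon = nucleotide_hits.get(read_id)
        aa_taxon = protein_hits.get(read_id)

        if nt_taxon is None and aa_taxon is None:
            # No hits in either search: nothing to attribute.
            unresolved.add(read_id)
        elif nt_taxon is None:
            # Case (ii): only the protein search produced a label.
            by_taxon[aa_taxon].add(read_id)
        elif aa_taxon is None:
            # Case (ii): only the nucleotide search produced a label.
            by_taxon[nt_taxon].add(read_id)
        elif nt_taxon == aa_taxon:
            # Case (i): both searches agree.
            by_taxon[nt_taxon].add(read_id)
        else:
            # The searches disagree. Setting the read aside is an
            # assumption of this sketch, not a documented HoSeIn rule.
            unresolved.add(read_id)

    return dict(by_taxon), unresolved


# Toy labels standing in for parsed BLASTN (nucleotide) and BLASTX (protein) results.
blastn_labels = {"read1": "Bacteria", "read2": "Fungi", "read3": None, "read4": "Bacteria"}
blastx_labels = {"read1": "Bacteria", "read2": None, "read3": "Fungi", "read4": "Fungi"}
by_taxon, unresolved = integrate_assignments(blastn_labels, blastx_labels)
```

In this toy input, `read1` falls under case (i), `read2` and `read3` fall under case (ii), and `read4` is set aside because the two searches disagree.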