Taxonomic identity resolution of highly phylogenetically related strains and selection of phylogenetic markers by using genome-scale methods: the Bacillus pumilus group case
Bacillus pumilus group strains have been studied due to their agronomic, biotechnological or pharmaceutical potential. Classifying strains of this taxonomic group at species level is a challenging procedure since it is composed of seven species that share among them over 99.5% of 16S rRNA gene ident...
Main authors: | Espariz, Martín; Zuljan, Federico A.; Esteban, Luis; Magni, Christian |
---|---|
Format: | publishedVersion |
Language: | English |
Published: | Public Library of Science (PLOS), 2021 |
Subjects: | Bacillus pumilus; Classification; Computational Biology; Random Forest Algorithm; Phylogeny |
Online access: | http://hdl.handle.net/2133/20199 |
Contributed by: |
id |
I15-R121-2133-20199 |
---|---|
record_format |
dspace |
institution |
Universidad Nacional de Rosario |
institution_str |
I-15 |
repository_str |
R-121 |
collection |
Repositorio Hipermedial de la Universidad Nacional de Rosario (UNR) |
language |
English |
orig_language_str_mv |
eng |
topic |
Bacillus pumilus; Classification; Computational Biology; Random Forest Algorithm; Phylogeny |
topic_facet |
Bacillus pumilus; Classification; Computational Biology; Random Forest Algorithm; Phylogeny |
description |
Bacillus pumilus group strains have been studied due to their agronomic, biotechnological or pharmaceutical potential. Classifying strains of this taxonomic group at the species level is challenging, since the group is composed of seven species that share over 99.5% 16S rRNA gene identity. In this study, a whole-genome in silico approach was first used to accurately demarcate B. pumilus group strains, as a case of highly phylogenetically related taxa, at the species level. To achieve this, and consequently to validate or correct the taxonomic identities of genomes in public databases, an average nucleotide identity correlation analysis, a core-based phylogenomic analysis and a gene function repertoire analysis were performed. Ultimately, more than 50% of these genomes were found to be misclassified. Hierarchical clustering of gene functional repertoires was also used to infer ecotypes among B. pumilus group species. Furthermore, for the first time, the machine-learning algorithm Random Forest was used to rank genes in order of their importance for species classification. We found that ybbP, a gene involved in the synthesis of cyclic di-AMP, was the most important gene for accurately predicting species identity among B. pumilus group strains. Finally, principal component analysis was used to classify strains based on the distances between their ybbP genes. The methodologies described could be utilized more broadly to identify other highly phylogenetically related species in metagenomic or epidemiological assessments. |
format |
publishedVersion |
author |
Espariz, Martín; Zuljan, Federico A.; Esteban, Luis; Magni, Christian |
author_facet |
Espariz, Martín; Zuljan, Federico A.; Esteban, Luis; Magni, Christian |
author_sort |
Espariz, Martín |
title |
Taxonomic identity resolution of highly phylogenetically related strains and selection of phylogenetic markers by using genome-scale methods: the Bacillus pumilus group case |
publisher |
Public Library of Science (PLOS) |
publishDate |
2021 |
url |
http://hdl.handle.net/2133/20199 |
bdutipo_str |
Repositorios |
_version_ |
1764820410819411968 |