Improving cluster visualization in self-organizing maps: application in gene expression data analysis

Cluster analysis is one of the crucial steps in gene expression pattern (GEP) analysis. It leads to the discovery or identification of temporal patterns and coexpressed genes. GEP analysis involves high-dimensional multivariate data, which demand appropriate tools. A good alternative for grouping m...

Bibliographic Details
Main authors: Fernández, Elmer Andrés, Balzarini, Mónica
Format: Article
Language: Spanish
Published: 2007
Subjects:
Online access: http://pa.bibdigital.ucc.edu.ar/4014/1/A_Fern%C3%A1ndez_Balzarini.pdf
Contributed by:
id I38-R144-4014
record_format dspace
spelling I38-R144-4014 2025-04-10T17:32:06Z http://pa.bibdigital.ucc.edu.ar/4014/ Improving cluster visualization in self-organizing maps: application in gene expression data analysis Fernández, Elmer Andrés Balzarini, Mónica TA Engineering (General). Civil engineering (General) Cluster analysis is one of the crucial steps in gene expression pattern (GEP) analysis. It leads to the discovery or identification of temporal patterns and coexpressed genes. GEP analysis involves high-dimensional multivariate data, which demand appropriate tools. A good alternative for grouping many multidimensional objects is self-organizing maps (SOM), an unsupervised neural network algorithm able to find relationships among data. SOM groups the data and maps them topologically. However, it may be difficult to identify clusters with the usual visualization tools for SOM. We propose a simple algorithm to identify and visualize clusters in SOM (the RP-Q method). The RP is a new node-adaptive attribute that moves in a two-dimensional virtual space, imitating the movement of the codebook vectors of the SOM net in the input space. The Q statistic evaluates the SOM structure, providing an estimate of the number of clusters underlying the data set. The SOM-RP-Q algorithm permits the visualization of clusters in the SOM and their node patterns. The algorithm was evaluated on several simulated and real GEP data sets. Results show that the proposed algorithm successfully displays the underlying cluster structure directly from the SOM and is robust to different net sizes.
2007-12-31 info:eu-repo/semantics/article info:eu-repo/semantics/closedAccess application/pdf spa http://pa.bibdigital.ucc.edu.ar/4014/1/A_Fern%C3%A1ndez_Balzarini.pdf Fernández, Elmer Andrés (ORCID: https://orcid.org/0000-0002-4711-8634) and Balzarini, Mónica (ORCID: https://orcid.org/0000-0002-4858-4637) (2007) Improving cluster visualization in self-organizing maps: application in gene expression data analysis. Computers in Biology and Medicine, 37 (12). pp. 1677-1689. ISSN 0010-4825 info:eu-repo/semantics/altIdentifier/doi/10.1016/j.compbiomed.2007.04.003
institution Universidad Católica de Córdoba
institution_str I-38
repository_str R-144
collection Producción Académica Universidad Católica de Córdoba (UCCor)
language Spanish
orig_language_str_mv spa
topic TA Engineering (General). Civil engineering (General)
spellingShingle TA Engineering (General). Civil engineering (General)
Fernández, Elmer Andrés
Balzarini, Mónica
Improving cluster visualization in self-organizing maps: application in gene expression data analysis
topic_facet TA Engineering (General). Civil engineering (General)
description Cluster analysis is one of the crucial steps in gene expression pattern (GEP) analysis. It leads to the discovery or identification of temporal patterns and coexpressed genes. GEP analysis involves high-dimensional multivariate data, which demand appropriate tools. A good alternative for grouping many multidimensional objects is self-organizing maps (SOM), an unsupervised neural network algorithm able to find relationships among data. SOM groups the data and maps them topologically. However, it may be difficult to identify clusters with the usual visualization tools for SOM. We propose a simple algorithm to identify and visualize clusters in SOM (the RP-Q method). The RP is a new node-adaptive attribute that moves in a two-dimensional virtual space, imitating the movement of the codebook vectors of the SOM net in the input space. The Q statistic evaluates the SOM structure, providing an estimate of the number of clusters underlying the data set. The SOM-RP-Q algorithm permits the visualization of clusters in the SOM and their node patterns. The algorithm was evaluated on several simulated and real GEP data sets. Results show that the proposed algorithm successfully displays the underlying cluster structure directly from the SOM and is robust to different net sizes.
format Article
author Fernández, Elmer Andrés
Balzarini, Mónica
author_facet Fernández, Elmer Andrés
Balzarini, Mónica
author_sort Fernández, Elmer Andrés
title Improving cluster visualization in self-organizing maps: application in gene expression data analysis
title_short Improving cluster visualization in self-organizing maps: application in gene expression data analysis
title_full Improving cluster visualization in self-organizing maps: application in gene expression data analysis
title_fullStr Improving cluster visualization in self-organizing maps: application in gene expression data analysis
title_full_unstemmed Improving cluster visualization in self-organizing maps: application in gene expression data analysis
title_sort improving cluster visualization in self-organizing maps: application in gene expression data analysis
publishDate 2007
url http://pa.bibdigital.ucc.edu.ar/4014/1/A_Fern%C3%A1ndez_Balzarini.pdf
work_keys_str_mv AT fernandezelmerandres improvingclustervisualizationinselforganizingmapsapplicationingeneexpressiondataanalysis
AT balzarinimonica improvingclustervisualizationinselforganizingmapsapplicationingeneexpressiondataanalysis
_version_ 1832592428774719488