Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices
The problem of incomplete data matrices is repeatedly found in large databases, posing a significant obstacle to the effective treatment of data. This paper examines a self-organizing-map (SOM) based method of data imputation, under the concept of distance object per one weight, to predict physicoche...
| Main authors: | Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge |
|---|---|
| Format: | publishedVersion Article |
| Language: | English |
| Published: | Elsevier Science Bv, 2015 |
| Subjects: | |
| Online access: | https://ri.unsam.edu.ar/handle/123456789/1009 |
| Contributed by: | |

| id | I78-R216-123456789-1009 |
|---|---|
| record_format | dspace |
| spelling | I78-R216-123456789-10092023-03-27T21:04:50Z Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices Folguera, Laura Zupan, Jure Cicerone, Daniel Magallanes, Jorge CHEMOMETRICS ARTIFICIAL NEURAL NETWORK SELF-ORGANIZING MAPS MISSING DATA IMPUTATION ENVIRONMENTAL DATA SET CIENCIAS QUÍMICAS CIENCIAS EXACTAS Y NATURALES info:eu-repo/semantics/publishedVersion The problem of incomplete data matrices is repeatedly found in large databases; posing a significant obstacle for an effective treatment of data. This paper examines a self-organizing-map (SOM) based method of data imputation under the concept of distance object per one weight; to predict physicochemical parameters of water samples in a data set where concentrations of different analytes were missed. The method was evaluated according to two different possibilities: (a) including vectors of samples with and without missing data in the training data set and (b) pre-training a SOM for a data set with no missing values and then making imputations for a second data set (prediction set) of samples with missing values. Evaluations were made using a surface water data set of 270 samples from Reconquista River; in Buenos Aires Province; Argentina; by artificially setting a range of 17% to 39% of the data to missing. Results were compared to imputations made through professional criteria. SOMs gave reasonable estimates; with no statistically significant differences from estimates made through professional criteria; proving thus to be a suitable time-saving imputation method. Fil: Laura Folguera. Universidad Nacional de San Martín. Instituto de Investigación e Ingeniería Ambiental; Buenos Aires. Argentina. Fil: Jure Zupan. National Institute of Chemistry; Ljubljana. Slovenia. Fil: Daniel Cicerone. Universidad Nacional de San Martín. Instituto de Investigación e Ingeniería Ambiental; Buenos Aires. Argentina. Fil: Jorge Magallanes. Universidad Nacional de San Martín. Instituto de Investigación e Ingeniería Ambiental; Buenos Aires. Argentina. 2015-03 info:eu-repo/semantics/article info:ar-repo/semantics/artículo Folguera, L. et al (2015). Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices. En: Chemometrics and Intelligent Laboratory Systems. Elsevier Science 143, 146-151 0169-7439 https://ri.unsam.edu.ar/handle/123456789/1009 eng info:eu-repo/semantics/restrictedAccess http://creativecommons.org/licenses/by-nc-sa/2.5/ar/ Creative Commons Atribución-NoComercial-CompartirIgual 2.5 Argentina (CC BY-NC-SA 2.5) application/pdf pp. 146-151 application/pdf Elsevier Science Bv Chemometrics and Intelligent Laboratory Systems. 143: 146-151 (2015) Elsevier B.V. http://dx.doi.org/10.1016/j.chemolab.2015.03.002 |
| institution | Universidad Nacional de General San Martín |
| institution_str | I-78 |
| repository_str | R-216 |
| collection | Repositorio Institucional de la UNSAM |
| language | English |
| topic | CHEMOMETRICS ARTIFICIAL NEURAL NETWORK SELF-ORGANIZING MAPS MISSING DATA IMPUTATION ENVIRONMENTAL DATA SET CIENCIAS QUÍMICAS CIENCIAS EXACTAS Y NATURALES |
| spellingShingle | CHEMOMETRICS ARTIFICIAL NEURAL NETWORK SELF-ORGANIZING MAPS MISSING DATA IMPUTATION ENVIRONMENTAL DATA SET CIENCIAS QUÍMICAS CIENCIAS EXACTAS Y NATURALES Folguera, Laura Zupan, Jure Cicerone, Daniel Magallanes, Jorge Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices |
| topic_facet | CHEMOMETRICS ARTIFICIAL NEURAL NETWORK SELF-ORGANIZING MAPS MISSING DATA IMPUTATION ENVIRONMENTAL DATA SET CIENCIAS QUÍMICAS CIENCIAS EXACTAS Y NATURALES |
| description | The problem of incomplete data matrices is repeatedly found in large databases, posing a significant obstacle to the effective treatment of data. This paper examines a self-organizing-map (SOM) based method of data imputation, under the concept of distance object per one weight, to predict physicochemical parameters of water samples in a data set where concentrations of different analytes were missing. The method was evaluated in two configurations: (a) including vectors of samples with and without missing data in the training data set, and (b) pre-training a SOM on a data set with no missing values and then making imputations for a second data set (prediction set) of samples with missing values. Evaluations were made using a surface water data set of 270 samples from the Reconquista River, in Buenos Aires Province, Argentina, by artificially setting 17% to 39% of the data to missing. Results were compared to imputations made through professional criteria. SOMs gave reasonable estimates, with no statistically significant differences from estimates made through professional criteria, thus proving to be a suitable time-saving imputation method. |
| format | publishedVersion Article Article |
| author | Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge |
| author_facet | Folguera, Laura; Zupan, Jure; Cicerone, Daniel; Magallanes, Jorge |
| author_sort | Folguera, Laura |
| title | Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices |
| title_short | Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices |
| title_full | Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices |
| title_fullStr | Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices |
| title_full_unstemmed | Self-Organizing Maps for Imputation of Missing Data in Incomplete Data Matrices |
| title_sort | self-organizing maps for imputation of missing data in incomplete data matrices |
| publisher | Elsevier Science Bv |
| publishDate | 2015 |
| url | https://ri.unsam.edu.ar/handle/123456789/1009 |
| work_keys_str_mv | AT folgueralaura selforganizingmapsforimputationofmissingdatainincompletedatamatrices AT zupanjure selforganizingmapsforimputationofmissingdatainincompletedatamatrices AT ciceronedaniel selforganizingmapsforimputationofmissingdatainincompletedatamatrices AT magallanesjorge selforganizingmapsforimputationofmissingdatainincompletedatamatrices |
| _version_ | 1765722562247000064 |
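
The description field summarizes SOM-based imputation without giving the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the distance between a sample and a neuron is computed over the observed components only and averaged per compared weight (one possible reading of "distance object per one weight", which the record does not confirm), and that a missing value is filled with the corresponding weight of the best-matching neuron. All names (`masked_distance`, `train_som`, `impute`), the grid size, the learning-rate schedule, and the neighbourhood schedule are hypothetical choices made for illustration.

```python
import math
import random


def masked_distance(x, w):
    """Mean squared difference over the observed (non-None) components of x.

    Assumption: dividing by the number of compared weights keeps samples
    with different numbers of missing values comparable.
    """
    pairs = [(xi, wi) for xi, wi in zip(x, w) if xi is not None]
    if not pairs:
        return float("inf")
    return sum((xi - wi) ** 2 for xi, wi in pairs) / len(pairs)


def best_matching_unit(x, weights):
    """Index of the neuron whose weights are closest to x (masked distance)."""
    return min(range(len(weights)), key=lambda j: masked_distance(x, weights[j]))


def train_som(data, rows=3, cols=3, epochs=200, seed=0):
    """Train a small rectangular SOM; None marks a missing value.

    Assumes every variable is observed in at least one sample.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each weight uniformly within the observed range of its variable.
    lo, hi = [], []
    for k in range(dim):
        observed = [s[k] for s in data if s[k] is not None]
        lo.append(min(observed))
        hi.append(max(observed))
    weights = [[rng.uniform(lo[k], hi[k]) for k in range(dim)]
               for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1 - frac) + 0.01                      # illustrative schedule
        radius = max(rows, cols) / 2 * (1 - frac) + 0.5   # illustrative schedule
        for x in rng.sample(data, len(data)):             # shuffled pass
            br, bc = coords[best_matching_unit(x, weights)]
            for j, (r, c) in enumerate(coords):
                d = math.hypot(r - br, c - bc)
                if d > radius:
                    continue
                h = lr * math.exp(-(d * d) / (2 * radius * radius))
                for k in range(dim):
                    if x[k] is not None:                  # skip missing components
                        weights[j][k] += h * (x[k] - weights[j][k])
    return weights


def impute(x, weights):
    """Fill the missing components of x from its best-matching neuron."""
    w = weights[best_matching_unit(x, weights)]
    return [wi if xi is None else xi for xi, wi in zip(x, w)]


# Toy usage with made-up numbers (not data from the paper).
samples = [[0.1, 0.2, 0.0], [0.3, None, 0.1], [9.8, 10.1, 10.0], [10.2, 9.9, None]]
som = train_som(samples, rows=2, cols=2, epochs=100)
print(impute([0.2, None, 0.1], som))
```

Passing both complete and incomplete samples to `train_som` loosely corresponds to configuration (a) in the description. Training on complete samples only and then calling `impute` on a separate set of incomplete samples loosely corresponds to configuration (b).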