Multivariate location and scatter matrix estimation under cellwise and casewise contamination
Real data may contain both cellwise outliers and casewise outliers. There is a vast literature on robust estimation for casewise outliers, but only a scant literature for cellwise outliers and almost none for both types of outliers. Estimation of multivariate location and scatter matrix is a corner...
Main authors: | Leung, A.; Yohai, V.; Zamar, R. |
---|---|
Format: | JOUR |
Subjects: | |
Online access: | http://hdl.handle.net/20.500.12110/paper_01679473_v111_n_p59_Leung |
Contributed by: |
id |
todo:paper_01679473_v111_n_p59_Leung |
---|---|
record_format |
dspace |
spelling |
todo:paper_01679473_v111_n_p59_Leung; 2023-10-03T15:05:34Z; Multivariate location and scatter matrix estimation under cellwise and casewise contamination; Leung, A.; Yohai, V.; Zamar, R.; Cellwise outliers; Componentwise contamination; Multivariate location and scatter; Robust estimation; Location; Matrix algebra; Multivariant analysis; Componentwise; Multivariate data analysis; Robust procedures; Simulation studies; Two-step approach; Two-step procedure; Statistics; © 2017 Elsevier B.V.; JOUR; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/2.5/ar; http://hdl.handle.net/20.500.12110/paper_01679473_v111_n_p59_Leung |
institution |
Universidad de Buenos Aires |
institution_str |
I-28 |
repository_str |
R-134 |
collection |
Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA) |
topic |
Cellwise outliers; Componentwise contamination; Multivariate location and scatter; Robust estimation; Location; Matrix algebra; Multivariant analysis; Componentwise; Multivariate data analysis; Robust procedures; Simulation studies; Two-step approach; Two-step procedure; Statistics |
description |
Real data may contain both cellwise and casewise outliers. There is a vast literature on robust estimation for casewise outliers, but only a scant literature for cellwise outliers and almost none for both types together. Estimation of multivariate location and scatter matrix is a cornerstone of multivariate data analysis. A two-step approach was recently proposed to perform robust estimation of multivariate location and scatter matrix in the presence of cellwise and casewise outliers. In the first step, a univariate filter is applied to remove cellwise outliers. In the second step, a generalized S-estimator is used to downweight casewise outliers. This proposal can be further improved in three main directions. First, through the introduction of a consistent bivariate filter to be used in combination with the univariate filter in the first step. Second, through the proposal of a new fast subsampling procedure to generate starting points for the generalized S-estimator in the second step. Third, through the use of a non-monotonic weight function for the generalized S-estimator to better handle casewise outliers in high dimensions. A simulation study and a real data example show that, unlike the original two-step procedure, the modified two-step approach performs and scales well in high dimensions. Moreover, they show that the modified procedure outperforms the original one and other state-of-the-art robust procedures under cellwise and casewise data contamination. © 2017 Elsevier B.V. |
format |
JOUR |
author |
Leung, A.; Yohai, V.; Zamar, R. |
title |
Multivariate location and scatter matrix estimation under cellwise and casewise contamination |
url |
http://hdl.handle.net/20.500.12110/paper_01679473_v111_n_p59_Leung |
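
The first step summarized in the description above (a univariate filter that removes cellwise outliers before a casewise-robust estimator is applied) can be illustrated with a minimal sketch. This is not the filter proposed by the authors: it assumes a simple stand-in rule that flags a cell when its robust z-score (median and MAD, computed per column) exceeds a fixed cutoff. The function name `univariate_filter` and the `cutoff` parameter are hypothetical, chosen only for this illustration. The second step (the generalized S-estimator, which works on the resulting incomplete data) is not reproduced here.

```python
import numpy as np


def univariate_filter(X, cutoff=3.0):
    """Return a copy of X with suspected cellwise outliers set to NaN.

    Illustrative stand-in only: a cell is flagged when its robust z-score,
    based on the column median and the normal-consistent MAD, exceeds
    `cutoff`. The record's paper uses a different filter.
    """
    X = np.asarray(X, dtype=float)
    # Column-wise robust location (median) and scale (MAD), ignoring NaNs.
    med = np.nanmedian(X, axis=0)
    mad = 1.4826 * np.nanmedian(np.abs(X - med), axis=0)
    # A zero MAD would divide by zero; use infinity so that column flags nothing.
    scale = np.where(mad > 0, mad, np.inf)
    z = np.abs(X - med) / scale
    filtered = X.copy()
    # Comparisons with NaN are False, so already-missing cells stay untouched.
    filtered[z > cutoff] = np.nan
    return filtered


# Small deterministic example: three identical columns 0..9, one corrupted cell.
X_demo = np.tile(np.arange(10.0), (3, 1)).T
X_demo[0, 1] = 1000.0
F_demo = univariate_filter(X_demo)
```

In this sketch only the corrupted cell is replaced by NaN and the rest of its row is kept, which is the point of cellwise filtering: a single bad cell does not force the whole case to be discarded or downweighted.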