Detecting influential observations in principal components and common principal components

Detecting outlying observations is an important step in any analysis, even when robust estimates are used. In particular, the robustified Mahalanobis distance is a natural measure of outlyingness if one focuses on ellipsoidal distributions. However, it is well known that the asymptotic chi-square ap...

Bibliographic Details
Main authors: Boente, G., Pires, A.M., Rodrigues, I.M.
Format: JOUR
Online access: http://hdl.handle.net/20.500.12110/paper_01679473_v54_n12_p2967_Boente
id todo:paper_01679473_v54_n12_p2967_Boente
record_format dspace
spelling todo:paper_01679473_v54_n12_p2967_Boente 2023-10-03T15:05:37Z Affiliation: Boente, G. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.
JOUR info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/2.5/ar http://hdl.handle.net/20.500.12110/paper_01679473_v54_n12_p2967_Boente
institution Universidad de Buenos Aires
institution_str I-28
repository_str R-134
collection Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic Common principal components
Detection of outliers
Influence functions
Robust estimation
Multivariable systems
Normal distribution
Asymptotic distributions
Influential observations
Mahalanobis distances
Minimum covariance determinant
Minimum volume ellipsoids
Principal Components
Method of moments
description Detecting outlying observations is an important step in any analysis, even when robust estimates are used. In particular, the robustified Mahalanobis distance is a natural measure of outlyingness if one focuses on ellipsoidal distributions. However, it is well known that the asymptotic chi-square approximation for the cutoff value of the Mahalanobis distance based on several robust estimates (like the minimum volume ellipsoid, the minimum covariance determinant and the S-estimators) is not adequate for detecting atypical observations in small samples from the normal distribution. In the multi-population setting and under a common principal components model, aggregated measures based on standardized empirical influence functions are used to detect observations with a significant impact on the estimators. As in the one-population setting, the cutoff values obtained from the asymptotic distribution of those aggregated measures are not adequate for small samples. More appropriate cutoff values, adapted to the sample sizes, can be computed by using a cross-validation approach. Cutoff values obtained from a Monte Carlo study using S-estimators are provided for illustration. A real data set is also analyzed. © 2010 Elsevier B.V. All rights reserved.
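The small-sample problem described in the abstract can be illustrated with a short simulation. The sketch below is not the authors' procedure: it assumes bivariate standard normal data, replaces the S-estimators with a crude minimum covariance determinant approximation (the best of a few random half-samples, with no consistency or small-sample correction), and obtains a cutoff by plain Monte Carlo simulation rather than by the cross-validation approach the abstract mentions. All function names and parameter values (`crude_mcd`, `simulate`, `n=20`, `reps=200`) are hypothetical choices made for illustration.

```python
import math
import random


def mean_cov_2d(points):
    """Sample mean and covariance entries (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_distances(points, center, cov):
    """Squared Mahalanobis distances via the closed-form 2x2 inverse."""
    (mx, my), (sxx, sxy, syy) = center, cov
    det = sxx * syy - sxy ** 2
    out = []
    for x, y in points:
        dx, dy = x - mx, y - my
        out.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return out


def crude_mcd(points, rng, n_subsets=50):
    """Illustrative stand-in for a robust estimator: among random
    half-samples, keep the one whose covariance has the smallest
    determinant. No reweighting or correction factor is applied."""
    n = len(points)
    h = (n + 3) // 2  # floor((n + p + 1) / 2) with p = 2
    best = None
    for _ in range(n_subsets):
        center, cov = mean_cov_2d(rng.sample(points, h))
        det = cov[0] * cov[2] - cov[1] ** 2
        if best is None or det < best[0]:
            best = (det, center, cov)
    return best[1], best[2]


def simulate(n=20, reps=200, alpha=0.025, seed=0):
    """Compare the asymptotic chi-square cutoff with a simulated one."""
    rng = random.Random(seed)
    # Chi-square with 2 degrees of freedom has survival function
    # exp(-x / 2), so its upper-alpha quantile is -2 * log(alpha).
    chi2_cut = -2.0 * math.log(alpha)
    pooled = []
    for _ in range(reps):
        pts = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
        center, cov = crude_mcd(pts, rng)
        pooled.extend(sq_distances(pts, center, cov))
    pooled.sort()
    # Empirical (1 - alpha) quantile of the pooled robust distances.
    mc_cut = pooled[math.ceil((1.0 - alpha) * len(pooled)) - 1]
    # Share of clean normal observations flagged by the chi-square cutoff.
    flagged = sum(d > chi2_cut for d in pooled) / len(pooled)
    return chi2_cut, mc_cut, flagged


if __name__ == "__main__":
    chi2_cut, mc_cut, flagged = simulate()
    print(f"chi-square cutoff:        {chi2_cut:.3f}")
    print(f"simulated cutoff (n=20):  {mc_cut:.3f}")
    print(f"flagged by chi-square:    {flagged:.3f}")
```

With uncontaminated data, a well-calibrated cutoff should flag a share of observations close to `alpha`. In this sketch the chi-square cutoff flags noticeably more, and the simulated cutoff is larger, which is the kind of mismatch that motivates cutoffs adapted to the sample size. Part of the gap here comes from the missing consistency correction in the crude estimator, so the numbers should not be read as results for the estimators studied in the paper.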
format JOUR
author Boente, G.
Pires, A.M.
Rodrigues, I.M.
title Detecting influential observations in principal components and common principal components
url http://hdl.handle.net/20.500.12110/paper_01679473_v54_n12_p2967_Boente