Covariance-Based Outlier Detection for Compositional Data with Structural Zeros: Application to Italian Survey of Household Income and Wealth Data

Last modified: 2013-06-16

#### Abstract

Outlier detection is an important task in the statistical analysis of multivariate data, because outliers often carry important information about the data structure. In compositional data, usually represented as proportions subject to a unit-sum constraint, the ratios between the parts (variables) contain the essential information. This inherent property is, however, incompatible with the presence of zeros in compositions. Here we consider structural zeros, i.e., zeros that are truly observed, as opposed to zeros that arise from measurement error (rounded zeros). To identify possible outliers in compositional data with structural zeros, we apply the Mahalanobis distance approach, where the key task is robust estimation of the covariance matrix. The resulting outlier detection procedure is applied to the Italian Survey of Household Income and Wealth (SHIW) data, collected by the Bank of Italy.
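As a rough illustration of the Mahalanobis distance approach on log-ratio coordinates, the following minimal sketch may help. It is not the procedure of the paper: the abstract does not say how structural zeros are handled, so the sketch simply leaves out rows containing zeros. The isometric log-ratio basis, the simplified concentration-step covariance estimate (started from the classical estimate, with no consistency correction), the subset size `h`, the 0.975 chi-square cutoff, and the toy data are all assumptions made here for illustration.

```python
import math

def ilr3(x):
    """Isometric log-ratio coordinates of a strictly positive 3-part composition."""
    a, b, c = x  # fails on zeros: log-ratios are undefined there
    z1 = math.sqrt(1 / 2) * math.log(a / b)
    z2 = math.sqrt(2 / 3) * math.log(math.sqrt(a * b) / c)
    return (z1, z2)

def mean_cov(pts):
    """Sample mean and 2x2 sample covariance, returned as (sxx, sxy, syy)."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in pts) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def sq_mahalanobis(p, center, cov):
    """Squared Mahalanobis distance in 2D, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def robust_sq_distances(pts, h, n_steps=10):
    """Simplified robust distances: repeatedly re-estimate the mean and
    covariance from the h points with the smallest current distances.
    This is an illustrative stand-in for a proper robust estimator."""
    center, cov = mean_cov(pts)  # assumption: start from the classical estimate
    for _ in range(n_steps):
        order = sorted(range(len(pts)),
                       key=lambda i: sq_mahalanobis(pts[i], center, cov))
        center, cov = mean_cov([pts[i] for i in order[:h]])
    return [sq_mahalanobis(p, center, cov) for p in pts]

# Made-up 3-part compositions (each sums to 1); the last row is deliberately atypical.
data = [
    (0.50, 0.30, 0.20), (0.48, 0.32, 0.20), (0.52, 0.28, 0.20),
    (0.50, 0.28, 0.22), (0.47, 0.30, 0.23), (0.53, 0.30, 0.17),
    (0.49, 0.33, 0.18), (0.51, 0.27, 0.22), (0.50, 0.31, 0.19),
    (0.05, 0.05, 0.90),
]

# Assumption: rows with zeros are set aside rather than modelled.
rows = [x for x in data if min(x) > 0]
coords = [ilr3(x) for x in rows]

distances = robust_sq_distances(coords, h=8)

# 0.975 quantile of the chi-square distribution with 2 degrees of freedom;
# for 2 degrees of freedom the quantile has the closed form -2 * ln(1 - p).
cutoff = -2 * math.log(1 - 0.975)
flags = [d > cutoff for d in distances]
```

Because the covariance here is estimated from a subset without any consistency correction, the distances tend to be inflated and some regular observations may also exceed the cutoff; a well-tested robust estimator would be the better choice in practice.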
