Imputation of Missing Values for Compositional Data Based on Principal Component Analysis
-
Abstract
In this paper, taking into account the special geometry of compositional data, we introduce two new methods for estimating missing values in compositional data. The first method uses the mean in the simplex space: it finds the k nearest neighbors of an incomplete observation based on the Aitchison distance and combines them using the two basic operations on the simplex, perturbation and powering. The second proposal is a principal component regression imputation method, which is initialized with the result of the proposed simplex-mean method. It applies the isometric log-ratio (ilr) transformation to the compositional data set and then performs principal component regression in the transformed space. The proposed methods are tested on real and simulated data sets, and the results show that they work well.
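For readers unfamiliar with the simplex operations named above, the following is a minimal sketch of their standard definitions (closure, perturbation, powering, the Aitchison distance computed through the centered log-ratio, and the resulting mean in the simplex). It is not the imputation procedure itself: neighbor selection and the handling of missing parts are not shown, all function names are hypothetical, and strictly positive parts are assumed.

```python
import numpy as np

def closure(x):
    """Rescale parts so that each composition sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def perturb(x, y):
    """Perturbation: closed component-wise product (the simplex analogue of addition)."""
    return closure(np.asarray(x, dtype=float) * np.asarray(y, dtype=float))

def power(x, a):
    """Powering: closed component-wise power (the simplex analogue of scalar multiplication)."""
    return closure(np.asarray(x, dtype=float) ** a)

def clr(x):
    """Centered log-ratio transform; assumes strictly positive parts."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def aitchison_distance(x, y):
    """Aitchison distance, computed as the Euclidean distance between clr coordinates."""
    return float(np.linalg.norm(clr(x) - clr(y)))

def simplex_mean(X):
    """Mean in the simplex: perturb all rows together, then power by 1/n."""
    X = np.asarray(X, dtype=float)
    acc = np.ones(X.shape[1])
    for row in X:
        acc = perturb(acc, row)
    return power(acc, 1.0 / X.shape[0])
```

Under these definitions the simplex mean coincides with the closed component-wise geometric mean, and the Aitchison distance is unchanged when either composition is rescaled by a positive constant.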
-
-