In the framework of symbolic data analysis (SDA), distribution-valued data are defined as multivalued data, where each unit is described by a distribution (e.g., a histogram, a density, or a quantile function) of a quantitative variable. SDA provides several methods for analyzing multivalued data. Among these, principal component analysis (PCA) is the most relevant technique proposed for the dimensionality reduction of multivalued quantitative variables. This paper contributes to that line of work. Starting from new association measures for distributional variables, based on a specific metric for distributions (the squared Wasserstein distance), a PCA approach is proposed for distribution-valued data represented by quantile variables. An application of the proposed PCA method to simulated distribution-valued data yields factorial planes that can be interpreted in terms of the location, variability, and shape of the distributions.
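The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single distributional variable, represents each unit by its empirical quantiles on an evenly spaced grid, approximates the squared L2 Wasserstein distance as the mean squared difference between quantile vectors, and runs an ordinary PCA on those vectors. The function names (`quantile_vector`, `squared_wasserstein`, `quantile_pca`) and the grid size `n_levels` are hypothetical choices made for illustration.

```python
import numpy as np


def quantile_vector(sample, n_levels=99):
    """Represent a 1-D sample by its quantiles on an even grid in (0, 1).

    Assumption: the grid size n_levels is an illustrative choice.
    """
    levels = np.arange(1, n_levels + 1) / (n_levels + 1)
    return np.quantile(sample, levels)


def squared_wasserstein(q1, q2):
    """Approximate the squared L2 Wasserstein distance between two
    1-D distributions given as quantile vectors on the same grid.

    In one dimension this distance is the integral over (0, 1) of the
    squared difference of the quantile functions; the mean over an even
    grid is a simple numerical approximation of that integral.
    """
    return float(np.mean((np.asarray(q1) - np.asarray(q2)) ** 2))


def quantile_pca(samples, n_levels=99, n_components=2):
    """Ordinary PCA on quantile vectors (one row per unit).

    Assumption: this is a generic stand-in, not the association
    measures or the PCA procedure proposed in the paper.
    """
    Q = np.vstack([quantile_vector(s, n_levels) for s in samples])
    centered = Q - Q.mean(axis=0)
    # SVD of the centered matrix gives the principal axes (rows of Vt).
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    return scores, Vt[:n_components]
```

Under these assumptions, two units whose distributions differ only by a shift of 3 have a squared Wasserstein distance of 9, which illustrates why location differences show up clearly on the first factorial axes.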
Source (DOI): http://dx.doi.org/10.1109/TCYB.2015.2389653