In the framework of symbolic data analysis (SDA), distribution-valued data are defined as multivalued data, where each unit is described by a distribution (e.g., a histogram, a density, or a quantile function) of a quantitative variable. SDA provides different methods for analyzing multivalued data. Among them, one of the most relevant techniques proposed for the dimension reduction of multivalued quantitative variables is principal component analysis (PCA). This paper contributes to this area of analysis. Starting from new association measures for distributional variables based on a particular metric for distributions, the squared Wasserstein distance, a PCA approach is proposed for distribution-valued data represented by quantile variables. An application of the proposed PCA method to simulated distribution-valued data yields results on the factorial planes that can be interpreted in terms of the location, variability, and shape of the distributions.
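
As a rough illustration of the metric mentioned above: for one-dimensional distributions, the squared 2-Wasserstein distance equals the integral over (0, 1) of the squared difference between the two quantile functions. The sketch below approximates that integral from raw samples. It is not the paper's implementation; the function name `squared_wasserstein`, the parameter `n_levels`, and the midpoint grid of probability levels are assumptions made for this example.

```python
import numpy as np

def squared_wasserstein(x, y, n_levels=200):
    """Approximate the squared 2-Wasserstein distance between two 1-D samples.

    Hypothetical helper for illustration: it compares empirical quantile
    functions on a common grid of probability levels.
    """
    # Midpoints of n_levels equal-width bins on (0, 1); avoids the endpoints 0 and 1.
    t = (np.arange(n_levels) + 0.5) / n_levels
    qx = np.quantile(x, t)  # empirical quantile function of x
    qy = np.quantile(y, t)  # empirical quantile function of y
    # Midpoint-rule approximation of the integral of (Qx(t) - Qy(t))^2 over (0, 1).
    return float(np.mean((qx - qy) ** 2))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=1000)
b = a + 2.0        # same shape and variability, location shifted by 2
c = 3.0 * a        # same centre, larger variability

d_shift = squared_wasserstein(a, b)  # a pure location shift of 2 gives about 2**2
d_scale = squared_wasserstein(a, c)  # driven by the difference in spread
```

A pure shift in location changes every quantile by the same amount, so the distance reduces to the squared shift. Differences in variability and shape show up as quantile differences that vary with the probability level, which is why a quantile-based representation lends itself to interpretation in terms of location, variability, and shape.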

Source
http://dx.doi.org/10.1109/TCYB.2015.2389653
