Environmental sciences usually deal with compositional (closed) data. Whenever the concentration of chemical elements is measured, the data will be closed, i.e. the relevant information is contained in the ratios between the variables rather than in the data values reported for the variables. Data closure has severe consequences for statistical data analysis. Most classical statistical methods are based on the usual Euclidean geometry - compositional data, however, do not plot into Euclidean space because they have their own geometry which is not linear but curved in the Euclidean sense. This has severe consequences for bivariate statistical analysis: correlation coefficients computed in the traditional way are likely to be misleading, and the information contained in scatterplots must be used and interpreted differently from sets of non-compositional data. As a solution, the ilr transformation applied to a variable pair can be used to display the relationship and to compute a measure of stability. This paper discusses how this measure is related to the usual correlation coefficient and how it can be used and interpreted. Moreover, recommendations are provided for how the scatterplot can still be used, and which alternatives exist for displaying the relationship between two variables.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1016/j.scitotenv.2010.05.011 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!