Spectrochim Acta A Mol Biomol Spectrosc
December 2024
Near-infrared (NIR) spectral data are high-dimensional, redundant, and non-linear, and sample attributes such as producing area and grade also influence the data; all of these factors affect the similarity measure between samples. This paper proposes a t-distributed stochastic neighbor embedding algorithm based on the Sinkhorn distance (St-SNE), combined with multi-attribute data information. First, the Sinkhorn distance is introduced to address problems such as the asymmetry of the KL divergence and sparse data distribution in high-dimensional space, so that the probability distributions constructed in the low-dimensional space are similar to those in the high-dimensional space.
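For readers unfamiliar with the Sinkhorn distance, the following is a minimal sketch of the standard entropy-regularized optimal transport computation between two discrete distributions. It is not the St-SNE implementation from the paper; the function name `sinkhorn_distance` and the parameters `reg` and `n_iter` are illustrative assumptions.

```python
import math

def sinkhorn_distance(a, b, cost, reg=0.1, n_iter=200):
    """Entropy-regularized transport cost between histograms a and b.

    a, b  : lists of non-negative weights, each summing to 1
    cost  : cost[i][j] is the cost of moving mass from bin i of a to bin j of b
    reg   : regularization strength (assumed value, not from the paper)
    n_iter: number of Sinkhorn scaling iterations (assumed value)
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-C / reg)
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals a and b
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v); return the transport cost <P, C>
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))

# Usage: two-bin histograms with unit cost for moving mass between bins
C = [[0.0, 1.0], [1.0, 0.0]]
d_same = sinkhorn_distance([0.5, 0.5], [0.5, 0.5], C)
d_far = sinkhorn_distance([1.0, 0.0], [0.0, 1.0], C)
```

With a symmetric cost matrix, swapping the two distributions gives the same value once the iterations have converged, which is the property that contrasts with the asymmetric KL divergence.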