Ranking data, which result from m raters ranking n items, are difficult to visualize due to their discrete algebraic structure, and the computational difficulties associated with them when n is large. This problem becomes worse when raters provide tied rankings or not all items are ranked. We develop an approach for the visualization of ranking data for large n which is intuitive, easy to use, and computationally efficient.
View Article and Find Full Text PDFDocuments and other categorical valued time series are often characterized by the frequencies of short range sequential patterns such as n-grams. This representation converts sequential data of varying lengths to high dimensional histogram vectors which are easily modeled by standard statistical models. Unfortunately, the histogram representation ignores most of the medium and long range sequential dependencies making it unsuitable for visualizing sequential data.
View Article and Find Full Text PDFMany algorithms in machine learning rely on being given a good distance metric over the input space. Rather than using a default metric such as the Euclidean metric, it is desirable to obtain a metric based on the provided data. We consider the problem of learning a Riemannian metric associated with a given differentiable manifold and a set of points.
View Article and Find Full Text PDF