This article studies the large-scale subspace clustering (LSC) problem with millions of data points. Many popular subspace clustering methods cannot directly handle the LSC problem, although they are considered state of the art on small-scale data. The main reason is that these methods often use all data points as one large dictionary to build huge coding models, which results in high time and space complexity. In this article, we develop a learnable subspace clustering paradigm to solve the LSC problem efficiently. The key concept is to learn a parametric function that partitions the high-dimensional subspaces into their underlying low-dimensional subspaces, instead of relying on the computationally demanding classical coding models. Moreover, we propose a unified, robust, predictive coding machine (RPCM) to learn the parametric function, which can be solved by an alternating minimization algorithm. We also provide a bounded contraction analysis of the parametric function. To the best of our knowledge, this article is the first work among subspace clustering methods to efficiently cluster millions of data points. Experiments on million-scale data sets verify that our paradigm outperforms related state-of-the-art methods in both efficiency and effectiveness.
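As a rough illustration of the space-complexity argument, the sketch below compares the memory needed to store a dense coefficient matrix when all n data points serve as the dictionary with the memory needed for a small parametric map. It is back-of-the-envelope arithmetic, not the RPCM itself: the function names, the dense float64 storage, the linear form of the map, and the values of `n_features` and `n_clusters` are all assumptions made for illustration and do not come from the article.

```python
def dense_coefficient_bytes(n_points: int, bytes_per_value: int = 8) -> int:
    """Bytes for a dense n x n coefficient matrix (assumption: float64 entries).

    Using every data point as a dictionary atom gives each point one
    coefficient per other point, hence n * n stored values.
    """
    return n_points * n_points * bytes_per_value


def linear_map_bytes(n_features: int, n_clusters: int, bytes_per_value: int = 8) -> int:
    """Bytes for a hypothetical linear parametric map with d x k weights.

    The linear form is an assumption for illustration only; the article's
    actual parametric function is not specified in the abstract.
    """
    return n_features * n_clusters * bytes_per_value


if __name__ == "__main__":
    n = 1_000_000          # one million data points
    d, k = 784, 10         # hypothetical feature dimension and cluster count

    coding = dense_coefficient_bytes(n)
    parametric = linear_map_bytes(d, k)

    print(f"dense n x n coefficients: {coding / 1e12:.1f} TB")
    print(f"d x k parametric map:     {parametric / 1e3:.1f} kB")
```

Under these assumptions, one million points already require 8 TB for the coefficient matrix alone, whereas the size of the parametric map does not grow with the number of data points.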
DOI: http://dx.doi.org/10.1109/TNNLS.2020.3040379