In this paper, we propose a novel network block, dubbed the second-order spectral transform block, for 3D shape retrieval and classification. The block generalizes second-order pooling to 3D surfaces by applying a learnable non-linear transform to the spectrum of the pooled descriptor. It consists of two components. First, second-order average-pooling (SO-Avr) and max-pooling (SOMax) operations are designed on 3D surfaces to aggregate local descriptors; these are shown to be more discriminative than the popular average-pooling or max-pooling. Second, a learnable spectral transform, parameterized by a mixture of power functions, performs non-linear feature mapping in the space of pooled descriptors, i.e., the manifold of symmetric positive definite matrices for SO-Avr and the space of symmetric matrices for SOMax. The proposed block can be plugged into existing network architectures to aggregate local shape descriptors and boost their performance. We apply it to a shallow network for nonrigid 3D shape analysis and to existing networks for rigid shape analysis, where it improves the first-tier retrieval accuracy by 7.2% on the SHREC'14 Real dataset and achieves state-of-the-art classification accuracy on ModelNet40. As an extension, we apply our block to 2D image classification, where it outperforms traditional second-order pooling methods. We also provide theoretical and experimental analysis of the stability of the proposed second-order spectral transform block.
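The two components described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the element-wise form of the max-pooling, the signed treatment of negative eigenvalues, and the fixed `weights` and `exponents` (which would be learned in the actual block) are all assumptions made for illustration.

```python
import numpy as np


def second_order_avg_pool(X):
    """Average of outer products x_i x_i^T over n local descriptors.

    X has shape (n, d); the result is a (d, d) symmetric positive
    semi-definite matrix.
    """
    return X.T @ X / X.shape[0]


def second_order_max_pool(X):
    """Element-wise maximum of outer products x_i x_i^T (assumed form).

    The result is symmetric but not necessarily positive semi-definite.
    """
    outer = X[:, :, None] * X[:, None, :]  # shape (n, d, d)
    return outer.max(axis=0)


def spectral_transform(M, weights, exponents, eps=1e-6):
    """Apply f(lam) = sum_k w_k * |lam|^a_k to the eigenvalues of M.

    The sign of each eigenvalue is preserved so that the same function
    can be applied to symmetric matrices with negative eigenvalues
    (an assumption of this sketch).
    """
    lam, U = np.linalg.eigh(M)  # M = U diag(lam) U^T
    mag = np.abs(lam) + eps
    f = sum(w * mag ** a for w, a in zip(weights, exponents))
    f = np.sign(lam) * f
    return (U * f) @ U.T  # scales eigenvector j by f[j], then recombines


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))  # 50 hypothetical local descriptors, d = 4

M_avg = second_order_avg_pool(X)
M_max = second_order_max_pool(X)

# Hypothetical mixture of two power functions with fixed parameters.
T_avg = spectral_transform(M_avg, weights=[0.5, 0.5], exponents=[0.5, 1.0])
T_max = spectral_transform(M_max, weights=[0.5, 0.5], exponents=[0.5, 1.0])
```

With a single exponent of 0.5 and `eps=0.0`, the transform reduces to the matrix square root of a positive definite input, which is one familiar special case of spectral normalization for second-order pooling.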
DOI: http://dx.doi.org/10.1109/TIP.2020.2967579