Person re-identification (Re-ID) has become an active research topic due to its widespread applications. Conducting person Re-ID on video sequences is a practical requirement, in which the crucial challenge is how to obtain a robust video representation from spatial and temporal features. However, most previous methods only consider how to integrate part-level features over the spatio-temporal range, while how to model and exploit part correlations has received little attention. In this paper, we propose a skeleton-based dynamic hypergraph framework, namely the Skeletal Temporal Dynamic Hypergraph Neural Network (ST-DHGNN), for person Re-ID, which models the high-order correlations among various body parts based on a time series of skeletal information. Specifically, multi-shape and multi-scale patches are heuristically cropped from feature maps, constituting spatial representations in different frames. A joint-centered hypergraph and a bone-centered hypergraph are constructed in parallel from multiple body parts (i.e., head, trunk, and legs) with spatio-temporal multi-granularity over the entire video sequence, in which graph vertices represent regional features and hyperedges denote their relationships. Dynamic hypergraph propagation, comprising a re-planning module and a hyperedge elimination module, is proposed to better integrate features among vertices. Feature aggregation and attention mechanisms are also adopted to obtain a better video representation for person Re-ID. Experiments show that the proposed method performs significantly better than the state-of-the-art on three video-based person Re-ID datasets: iLIDS-VID, PRID-2011, and MARS.
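To make the hypergraph machinery concrete, the following is a minimal sketch of a single generic hypergraph convolution layer in the standard HGNN form, where vertices carry body-part region features and each hyperedge groups related parts. This is an illustrative assumption, not the paper's exact ST-DHGNN propagation (which additionally includes the re-planning and hyperedge elimination modules); the incidence matrix, vertex count, and feature sizes below are hypothetical toy values.

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    """One generic hypergraph convolution layer (illustrative sketch).

    X:     (n_vertices, d_in)   vertex features (e.g., part-region features)
    H:     (n_vertices, n_edges) incidence matrix (1 if vertex lies in hyperedge)
    Theta: (d_in, d_out)         weight matrix (would be learned in practice)

    Computes X' = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta,
    i.e., features are gathered onto hyperedges, averaged, and
    scattered back to vertices with degree normalization.
    """
    Dv = H.sum(axis=1)                      # vertex degrees
    De = H.sum(axis=0)                      # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(Dv))
    De_inv = np.diag(1.0 / De)
    return Dv_inv_sqrt @ H @ De_inv @ H.T @ Dv_inv_sqrt @ X @ Theta

# Toy example: 5 part-region vertices and 2 hyperedges
# (hypothetically, an "upper body" group and a "lower body" group
# sharing the trunk vertex).
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1],
              [0, 1]], dtype=float)
X = np.random.default_rng(0).normal(size=(5, 8))
Theta = np.eye(8)                            # identity weights for illustration
out = hypergraph_conv(X, H, Theta)
print(out.shape)                             # (5, 8)
```

Vertices sharing a hyperedge exchange information in one step, which is what lets a hypergraph capture high-order (more-than-pairwise) correlations among body parts, unlike an ordinary graph edge.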
DOI: http://dx.doi.org/10.1109/TIP.2023.3236144