IEEE Trans Image Process
June 2024
Inferring 3D human motion is fundamental to many applications, including understanding human activity and analyzing a person's intention. While many fruitful efforts have been devoted to human motion prediction, most approaches focus on pose-driven prediction and infer human motion in isolation from the contextual environment, thus neglecting the movement of the body's location within the scene. However, real-world human movements are goal-directed and highly influenced by the spatial layout of the surrounding scene.
IEEE Trans Vis Comput Graph
December 2024
In this article, we propose a novel cascaded diffusion-based generative framework for text-driven human motion synthesis, which exploits a strategy named GradUally Enriching SyntheSis (abbreviated as GUESS). The strategy sets up generation objectives by grouping body joints of detailed skeletons that are in close semantic proximity and then replacing each such joint group with a single body-part node. This operation recursively abstracts a human pose into coarser and coarser skeletons at multiple granularity levels.
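The recursive abstraction described above can be illustrated with a minimal sketch. It is not the authors' implementation: the joint names, the grouping tables, and the choice of placing each body-part node at the mean position of its joints are all assumptions made for illustration, since the abstract does not specify how a body-part node is computed.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
Pose = Dict[str, Point]


def coarsen_pose(pose: Pose, groups: Dict[str, Sequence[str]]) -> Pose:
    """Replace each joint group with a single node.

    Assumption: the node is placed at the mean position of the grouped joints.
    """
    coarse: Pose = {}
    for part, joints in groups.items():
        points = [pose[j] for j in joints]
        n = len(points)
        coarse[part] = tuple(sum(p[i] for p in points) / n for i in range(3))
    return coarse


def build_hierarchy(pose: Pose, levels: List[Dict[str, Sequence[str]]]) -> List[Pose]:
    """Apply the grouping tables one after another, from fine to coarse."""
    hierarchy = [pose]
    for groups in levels:
        hierarchy.append(coarsen_pose(hierarchy[-1], groups))
    return hierarchy


# Hypothetical toy skeleton with six joints (names are illustrative only).
pose = {
    "l_shoulder": (1.0, 0.0, 0.0),
    "l_elbow": (2.0, 0.0, 0.0),
    "l_wrist": (3.0, 0.0, 0.0),
    "r_shoulder": (-1.0, 0.0, 0.0),
    "r_elbow": (-2.0, 0.0, 0.0),
    "r_wrist": (-3.0, 0.0, 0.0),
}

# Hypothetical grouping tables: joints -> arms -> whole body.
levels = [
    {
        "left_arm": ["l_shoulder", "l_elbow", "l_wrist"],
        "right_arm": ["r_shoulder", "r_elbow", "r_wrist"],
    },
    {"body": ["left_arm", "right_arm"]},
]

hierarchy = build_hierarchy(pose, levels)
for level, skeleton in enumerate(hierarchy):
    print(level, skeleton)
```

Each entry of `hierarchy` is one granularity level, ordered from the detailed skeleton to the coarsest one.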
In the 3D skeleton-based action recognition task, learning rich spatial motion patterns and learning rich temporal motion patterns from body joints are two foundational yet under-explored problems. In this paper, we propose two methods for addressing these problems: (I) a novel glimpse-focus action recognition strategy that captures multi-range pose features from the whole body and key body parts jointly; (II) a powerful temporal feature extractor, JD-TC, that enriches trajectory features by inferring different inter-frame correlations for different joints. By coupling these two proposals, we develop a powerful skeleton-based action recognition system that extracts rich pose and trajectory features from a skeleton sequence and outperforms previous state-of-the-art methods on three large-scale datasets.
IEEE Trans Neural Netw Learn Syst
September 2024
Graph convolution networks (GCNs) have been widely used and have achieved fruitful progress in skeleton-based action recognition. In GCNs, node interaction modeling dominates context aggregation and is therefore crucial for a graph-based convolution kernel to extract representative features. In this article, we take a closer look at a powerful graph convolution formulation for capturing rich movement patterns from these skeleton-based graphs.