In recent years, self-supervised learning has emerged as a powerful approach to learning visual representations without extensive manual annotation. One popular technique uses rotation transformations of images, which provide a clear visual signal for learning semantic representations. In this work, we revisit the pretext task of predicting image rotation and find that it tends to marginalise the perception of features located near the centre of an image.
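The rotation pretext task mentioned above can be illustrated with a minimal sketch. It assumes the common four-way setup (rotations of 0, 90, 180 and 270 degrees, each used as a class label); the helper name `make_rotation_batch` is hypothetical and the snippet does not reproduce the method proposed in the paper.

```python
import numpy as np

def make_rotation_batch(image):
    """Return four rotated copies of `image` and their rotation labels.

    Label k stands for a counter-clockwise rotation of k * 90 degrees.
    This is an assumed, generic formulation of the pretext task.
    """
    # Rotate in the height/width plane so an optional channel axis is untouched.
    rotated = [np.rot90(image, k=k, axes=(0, 1)) for k in range(4)]
    labels = np.arange(4)
    return rotated, labels

# Toy 3x4 "image"; a real pipeline would use H x W x C arrays.
image = np.arange(12).reshape(3, 4)
views, labels = make_rotation_batch(image)
```

A classifier trained to predict `labels` from `views` needs no manual annotation, which is what makes the task self-supervised.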
IEEE Trans Image Process
January 2024
Early action prediction (EAP) aims to recognize human actions from only part of the action execution in an ongoing video, an important task for many practical applications. Most prior works treat partial or full videos as a whole, ignoring rich action knowledge hidden in videos, i.e.
IEEE J Biomed Health Inform
February 2024
IEEE Trans Neural Netw Learn Syst
October 2024
Human motion prediction is challenging because it requires modeling complex spatiotemporal features. Among existing methods, graph convolution networks (GCNs) are extensively used because they model connections explicitly. Within a GCN, the graph correlation adjacency matrix drives feature aggregation and is therefore the key to extracting predictive motion features.
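How an adjacency matrix drives feature aggregation can be sketched as follows. The snippet assumes the widely used symmetric normalisation with self-loops, D^(-1/2) (A + I) D^(-1/2), and a ReLU activation; the function names and the three-joint chain are illustrative assumptions, not the adjacency design proposed in the paper.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalise an adjacency matrix after adding self-loops."""
    adj_hat = adj + np.eye(adj.shape[0])      # each joint also keeps its own feature
    deg = adj_hat.sum(axis=1)                 # node degrees including the self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(features, adj, weight):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    aggregated = normalize_adjacency(adj) @ features   # adjacency decides who is mixed
    return np.maximum(aggregated @ weight, 0.0)

# Toy skeleton: three joints connected in a chain 0 - 1 - 2.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
features = np.eye(3)   # one-hot feature per joint
weight = np.eye(3)     # identity projection, so the output exposes the mixing weights
out = gcn_layer(features, adj, weight)
```

With identity features and weights, `out` equals the normalised adjacency itself, so joints 0 and 2, which share no edge, exchange no information in a single layer.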
IEEE Trans Image Process
June 2022
Video-based human pose estimation (VHPE) is a vital yet challenging task. While deep learning algorithms have made tremendous progress on VHPE, many of these approaches model the long-range interaction between joints only implicitly, by expanding the receptive field of the convolution or by designing a graph manually. Unlike prior methods, we design a lightweight and plug-and-play joint relation extractor (JRE) to explicitly and automatically model the associative relationship between joints.