IEEE Trans Neural Netw Learn Syst
April 2024
Fully supervised semantic segmentation has performed well in many computer vision tasks. However, it is time-consuming because training a model requires a large number of pixel-level annotated samples. Few-shot segmentation has recently become a popular approach to addressing this problem, as it requires only a handful of annotated samples to generalize to new categories.
Language-based person search retrieves images of a target person from a natural language description and is a challenging fine-grained cross-modal retrieval task. A novel hybrid attention network is proposed for the task. The network includes the following three aspects: First, a cubic attention mechanism for person images, which combines cross-layer spatial attention and channel attention.
Because of its important role in video surveillance, pedestrian attribute recognition has become an attractive topic in computer vision research. Variations in viewpoint, illumination, resolution, and occlusion make the task very challenging. Existing pedestrian attribute recognition methods perform unsatisfactorily because they ignore the correlation between pedestrian attributes and spatial information. To resolve this issue, this paper regards the task as a spatiotemporal, sequential, multi-label image classification problem.