Publications by authors named "Hefeng Wu"

Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer their relationships for a given video. It requires not only a comprehensive understanding of each object scattered across the whole scene but also a close analysis of their temporal motions and interactions. Inherently, object pairs and their relationships exhibit spatial co-occurrence correlations within each image and temporal consistency/transition correlations across different images, which can serve as prior knowledge to facilitate VidSGG model learning and inference.


Facial expression recognition (FER) has received significant attention and made notable progress in the past decade, but data inconsistencies among different FER datasets greatly hinder the ability of models learned on one dataset to generalize to another. Recently, a series of cross-domain FER algorithms (CD-FERs) have been developed to address this issue. Although each claims to achieve superior performance, comprehensive and fair comparisons are lacking due to inconsistent choices of the source/target datasets and feature extractors.


Transition metal oxides with high theoretical capacities are widely investigated as potential anodes for alkali-metal ion batteries. However, their intrinsically poor conductivity and large volume changes during cycling result in poor cycling stability and low rate capabilities. Graphene has been widely used to support metal oxides for enhanced performance, but the cycling life is limited by the aggregation/collapse of active materials on the graphene surface.


Recognizing multiple labels of an image is a practical yet challenging task, and remarkable progress has been achieved by searching for semantic regions and exploiting label dependencies. However, current works utilize RNN/LSTM to implicitly capture sequential region/label dependencies, which cannot fully explore mutual interactions among the semantic regions/labels and do not explicitly integrate label co-occurrences. In addition, these works require large amounts of training samples for each category, and they are unable to generalize to novel categories with limited samples.


In this work, we report a high-performance anode material created by rationally encapsulating multi-walled carbon nanotubes (MWNTs) within hollow FeO nanotubes followed by applying a carbon coating. When tested for lithium storage, the as-prepared MWNT@hollow FeO@C coaxial nanotubes present high specific capacity, superior rate performance, and outstanding cycling stability. They are capable of delivering high capacities of 758 mA h g⁻¹ at the 500th cycle at 0.


Recently, correlation filter (CF)-based tracking methods have attracted considerable attention because of their high-speed performance. However, distortion, which refers to the phenomenon that the correlation outputs of CF-based trackers are distorted, remains a major obstacle for these methods. In this paper, we propose a distortion-aware correlation filter framework, which can detect distortions and recover from tracking failures.


In this paper, we study a novel hierarchical background model for intelligent video surveillance with a pan-tilt-zoom (PTZ) camera, and develop an integrated system consisting of three key components: background modeling, observed frame registration, and object tracking. First, we build the hierarchical background model by separating the full range of continuous focal lengths of a PTZ camera into several discrete levels and then partitioning the wide scene at each level into many partial fixed scenes. In this way, the wide scenes captured by a PTZ camera through rotation and zoom are represented by a hierarchical collection of partial fixed scenes.
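
The two-step construction described in this abstract (discretize the focal length into levels, then partition the scene at each level) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `HierarchicalSceneIndex`, the focal-length boundaries, the pan/tilt ranges, and the rule that each deeper level doubles the grid in both directions are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneKey:
    """Identifies one partial fixed scene: a zoom level plus a grid cell."""
    level: int
    row: int
    col: int


class HierarchicalSceneIndex:
    """Hypothetical lookup from a PTZ camera state to a partial fixed scene."""

    def __init__(self, focal_edges, pan_range, tilt_range, base_grid):
        # focal_edges: ascending boundaries; N edges define N - 1 discrete levels.
        self.focal_edges = list(focal_edges)
        self.num_levels = len(self.focal_edges) - 1
        self.pan_min, self.pan_max = pan_range
        self.tilt_min, self.tilt_max = tilt_range
        self.base_rows, self.base_cols = base_grid

    def level_of(self, focal):
        # Quantize a continuous focal length; out-of-range values are clamped.
        i = bisect_right(self.focal_edges, focal) - 1
        return min(max(i, 0), self.num_levels - 1)

    def grid_at(self, level):
        # Assumption: a longer focal length narrows the field of view,
        # so each deeper level doubles the number of cells per axis.
        scale = 2 ** level
        return self.base_rows * scale, self.base_cols * scale

    def key_for(self, pan, tilt, focal):
        level = self.level_of(focal)
        rows, cols = self.grid_at(level)
        u = (pan - self.pan_min) / (self.pan_max - self.pan_min)
        v = (tilt - self.tilt_min) / (self.tilt_max - self.tilt_min)
        col = min(max(int(u * cols), 0), cols - 1)
        row = min(max(int(v * rows), 0), rows - 1)
        return SceneKey(level, row, col)


# Illustrative values only: three zoom levels over a 360 x 60 degree scene.
index = HierarchicalSceneIndex(
    focal_edges=[4.0, 8.0, 16.0, 32.0],
    pan_range=(-180.0, 180.0),
    tilt_range=(-30.0, 30.0),
    base_grid=(2, 4),
)
key = index.key_for(pan=0.0, tilt=0.0, focal=10.0)
```

In a system of this kind, a key such as `key` would select which stored background image an observed frame should be registered against; the registration and tracking components are outside the scope of this sketch.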
