Publications by authors named "Xian-Sheng Hua"

Graph classification is a critical task in numerous multimedia applications, where graphs are employed to represent diverse types of multimedia data, including images, videos, and social networks. Nevertheless, in the real world, labeled graph data are always limited or scarce. To address this issue, we focus on the semi-supervised graph classification task, which involves both supervised and unsupervised models learning from labeled and unlabeled data.

Hashing has received significant interest in large-scale data retrieval due to its outstanding computational efficiency. Recently, numerous deep hashing approaches have emerged and achieved impressive performance. However, these approaches can pose ethical risks during image retrieval.
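
To illustrate why hashing is computationally efficient for retrieval, here is a minimal sketch of ranking by Hamming distance over binary codes. It is not the method of the paper above: the codes are hypothetical 8-bit values, and the names `hamming_distance` and `rank_by_hamming` are made up for this example. A deep hashing model would produce the codes with a learned network.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs sorted by distance to the query code."""
    scored = [(i, hamming_distance(query, code)) for i, code in enumerate(database)]
    # sorted() is stable, so items with equal distance keep database order.
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical 8-bit codes (assumption: real codes come from a trained hash function).
database = [0b10110010, 0b10110011, 0b01001101, 0b10010010]
query = 0b10110110
ranking = rank_by_hamming(query, database)
```

Each comparison is an XOR followed by a bit count, which is why such codes scale to large databases.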

This paper delves into the problem of correlated time-series forecasting in practical applications, an area of growing interest in a multitude of fields such as stock price prediction and traffic demand analysis. Current methodologies primarily represent data using conventional graph structures, yet these fail to capture intricate structures with non-pairwise relationships. To address this challenge, we adopt dynamic hypergraphs in this study to better illustrate complex interactions, and introduce a novel hypergraph neural network model named CHNN for correlated time series forecasting.

The optical flow guidance strategy is ideal for obtaining motion information of objects in the video. It is widely utilized in video segmentation tasks. However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene.

Article Synopsis
  • This paper talks about a new method called PEACE that helps computers find images better, even when they come from different sources.
  • The problem it solves is that current methods sometimes guess incorrectly when tagging these images, which makes them less reliable.
  • PEACE improves how computers group and label these images by paying attention to their meanings and reducing errors caused by bad guesses.

Recently, unsupervised person re-identification (Re-ID) has received increasing research attention due to its potential for label-free applications. A promising approach to unsupervised Re-ID is clustering-based: it generates pseudo labels by clustering and uses them to train a Re-ID model iteratively. However, most clustering-based methods take each cluster as a pseudo identity class, neglecting the intra-cluster variance mainly caused by the change of cameras.

Graph neural networks (GNNs) are the most promising deep learning models that can revolutionize non-Euclidean data analysis. However, their full potential is severely curtailed by poorly represented molecular graphs and features. Here, we propose a multiphysical graph neural network (MP-GNN) model based on the developed multiphysical molecular graph representation and featurization.

This article studies self-supervised graph representation learning, which is critical to various tasks, such as protein property prediction. Existing methods typically aggregate representations of each individual node as graph representations, but fail to comprehensively explore local substructures (i.e.

Genome variant calling is a challenging yet critical task for subsequent studies. Almost all existing methods rely on high-depth DNA sequencing data, and their performance drops substantially on low-depth data.

Video Summarization (VS) has become one of the most effective solutions for quickly understanding a large volume of video data. Dictionary selection with self representation and sparse regularization has demonstrated its promise for VS by formulating the VS problem as a sparse selection task on video frames. However, existing dictionary selection models are generally designed only for data reconstruction, which results in the neglect of the inherent structured information among video frames.

Model fine-tuning is a widely used transfer learning approach in person re-identification (ReID) applications, which fine-tunes a pre-trained feature extraction model for the target scenario instead of training a model from scratch. It is challenging due to the significant variations inside the target scenario, e.g.

Artificial intelligence (AI)-based drug design has great promise to fundamentally change the landscape of the pharmaceutical industry. Even though great progress has been made with handcrafted feature-based machine learning models, 3D convolutional neural networks (CNNs), and graph neural networks, effective and efficient representations that characterize the structural, physical, chemical, and biological properties of molecular structures and interactions remain a great challenge. Here, we propose an equal-sized molecular 2D image representation, known as the molecular persistent spectral image (Mol-PSI), and combine it with a CNN model for AI-based drug design.

Video moment localization, as an important branch of video content analysis, has attracted extensive attention in recent years. However, it is still in its infancy due to the following challenges: cross-modal semantic alignment and localization efficiency. To address these impediments, we present a cross-modal semantic alignment network.

The re-identification (ReID) task has been studied increasingly in recent years, and its performance has improved significantly. The progress mainly comes from searching for new network structures to learn person representations. Most of these networks are trained using the classic stochastic gradient descent optimizer.

There is an increasing demand for interior design and decorating. The main challenges are where to put the objects and how to put them plausibly in the given domain. In this article, we propose an automatic method for decorating the planes in a given image.

Detecting objects in surveillance videos is an important problem due to its wide applications in traffic control and public security. Existing methods tend to face performance degradation because of false positive or misalignment problems. We propose a novel framework, namely, Foreground Gating and Background Refining Network (FG-BR Net), for surveillance object detection (SOD).

Vehicle detection is a challenging problem in autonomous driving systems, due to the large structural and appearance variations of vehicles. In this paper, we propose a novel vehicle detection scheme based on multi-task deep convolutional neural networks (CNNs) and region-of-interest (RoI) voting. In the design of the CNN architecture, we enrich the supervised information with subcategory, region overlap, bounding-box regression, and category of each training RoI as a multi-task learning framework.

We have witnessed the popularity of image-sharing websites for sharing personal experiences through photos on the Web. These websites allow users to describe the content of their uploaded images with a set of tags. Those user-annotated tags are often noisy and biased.

In this paper, we propose regularized tree partitioning approaches. We study normalized cut (NCut) and average cut (ACut) criteria over a tree, forming two approaches: 1) normalized tree partitioning (NTP) and 2) average tree partitioning (ATP). We give the properties that result in an efficient algorithm for NTP and ATP.
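
For readers unfamiliar with the two criteria, the following is a minimal sketch of the standard two-way normalized cut and average cut scores obtained by removing one edge from a weighted tree. It does not reproduce the paper's regularized NTP or ATP algorithms. The function names, the edge-list format, and the example weights are assumptions made for illustration.

```python
from collections import defaultdict


def component(adj, start, blocked):
    """Collect nodes reachable from start without crossing the blocked edge."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if {u, v} == set(blocked) or v in seen:
                continue
            seen.add(v)
            stack.append(v)
    return seen


def cut_scores(edges, cut_edge):
    """Return (NCut, ACut) for the bipartition made by removing cut_edge."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    u, v = cut_edge
    cut = adj[u][v]  # in a tree, exactly one edge crosses the partition
    side_a = component(adj, u, cut_edge)
    side_b = set(adj) - side_a

    def volume(side):
        # Sum of weighted degrees: the association of the side with all nodes.
        return sum(sum(adj[n].values()) for n in side)

    ncut = cut / volume(side_a) + cut / volume(side_b)
    acut = cut / len(side_a) + cut / len(side_b)
    return ncut, acut


# Hypothetical path-shaped tree a-b-c-d with illustrative weights.
edges = [("a", "b", 2.0), ("b", "c", 1.0), ("c", "d", 2.0)]
middle = cut_scores(edges, ("b", "c"))
outer = cut_scores(edges, ("a", "b"))
```

In this example the light middle edge scores lower than the heavy outer edge under both criteria, so both would prefer the balanced split.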

We address the problem of approximate nearest neighbor (ANN) search for visual descriptor indexing. Most spatial partition trees, such as KD trees, VP trees, and so on, follow the hierarchical binary space partitioning framework. The key effort is to design different partition functions (hyperplane or hypersphere) to divide the points so that 1) the data points can be well grouped to support effective NN candidate location and 2) the partition functions can be quickly evaluated to support efficient NN candidate location.
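
The hierarchical binary space partitioning framework mentioned above can be sketched with a plain KD tree, where each node splits the points with an axis-aligned hyperplane. This is a generic illustration, not the partition function proposed in the paper. It performs exact nearest neighbor search, and all names (`Node`, `build`, `nearest`) and the sample points are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point              # point stored at this node (the median)
    axis: int                 # coordinate used by the splitting hyperplane
    left: Optional["Node"]    # points with smaller-or-equal coordinate
    right: Optional["Node"]   # points with larger-or-equal coordinate


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split the points at the median of a cycling axis."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build(ordered[:mid], depth + 1),
                build(ordered[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Descend to the query's side first, then backtrack only if needed."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, query) < sq_dist(best, query):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # The far side can only help if the hyperplane is closer than the best match.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
match = nearest(tree, (9, 2))
```

The two requirements stated in the abstract show up here directly: the median split groups nearby points, and the partition test is a single coordinate comparison. Approximate variants usually trade accuracy for speed by limiting how much backtracking into the far side is allowed.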

We propose an automatic approximation of the intrinsic manifold for general semi-supervised learning (SSL) problems. Unfortunately, it is not trivial to define an optimization function to obtain optimal hyperparameters. Usually, cross validation is applied, but it does not necessarily scale up.

This paper proposes the Flickr Distance (FD) to measure the visual correlation between concepts. For each concept, a collection of related images is obtained from the Flickr website. We assume that each concept consists of several states, e.

Learning a satisfactory object detector generally requires sufficient training data to cover most variations of the object. In this paper, we show that the performance of an object detector is severely degraded when training examples are limited. We propose an approach to handle this issue by exploring a set of pretrained auxiliary detectors for other categories.

Most research on image decomposition, e.g., image segmentation and image parsing, has predominantly focused on the low-level visual clues within a single image and neglected the contextual information across images.
