Publications by authors named "Sheng-Jun Huang"

Continual learning (CL) provides a framework for training models in ever-evolving environments. Although re-occurrence of previously seen objects or tasks is common in real-world problems, repetition in the data stream is rarely considered in standard CL benchmarks. Unlike the rehearsal mechanism in buffer-based strategies, where sample repetition is controlled by the strategy, repetition in the data stream stems naturally from the environment.

Class-conditional noise is common in machine learning tasks, where the class label is corrupted with a probability that depends on its ground-truth value. Much research has aimed to improve model robustness against class-conditional noise. However, existing work typically focuses on the single-label case by assuming that only one label is corrupted.

Object detection requires plentiful data annotated with bounding boxes for model training. However, in many applications, it is difficult or even impossible to acquire a large set of labeled examples for the target task due to privacy concerns or a lack of reliable annotators. On the other hand, thanks to high-quality image search engines such as Flickr and Google, it is relatively easy to obtain resource-rich unlabeled datasets whose categories are a superset of those of the target data.

Zero-shot learning (ZSL) aims to learn a classifier for unseen classes by exploiting both training data from seen classes and external knowledge. In many visual tasks such as image classification, a set of high-level attributes that describe the semantic properties of classes is used as the external knowledge to bridge seen and unseen classes. While previous ZSL studies usually treat the attributes equally, we observe that the contribution of different attributes varies significantly over model training.

Partial multi-label learning (PML) deals with problems where each instance is assigned a candidate label set that contains multiple relevant labels and some noisy labels. Recent studies usually solve PML problems with a disambiguation strategy, which recovers ground-truth labels from the candidate label set by simply assuming that the noisy labels are generated randomly. In real applications, however, noisy labels are usually caused by ambiguous content in the example.

In real-world recognition and classification tasks, various practical constraints usually make it difficult to collect training samples that cover all classes when training a recognizer or classifier. A more realistic scenario is open set recognition (OSR), where incomplete knowledge of the world exists at training time and unknown classes can be submitted to an algorithm during testing, requiring classifiers not only to classify the seen classes accurately but also to deal effectively with unseen ones. This paper provides a comprehensive survey of existing open set recognition techniques, covering related definitions, model representations, datasets, evaluation criteria, and algorithm comparisons.

Segmenting filaments in bioimages is a critical step in a wide range of applications, including neuron reconstruction and blood vessel tracing. To achieve acceptable segmentation performance, most existing methods require large amounts of annotated filamentary images in the training stage. Hence, these methods face the common challenge of high annotation cost.

Fast Multi-Instance Multi-Label Learning.

IEEE Trans Pattern Anal Mach Intell

November 2019

In many real-world tasks, particularly those involving data objects with complicated semantics such as images and texts, one object can be represented by multiple instances and simultaneously be associated with multiple labels. Such tasks can be formulated as multi-instance multi-label learning (MIML) problems, and have been extensively studied during the past few years. Existing MIML approaches have been found useful in many applications; however, most of them can only handle moderate-sized data.

In this paper, we propose a joint conditional graphical Lasso to learn multiple conditional Gaussian graphical models, also known as Gaussian conditional random fields, with some similar structures. Our model builds on the maximum likelihood method with the convex sparse group Lasso penalty. Moreover, our model is able to model multiple multivariate linear regressions with unknown noise covariances via a convex formulation.

The wisdom of crowds (WOC), a theory from social science, has found a new paradigm in computer science. The WOC theory explains that the aggregate decision made by a group is often better than those of its individual members if specific conditions are satisfied. This paper presents a novel framework for unsupervised and semisupervised cluster ensemble by exploiting the WOC theory.

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the vast majority of proteins can only be annotated computationally. Nature often brings several domains together to form multi-domain and multi-functional proteins with a vast number of possibilities, and each domain may fulfill its own function independently or in a concerted manner with its neighbors.

Active learning reduces labeling cost by iteratively selecting the most valuable data points and querying their labels. It has attracted a lot of interest given the abundance of unlabeled data and the high cost of labeling. Most active learning approaches select either informative or representative unlabeled instances to query, which could significantly limit their performance.
