Publications by authors named "Victor S Sheng"

Various correlations hidden in crowdsourcing annotation tasks bring opportunities to further improve the accuracy of label aggregation. However, these relationships are usually extremely difficult to model. Most existing methods can make use of only one or two correlations.

In biochemistry, graph structures have been widely used for modeling compounds, proteins, functional interactions, etc. A common task that divides these graphs into different categories, known as graph classification, relies heavily on the quality of the graph representations. With the advance of graph neural networks, message-passing-based methods have been adopted to iteratively aggregate neighborhood information for better graph representations.

Multilabel annotation is a critical step to generate training sets when learning classification models in various application domains, but asking domain experts to provide labels is usually time-consuming and expensive, which cannot meet the current requirement of the fast evolution of the models in the big data era. Although crowdsourcing provides a fast solution to acquire labels for multilabel learning, it faces the risk of high data acquisition cost and low label quality. This article proposes a novel one-coin label-dependent active crowdsourcing (OCLDAC) method to iteratively query noisy labels from crowd workers and learn multilabel classification models.

The liver is an irreplaceable organ in the human body, maintaining life activities and metabolism. Malignant tumors of the liver have a high mortality rate at present. Computer-aided segmentation of the liver and tumors has significant effects on clinical diagnosis and treatment.

Image classification is a key task in image understanding, and multi-label image classification has become a popular topic in recent years. However, the success of multi-label image classification depends closely on how the training set is constructed. Because active learning aims to construct an effective training set by iteratively selecting the most informative examples to query labels from annotators, it was introduced into multi-label image classification.

In this paper, a linguistic steganalysis method based on two-level cascaded convolutional neural networks (CNNs) is proposed to improve the system's ability to detect stego texts, which are generated via synonym substitutions. The first-level network, sentence-level CNN, consists of one convolutional layer with multiple convolutional kernels in different window sizes, one pooling layer to deal with variable sentence lengths, and one fully connected layer with dropout as well as a softmax output, such that two final steganographic features are obtained for each sentence. The unmodified and modified sentences, along with their words, are represented in the form of pre-trained dense word embeddings, which serve as the input of the network.

Medical image fusion is important in the field of clinical diagnosis because it can improve the availability of information contained in images. Magnetic Resonance Imaging (MRI) provides excellent anatomical details as well as functional information on regional changes in physiology, hemodynamics, and tissue composition. In contrast, although the spatial resolution of Positron Emission Tomography (PET) is lower than that of MRI, PET is capable of depicting the tissue's molecular and pathological activities that are not available from MRI.

With online crowdsourcing platforms, labels can be acquired at relatively low costs from massive nonexpert workers. To improve the quality of labels obtained from these imperfect crowdsourced workers, we usually let different workers provide labels for the same instance. Then, the true labels for all instances are estimated from these multiple noisy labels.
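
The abstract does not say which estimator is used for the true labels. As a minimal sketch of the general idea, the following uses plain majority voting, a common baseline that is assumed here for illustration and is not necessarily the authors' method. The function name `majority_vote`, the tie-breaking rule, and the toy data are all hypothetical.

```python
from collections import Counter


def majority_vote(noisy_labels):
    """Estimate one label per instance from repeated noisy labels.

    noisy_labels maps an instance id to the list of labels given by
    different workers. Ties are broken by taking the smallest label
    (an arbitrary choice made for this sketch).
    """
    estimates = {}
    for instance_id, labels in noisy_labels.items():
        counts = Counter(labels)
        top = max(counts.values())
        estimates[instance_id] = min(
            label for label, count in counts.items() if count == top
        )
    return estimates


# Toy data: three instances, each labeled by several workers.
crowd_labels = {"x1": [1, 1, 0], "x2": [0, 0, 1, 0], "x3": [1, 0]}
estimated = majority_vote(crowd_labels)
```

Majority voting treats every worker as equally reliable; inference methods that model worker quality would weight the votes differently.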

A parameter in learning problems (usually arising from the tradeoff between training error minimization and regularization) is often tuned by cross validation (CV). A solution path provides a compact representation of all optimal solutions, which can be used to determine the parameter with the global minimum CV error, without solving the original optimization problems multiple times as grid search does. However, existing solution path algorithms do not provide a unified implementation for various learning problems.

The internal reliability and external safety of data transmission in Wireless Sensor Networks (WSN) have become increasingly prominent issues with the wide application of WSN. This paper proposes a new method for access control and mitigation of interfering noise in time synchronization environments. First, a formal definition is given of the impact that interference noise has on the clock skew and clock offset of each node.

Machine-constructed knowledge bases often contain noisy and inaccurate facts. There is significant work on developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect.

Crowdsourcing systems provide a cost-effective and convenient way to collect labels, but they often fail to guarantee the quality of the labels. This paper proposes a novel framework that introduces noise correction techniques to further improve the quality of integrated labels that are inferred from the multiple noisy labels of objects. In the proposed general framework, information about the qualities of labelers estimated by a front-end ground truth inference algorithm is utilized to supervise subsequent label noise filtering and correction.

For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of the deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over the deep web.

Model selection plays an important role in cost-sensitive SVM (CS-SVM). It has been proven that the global minimum cross validation (CV) error can be efficiently computed based on the solution path for one-parameter learning problems. However, it is a challenge to obtain the global minimum CV error for CS-SVM based on a one-dimensional solution path and traditional grid search, because CS-SVM has two regularization parameters.

Minimax probability machine (MPM) is an interesting discriminative classifier based on generative prior knowledge. It can directly estimate the probabilistic accuracy bound by minimizing the maximum probability of misclassification. The structural information of data is an effective way to represent prior knowledge, and has been found to be vital for designing classifiers in real-world problems.

The ν-support vector classification (ν-SVC) has the advantage of using a regularization parameter ν to control the number of support vectors and margin errors. However, a recently proposed regularization path algorithm for ν-SVC (ν-SvcPath) suffers from exceptions and singularities in some special cases. In this brief, we first present a new equivalent dual formulation for ν-SVC and then propose a robust ν-SvcPath based on lower-upper (LU) decomposition with partial pivoting.

The ν-Support Vector Regression (ν-SVR) is an effective regression learning algorithm, which has the advantage of using a parameter ν to control the number of support vectors and to adjust the width of the tube automatically. However, compared to ν-Support Vector Classification (ν-SVC) (Schölkopf et al., 2000), ν-SVR introduces an additional linear term into its objective function.
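
The abstract does not write out the two objectives. For orientation, the dual problems below follow the form commonly given in the LIBSVM documentation; treating these as the formulations the abstract refers to is an assumption, and the symbols ($Q$ for the kernel matrix, $z$ for the regression targets, $l$ for the number of training examples, $e$ for the all-ones vector) are notation chosen for this sketch.

```latex
% nu-SVC dual: purely quadratic objective
\min_{\alpha}\ \tfrac{1}{2}\,\alpha^{\top} Q\,\alpha
\quad \text{s.t.}\quad
0 \le \alpha_i \le \tfrac{1}{l},\qquad
e^{\top}\alpha \ge \nu,\qquad
y^{\top}\alpha = 0 .

% nu-SVR dual: the same quadratic form plus a linear term in z
\min_{\alpha,\alpha^{*}}\ \tfrac{1}{2}\,(\alpha-\alpha^{*})^{\top} Q\,(\alpha-\alpha^{*})
  + z^{\top}(\alpha-\alpha^{*})
\quad \text{s.t.}\quad
e^{\top}(\alpha-\alpha^{*}) = 0,\qquad
e^{\top}(\alpha+\alpha^{*}) \le C\nu,\qquad
0 \le \alpha_i,\ \alpha_i^{*} \le \tfrac{C}{l} .
```

Under this reading, the "additional linear term" is $z^{\top}(\alpha-\alpha^{*})$, which has no counterpart in the ν-SVC dual.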

To improve the classification performance of imbalanced learning, a novel oversampling method, immune centroids oversampling technique (ICOTE) based on an immune network, is proposed. ICOTE generates a set of immune centroids to broaden the decision regions of the minority class space. The representative immune centroids are regarded as synthetic examples in order to resolve the imbalance problem.

Support vector ordinal regression (SVOR) is a popular method to tackle ordinal regression problems. However, until now no effective algorithms have been proposed to address incremental SVOR learning, due to the complicated formulations of SVOR. Recently, an interesting accurate on-line algorithm was proposed for training ν-support vector classification (ν-SVC), which can handle a quadratic formulation with a pair of equality constraints.

The ν-support vector machine (ν-SVM) for classification has the advantage of using a parameter ν to control the number of support vectors and margin errors. Recently, an interesting accurate on-line ν-SVM algorithm (AONSVM) was proposed for training ν-SVM. AONSVM can be viewed as a special case of parametric quadratic programming techniques.

A motion trajectory is an intuitive representation, in the time-space domain, of the micromotion behavior of a moving target. Trajectory analysis is an important approach to recognizing abnormal behaviors of moving targets. To address the complexity of vehicle trajectories, this paper first proposes a trajectory pattern learning method based on dynamic time warping (DTW) and spectral clustering.
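
To illustrate the DTW component only, here is a minimal sketch of the textbook dynamic-programming DTW distance between two 2-D trajectories. The abstract gives no implementation details, so the Euclidean point cost, the function name `dtw_distance`, and the sample trajectories are assumptions; the spectral clustering step is not shown.

```python
import math


def dtw_distance(traj_a, traj_b):
    """Classic DTW distance between two sequences of (x, y) points.

    cost[i][j] holds the minimal accumulated Euclidean cost of aligning
    the first i points of traj_a with the first j points of traj_b.
    """
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(traj_a[i - 1], traj_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance only in traj_a
                cost[i][j - 1],      # advance only in traj_b
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Same path, but the second vehicle lingers at (1, 0) for one extra sample.
path = [(0, 0), (1, 0), (2, 0)]
slower = [(0, 0), (1, 0), (1, 0), (2, 0)]
same_shape = dtw_distance(path, slower)

# Parallel path shifted by one unit in y.
shifted = dtw_distance([(0, 0), (1, 0)], [(0, 1), (1, 1)])
```

Because DTW allows one point to align with several points of the other sequence, trajectories that follow the same route at different speeds get a small distance, which is what makes it a plausible similarity measure to feed into a clustering step.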