IEEE Trans Neural Netw Learn Syst
September 2024
Weakly supervised group activity recognition (WSGAR) aims at identifying the overall behavior of multiple persons without any fine-grained supervision information (including individual position and action label). Traditional methods usually adopt a person-to-whole approach: detect persons via off-the-shelf detectors, obtain person-level features, and integrate them into group-level features for training the classifier. However, these methods are inflexible because they rely heavily on the quality of the detectors.
IEEE Trans Pattern Anal Mach Intell
July 2024
One fundamental problem in deep learning is understanding the excellent performance of deep Neural Networks (NNs) in practice. An explanation for the superiority of NNs is that they can realize a large family of complicated functions, i.e.
IEEE Trans Pattern Anal Mach Intell
November 2023
Zero-shot learning (ZSL) tackles the novel class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Semantic knowledge is typically represented by attribute descriptions shared between different classes, which act as strong priors for localizing object attributes that represent discriminative region features, enabling significant and sufficient visual-semantic interaction for advancing ZSL. Existing attention-based models learn only inferior region features in a single image because they rely solely on unidirectional attention, which ignores the transferable and discriminative attribute localization of visual features for representing the key semantic knowledge for effective knowledge transfer in ZSL.
IEEE Trans Neural Netw Learn Syst
November 2023
Generative (generalized) zero-shot learning [(G)ZSL] models aim to synthesize unseen class features by using only seen class feature and attribute pairs as training data. However, the generated fake unseen features tend to be dominated by the seen class features and thus classified as seen classes, which can lead to inferior performances under zero-shot learning (ZSL), and unbalanced results under generalized ZSL (GZSL). To address this challenge, we tailor a novel balanced semantic embedding generative network (BSeGN), which incorporates balanced semantic embedding learning into generative learning scenarios in the pursuit of unbiased GZSL.
IEEE Trans Neural Netw Learn Syst
April 2024
Recently, deep metric learning (DML) has achieved great success. Some existing DML methods propose adaptive sample mining strategies, which learn to weight the samples, leading to interesting performance. However, these methods suffer from a small memory (e.
IEEE Trans Neural Netw Learn Syst
April 2024
Zero-shot learning (ZSL) tackles the unseen class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Typically, to guarantee desirable knowledge transfer, a direct embedding is adopted for associating the visual and semantic domains in ZSL. However, most existing ZSL methods focus on learning the embedding from implicit global features or image regions to the semantic space.
IEEE J Biomed Health Inform
November 2022
Accurate medical image segmentation of brain tumors is necessary for diagnosing, monitoring, and treating the disease. In recent years, with the gradual emergence of multi-sequence magnetic resonance imaging (MRI), multi-modal MRI diagnosis has played an increasingly important role in the early diagnosis of brain tumors by providing complementary information for a given lesion. Different MRI modalities vary significantly in context, as well as in coarse and fine information.
IEEE Trans Image Process
April 2021
Transductive zero-shot learning (TZSL) extends conventional ZSL by leveraging (unlabeled) unseen images for model training. A typical method for ZSL involves learning embedding weights from the feature space to the semantic space. However, the learned weights in most existing methods are dominated by seen images, and can thus not be adapted to unseen images very well.
IEEE Trans Neural Netw Learn Syst
July 2022
Generative adversarial networks (GANs) for (generalized) zero-shot learning (ZSL) aim to generate unseen image features when conditioned on unseen class embeddings, each of which corresponds to one unique category. Most existing works on GANs for ZSL generate features by merely feeding the seen image feature/class embedding (combined with random Gaussian noise) pairs into the generator/discriminator for a two-player minimax game. However, the structure consistency of the distributions among the real/fake image features, which may shift the generated features away from their real distribution to some extent, is seldom considered.
IEEE Trans Neural Netw Learn Syst
July 2020
Studies show that dividing categories into subcategories contributes to better image classification. Existing image subcategorization methods rely on expert knowledge and labeled images, which makes them both time-consuming and labor-intensive. In this article, we propose to select and subsequently classify images into categories and subcategories.
IEEE Trans Neural Netw Learn Syst
October 2020
Feature representation learning, an emerging topic in recent years, has achieved great progress. Powerful learned features can lead to excellent classification accuracy. In this article, a selective and robust feature representation framework with a supervised constraint (SRSC) is presented.
Medical prediction is always collectively determined based on bioimages collected from different sources or various clinical characterizations described from multiple physiological features. Notably, learning intrinsic structures from multiple heterogeneous features is significant but challenging in multi-view disease understanding. Different from existing methods that separately deal with each single view, this paper proposes a discriminative Margin-Sensitive Autoencoder (MSAE) framework for automated Alzheimer's disease (AD) diagnosis and accurate protein fold recognition.
IEEE Trans Image Process
October 2019
Compact hash code learning has been widely applied to fast similarity search owing to its significantly reduced storage and highly efficient query speed. However, it is still a challenging task to learn discriminative binary codes for perfectly preserving the full pairwise similarities embedded in the high-dimensional real-valued features, such that the promising performance can be guaranteed. To overcome this difficulty, in this paper, we propose a novel scalable supervised asymmetric hashing (SSAH) method, which can skillfully approximate the full-pairwise similarity matrix based on maximum asymmetric inner product of two different non-binary embeddings.
Aiming to implement image segmentation precisely and efficiently, we explore new ways to encode images and achieve optimal thresholding on quantum state space. First, the state vector and density matrix are adopted to represent pixel intensities and their probability distribution, respectively. Then, a method based on global quantum entropy maximization (GQEM) is proposed, which has an objective function equivalent to Otsu's but gives a more explicit physical interpretation of image thresholding in the language of quantum mechanics.
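As a point of reference for the abstract above, the following is a minimal sketch of classical Otsu thresholding, the method whose objective the abstract says GQEM matches. It is not the GQEM method itself; the function name `otsu_threshold` and the 256-level default are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with intensity <= t form the low class, the rest the high class.
    `pixels` is a flat sequence of integer intensities in [0, levels).
    """
    # Build the intensity histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    count_low = 0  # number of pixels in the low class
    sum_low = 0    # sum of intensities in the low class
    for t in range(levels):
        count_low += hist[t]
        sum_low += t * hist[t]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue  # one class is empty, so the split is undefined
        mean_low = sum_low / count_low
        mean_high = (total_sum - sum_low) / count_high
        # Between-class variance up to the constant factor 1 / total**2,
        # which does not change where the maximum is.
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


if __name__ == "__main__":
    # Hypothetical bimodal data: a dark cluster and a bright cluster.
    sample = [10] * 50 + [200] * 50
    print(otsu_threshold(sample))
```

Any threshold between the two clusters separates this sample equally well; the sketch returns the first one that attains the maximum.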
In the original publication of the article, the y axis labels present in Figs. 1a and 2a are incorrect. The correct Figs.
In this paper, we propose two four-base related 2D curves of DNA primary sequences (termed F-B curves) and their corresponding single-base related 2D curves (termed A-related, G-related, T-related and C-related curves). These graphical curves are constructed by assigning each individual base to one of four different sinusoidal (or tangent) functions; connecting all the points on these four sinusoidal (tangent) functions gives the F-B curves, and connecting the points on each of the four functions separately gives the single-base related 2D curves. The proposed 2D curves are all strictly non-degenerate.
In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model.
In this article, we introduce three 3D graphical representations of DNA primary sequences, which we call RY-curve, MK-curve and SW-curve, based on three classifications of the DNA bases. The advantages of our representations are that (i) these 3D curves are strictly non-degenerate and there is no loss of information when transferring a DNA sequence to its mathematical representation and (ii) the coordinates of every node on these 3D curves have clear biological implication. Two applications of these 3D curves are presented: (a) a simple formula is derived to calculate the content of the four bases (A, G, C and T) from the coordinates of nodes on the curves; and (b) a 12-component characteristic vector is constructed to compare similarity among DNA sequences from different species based on the geometrical centers of the 3D curves.
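The three base classifications mentioned above can be sketched as follows, assuming the curve names refer to the standard groupings purine/pyrimidine (R/Y), amino/keto (M/K) and strong/weak hydrogen bonding (S/W). The abstract does not give the 3D curve construction, so this sketch covers only the classification step; `CLASSIFICATIONS` and `classify` are hypothetical names.

```python
# Standard two-class groupings of the four DNA bases (assumed to be the
# "three classifications" the abstract refers to).
CLASSIFICATIONS = {
    "RY": {"A": "R", "G": "R", "C": "Y", "T": "Y"},  # purine / pyrimidine
    "MK": {"A": "M", "C": "M", "G": "K", "T": "K"},  # amino / keto
    "SW": {"C": "S", "G": "S", "A": "W", "T": "W"},  # strong / weak bonding
}


def classify(sequence, scheme):
    """Map each base of `sequence` to its class letter under `scheme`."""
    table = CLASSIFICATIONS[scheme]
    return "".join(table[base] for base in sequence.upper())


if __name__ == "__main__":
    seq = "ATGC"  # hypothetical example sequence
    for scheme in ("RY", "MK", "SW"):
        print(scheme, classify(seq, scheme))
```

Each base receives a distinct triple of class letters across the three schemes (for example, A is R, M and W), so the three classified sequences together determine the original sequence.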