IEEE Trans Pattern Anal Mach Intell
December 2024
Visual categories that largely share the same set of local parts cannot be discriminated based on part information alone, as they mostly differ in the way the local parts relate to the overall global structure of the object. We propose Relational Proxies, a novel approach that leverages the relational information between the global and local views of an object for encoding its semantic label, even for categories it has not encountered during training. Starting with a rigorous formalization of the notion of distinguishability between categories that share attributes, we prove the necessary and sufficient conditions that a model must satisfy in order to learn the underlying decision boundaries to tell them apart.
View Article and Find Full Text PDFIEEE Trans Pattern Anal Mach Intell
March 2024
State-of-the-art deep learning models are often trained with a large amount of costly labeled training data. However, requiring exhaustive manual annotations may degrade the model's generalizability in the limited-label regime.Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.
View Article and Find Full Text PDFIEEE Trans Pattern Anal Mach Intell
February 2023
Weakly-supervised object localization (WSOL) has gained popularity over the last years for its promise to train localization models with only image-level labels. Since the seminal WSOL work of class activation mapping (CAM), the field has focused on how to expand the attention regions to cover objects more broadly and localize them better. However, these strategies rely on full localization supervision for validating hyperparameters and model selection, which is in principle prohibited under the WSOL setup.
View Article and Find Full Text PDFIEEE Trans Pattern Anal Mach Intell
March 2024
Compositional Zero-Shot learning (CZSL) aims to recognize unseen compositions of state and object visual primitives seen during training. A problem with standard CZSL is the assumption of knowing which unseen compositions will be available at test time. In this work, we overcome this assumption operating on the open world setting, where no limit is imposed on the compositional space at test time, and the search space contains a large number of unseen compositions.
View Article and Find Full Text PDFIEEE Trans Pattern Anal Mach Intell
December 2022
Few-shot learning aims to recognize novel classes from a few examples. Although significant progress has been made in the image domain, few-shot video classification is relatively unexplored. We argue that previous methods underestimate the importance of video feature learning and propose to learn spatiotemporal features using a 3D CNN.
View Article and Find Full Text PDFIEEE Trans Pattern Anal Mach Intell
September 2019
Due to the importance of zero-shot learning, i.e., classifying images where there is a lack of labeled training data, the number of proposed approaches has recently increased steadily.
View Article and Find Full Text PDFAttributes act as intermediate representations that enable parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function that measures the compatibility between an image and a label embedding.
View Article and Find Full Text PDFIEEE Trans Pattern Anal Mach Intell
March 2014
We benchmark several SVM objective functions for large-scale image classification. We consider one-versus-rest, multiclass, ranking, and weighted approximate ranking SVMs. A comparison of online and batch methods for optimizing the objectives shows that online methods perform as well as batch methods in terms of classification accuracy, but with a significant gain in training speed.
View Article and Find Full Text PDF