IEEE Trans Pattern Anal Mach Intell
October 2024
Achieving generalization for deep learning models has usually suffered from the bottleneck of annotated sample scarcity. As a common way of tackling this issue, few-shot learning focuses on "episodes", i.e.
IEEE Trans Pattern Anal Mach Intell
December 2024
Reconstructing a 3D shape based on a single sketch image is challenging due to the inherent sparsity and ambiguity present in sketches. Existing methods lose fine details when extracting features to predict 3D objects from sketches. Upon analyzing the 3D-to-2D projection process, we observe that the density map, characterizing the distribution of 2D point clouds, can serve as a proxy to facilitate the reconstruction process.
IEEE Trans Pattern Anal Mach Intell
September 2024
The main challenge for fine-grained few-shot image classification is to learn feature representations with higher inter-class and lower intra-class variations from only a few labelled samples. Conventional few-shot learning methods, however, cannot be naively adopted for this fine-grained setting - a quick pilot study reveals that they in fact push for the opposite (i.e.
The problem of sketch semantic segmentation is far from being solved. Although existing methods exhibit near-saturating performance on simple sketches with high recognisability, they suffer serious setbacks when the target sketches are products of an imaginative process with a high degree of creativity. We hypothesise that human creativity, being highly individualistic, induces a significant shift in the distribution of sketches, leading to poor model generalisation.
IEEE Trans Pattern Anal Mach Intell
May 2024
Despite great strides made on fine-grained visual classification (FGVC), current methods are still heavily reliant on fully-supervised paradigms where ample expert labels are called for. Semi-supervised learning (SSL) techniques, which acquire knowledge from unlabeled data, provide a promising way forward and have shown great promise for coarse-grained problems. However, existing SSL paradigms mostly assume in-category (i.
IEEE Trans Image Process
August 2023
Sketch is by now a well-researched topic in the vision community. Sketch semantic segmentation, in particular, serves as a fundamental step towards finer-level sketch interpretation. Recent works use various means of extracting discriminative features from sketches and have achieved considerable improvements in segmentation accuracy.
IEEE Trans Image Process
August 2023
Source-Free Domain Adaptation (SFDA) is becoming topical as a way to address the distribution shift between training and deployment data while also relaxing the requirement that source data be available during target domain adaptation. In this paper, we focus on SFDA for semantic segmentation, in which target-domain self-training based on pseudo labeling is a common solution. However, pseudo labels generated by the source models are particularly unreliable on the target domain data because of the domain shift.
IEEE Trans Image Process
June 2023
Generalized Few-shot Semantic Segmentation (GFSS) aims to segment each image pixel into either base classes with abundant training examples or novel classes with only a handful of (e.g., 1-5) training images per class.
As powerful as fine-grained visual classification (FGVC) is, responding to your query with a bird name such as "Whip-poor-will" or "Mallard" probably does not make much sense. This, however commonly accepted in the literature, underlines a fundamental question at the interface of AI and humans - what constitutes transferable knowledge for humans to learn from AI? This paper sets out to answer this very question using FGVC as a test bed. Specifically, we envisage a scenario where a trained FGVC model (the AI expert) functions as a knowledge provider, enabling average people (you and me) to become better domain experts ourselves.
Most existing studies on unsupervised domain adaptation (UDA) assume that each domain's training samples come with domain labels (e.g., painting, photo).
We present the first one-shot personalized sketch segmentation method. We aim to segment all sketches belonging to the same category, given a single exemplar sketch with a part annotation, while (i) preserving the part semantics embedded in the exemplar, and (ii) being robust to input style and abstraction. We refer to this scenario as personalized.
Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present. The recent prevalence of touchscreen devices has made sketch creation a much easier task than ever and consequently made sketch-oriented applications increasingly popular. The progress of deep learning has immensely benefited free-hand sketch research and applications.
IEEE Trans Image Process
January 2022
In this paper, we aim to explore the fine-grained perception ability of deep models for the newly proposed scene sketch semantic segmentation task. Scene sketches are abstract drawings containing multiple related objects. They play a vital role in daily communication and human-computer interaction.
IEEE Trans Pattern Anal Mach Intell
December 2022
Fine-grained visual classification (FGVC) is much more challenging than traditional classification tasks due to the inherently subtle intra-class object variations. Recent works are mainly part-driven (either explicitly or implicitly), with the assumption that fine-grained information naturally rests within the parts. In this paper, we take a different stance, and show that part operations are not strictly necessary - the key lies with encouraging the network to learn at different granularities and progressively fusing multi-granularity features together.
IEEE Trans Pattern Anal Mach Intell
December 2022
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications. The task of FGIA targets analyzing visual objects from subordinate categories, e.g.
In this paper we study, for the first time, the problem of fine-grained sketch-based 3D shape retrieval. We advocate the use of sketches as a fine-grained input modality to retrieve 3D shapes at instance-level - e.g.
Given pixel-level annotated data, traditional photo segmentation techniques have achieved promising results. However, these photo segmentation models can only identify objects in categories for which data annotation and training have been carried out. This limitation has inspired recent work on few-shot and zero-shot learning for image segmentation.
IEEE Trans Image Process
February 2020
The key to solving fine-grained image categorization is finding discriminative local regions that correspond to subtle visual traits. Great strides have been made, with complex networks designed specifically to learn part-level discriminative feature representations. In this paper, we show that it is possible to cultivate subtle details without the need for overly complicated network designs or training mechanisms - a single loss is all it takes.
Human free-hand sketches provide useful data for studying human perceptual grouping, where grouping principles such as the Gestalt laws of grouping are naturally in play during both the perception and sketching stages. In this paper, we make the first attempt to develop a universal sketch perceptual grouper, that is, a grouper that can be applied to sketches of any category, created with any drawing style and ability, to group constituent strokes/segments into semantically meaningful object parts.
IEEE Trans Neural Netw Learn Syst
February 2019
In this paper, we develop a novel variational Bayesian learning method for the Dirichlet process (DP) mixture of inverted Dirichlet distributions, which has been shown to be very flexible for modeling vectors with positive elements. The recently proposed extended variational inference (EVI) framework is adopted to derive an analytically tractable solution. The convergence of the proposed algorithm is theoretically guaranteed by introducing a single lower-bound approximation to the original objective function in the EVI framework.
IEEE Trans Image Process
December 2017
We study the problem of fine-grained sketch-based image retrieval. By performing instance-level (rather than category-level) retrieval, it embodies a timely and practical application, particularly with the ubiquitous availability of touchscreens. Three factors contribute to the challenging nature of the problem: 1) free-hand sketches are inherently abstract and iconic, making visual comparisons with photos difficult; 2) sketches and photos are in two different visual domains, i.
This paper shows that shape classification is a useful tool in nonphotorealistic rendering (NPR) from photographs. Our classifier takes regions from an image segmentation hierarchy as input and outputs the "best" fitting simple shape, such as a circle, square, or triangle. Other approaches to NPR have recognized the benefits of segmentation, but none have classified the shape of segments.
Finding meaningful groupings of image primitives has been a long-standing problem in computer vision. This paper studies how salient groupings can be produced using established theories in the field of visual perception alone. The major contribution is a novel definition of the Gestalt principle of Prägnanz, based upon Koffka's definition that image descriptions should be both stable and simple.
IEEE Trans Vis Comput Graph
September 2007
We introduce a simple but versatile camera model that we call the Rational Tensor Camera (RTcam). RTcams are well principled mathematically and provably subsume several important contemporary camera models in both computer graphics and vision; their generality is one contribution. They can be used alone or compounded to produce more complicated visual effects.