For more efficient generalization to unseen domains (classes), most Few-shot Segmentation (FSS) methods directly exploit pre-trained encoders and fine-tune only the decoder, especially in the current era of large models. However, such fixed feature encoders tend to be class-agnostic, inevitably activating objects that are irrelevant to the target class. In contrast, humans can effortlessly focus on specific objects in their line of sight. This paper mimics the visual perception pattern of human beings and proposes a novel and powerful prompt-driven scheme, called "Prompt and Transfer" (PAT), which constructs a dynamic class-aware prompting paradigm to tune the encoder to focus on the object of interest (the target class) in the current task. Three key components enhance the prompting: 1) cross-modal linguistic information is introduced to initialize prompts for each task; 2) Semantic Prompt Transfer (SPT) precisely transfers the class-specific semantics within the images to the prompts; 3) Part Mask Generator (PMG) works in conjunction with SPT to adaptively generate different but complementary part prompts for different individuals. Surprisingly, PAT achieves competitive performance on 4 different tasks, including standard FSS, Cross-domain FSS (e.g., CV, medical, and remote sensing domains), Weak-label FSS, and Zero-shot Segmentation, setting a new state of the art on 11 benchmarks.
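The abstract does not specify how SPT moves class-specific semantics from image features into a prompt. As a rough illustration only, one common way to do something of this kind is mask-restricted cross-attention: the prompt attends to pixel features inside a target-class mask and is blended with the pooled result. The function name `transfer_semantics`, the `momentum` blend, and the toy data are hypothetical and are not taken from the paper.

```python
import math


def transfer_semantics(prompt, features, mask, momentum=0.5):
    """Illustrative sketch (not the paper's SPT): update one prompt vector
    using attention restricted to target-class pixels.

    prompt:   list of d floats (e.g. initialised from a class-name embedding)
    features: list of n pixel feature vectors, each of length d
    mask:     list of n ints, 1 for target-class pixels, 0 otherwise
    momentum: hypothetical blend factor between old prompt and pooled features
    """
    d = len(prompt)
    foreground = [i for i, m in enumerate(mask) if m]
    if not foreground:
        return list(prompt)  # no target pixels, nothing to transfer

    # Scaled dot-product scores between the prompt and foreground pixels only.
    scores = [
        sum(p * f for p, f in zip(prompt, features[i])) / math.sqrt(d)
        for i in foreground
    ]

    # Numerically stable softmax over the foreground scores.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Attention-weighted average of the foreground features.
    pooled = [
        sum(w * features[i][k] for w, i in zip(weights, foreground))
        for k in range(d)
    ]

    # Blend the pooled semantics into the prompt.
    return [(1 - momentum) * p + momentum * q for p, q in zip(prompt, pooled)]


# Toy example: the first pixel is background and is ignored by the mask.
updated = transfer_semantics(
    prompt=[1.0, 0.0],
    features=[[0.0, 1.0], [2.0, 0.0], [2.0, 0.0]],
    mask=[0, 1, 1],
)
```

Restricting attention to the mask is what makes the update class-aware in this sketch: background features never contribute to the prompt, regardless of how similar they are to it.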

Source
http://dx.doi.org/10.1109/TPAMI.2024.3461779

Publication Analysis

Top Keywords

dynamic class-aware: 8
few-shot segmentation: 8
target class: 8
prompt-and-transfer dynamic: 4
class-aware enhancement: 4
enhancement few-shot: 4
segmentation efficient: 4
efficient generalization: 4
generalization unseen: 4
unseen domains: 4

Similar Publications

In leveraging manifold learning in domain adaptation (DA), graph embedding-based DA methods have shown their effectiveness in preserving the data manifold through the Laplace graph. However, current graph embedding DA methods suffer from two issues: 1) they are only concerned with preserving the underlying data structures in the embedding and ignore sub-domain adaptation, which requires taking into account intra-class similarity and inter-class dissimilarity, thereby leading to negative transfer; 2).

It is vital to develop general models that can be shared across subjects and sessions in the real-world deployment of electroencephalogram (EEG) emotion recognition systems. Many prior studies have exploited domain adaptation algorithms to alleviate the inter-subject and inter-session discrepancies of EEG distributions. However, these methods only aligned the global domain divergence, but overlooked the local domain divergence with respect to each emotion category.

Unsupervised domain selective graph convolutional network for preoperative prediction of lymph node metastasis in gastric cancer.

Med Image Anal

July 2022

National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, Marshall Laboratory of Biomedical Engineering, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518060, China.

Preoperative prediction of lymph node (LN) metastasis based on computed tomography (CT) scans is an important task in gastric cancer, but few machine learning-based techniques have been proposed. While multi-center datasets increase sample size and representation ability, they suffer from inter-center heterogeneity. To tackle the above issue, we propose a novel multi-source domain adaptation framework for this diagnosis task, which not only considers domain-invariant and domain-specific features, but also achieves the imbalanced knowledge transfer and class-aware feature alignment across domains.
