Few-shot object detection (FSOD) aims to detect objects of novel classes from only a few annotated samples. Most recent FSOD methods follow a two-stage learning paradigm: knowledge learned from abundant base classes is transferred to the few-shot detector by learning global features. However, these approaches seldom consider localizing objects from local parts to the global whole. Because training data is scarce in FSOD, the training samples of novel classes typically capture only part of an object, so such methods fail to detect the complete object at test time when some of its regions were never seen in training. To tackle this problem, we propose an Extensible Co-Existing Attention (ECEA) module that enables the model to infer the global object from its local parts. Specifically, we first devise an extensible attention mechanism that starts with a local region and extends attention to co-existing regions that are similar and adjacent to it. We then apply this mechanism at different feature scales to progressively discover the full object under various receptive fields. During training, the model learns this extensible ability in the base stage with abundant samples and transfers it to the novel stage, where extensible learning continues, helping the few-shot model quickly adapt to extending local regions to co-existing regions. Extensive experiments on the PASCAL VOC and COCO datasets show that the ECEA module helps the few-shot detector predict the complete object even when some regions do not appear in the training samples, and achieves new state-of-the-art results compared with existing FSOD methods. Code is released at https://github.com/zhimengXin/ECEA.
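The idea of starting from a local region and extending attention to similar, adjacent regions can be illustrated with a toy region-growing routine. This is only a minimal sketch under stated assumptions, not the ECEA module itself: the function names `cosine` and `extend_region`, the cosine-similarity criterion, the 4-neighbour adjacency, the `threshold` value, and the toy `grid` are all hypothetical choices made for illustration, and the sketch works at a single feature scale rather than the multiple scales the abstract describes.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extend_region(features, seed, threshold=0.9, max_steps=10):
    """Grow a region from `seed` over a 2D grid of feature vectors.

    At each step, every cell that is adjacent (4-neighbourhood) to the
    current region and similar enough to the region's mean feature is
    added. Growth stops when no cell qualifies or after `max_steps`.
    """
    height, width = len(features), len(features[0])
    dim = len(features[0][0])
    region = {seed}
    for _ in range(max_steps):
        # Mean feature of the cells collected so far.
        mean = [
            sum(features[r][c][k] for r, c in region) / len(region)
            for k in range(dim)
        ]
        frontier = set()
        for r, c in region:
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < height and 0 <= nc < width
                if inside and (nr, nc) not in region:
                    if cosine(features[nr][nc], mean) >= threshold:
                        frontier.add((nr, nc))
        if not frontier:
            break
        region |= frontier
    return region


# Toy 3x3 feature map: the top-left 2x2 block plays the "object",
# the remaining cells play the "background".
grid = [
    [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]],
    [[1.0, 0.05], [0.9, 0.1], [0.0, 1.0]],
    [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],
]

if __name__ == "__main__":
    print(sorted(extend_region(grid, seed=(0, 0))))
```

Starting from the single seed cell, the region expands only into neighbouring cells whose features resemble what has already been collected, which mirrors the local-to-global intuition in the abstract. A learned attention module would replace the fixed threshold with trainable weights.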


Source: http://dx.doi.org/10.1109/TIP.2024.3411771

Similar Publications

Article Synopsis
  • Fetal multi-anatomical structure detection in ultrasound images helps sonographers make more accurate diagnoses by revealing the relationships between anatomical structures.
  • Deep learning has improved the detection of these structures, but challenges remain when dealing with rare diseases due to data scarcity, prompting interest in few-shot learning techniques.
  • The proposed TKR-FSOD method enhances fetal anatomical structure detection using a Topological Knowledge Reasoning Module and a Discriminate Ability Enhanced Feature Learning Module, outpacing existing methods with a significant performance improvement.

Improved region proposal network for enhanced few-shot object detection.

Neural Netw

December 2024

Department of Computer Science, University of Southern California, 3650 McClintock Avenue, Los Angeles, 90089, CA, USA. Electronic address:

Despite significant success of deep learning in object detection tasks, the standard training of deep neural networks requires access to a substantial quantity of annotated images across all classes. Data annotation is an arduous and time-consuming endeavor, particularly when dealing with infrequent objects. Few-shot object detection (FSOD) methods have emerged as a solution to the limitations of classic object detection approaches based on deep learning.


Few-shot object detection (FSOD), which detects novel objects from only a few training instances, has recently attracted growing attention. Previous works focus on making the most of the label information of objects, but they fail to consider the structural and semantic information of the image itself and do not efficiently resolve misclassification between data-abundant base classes and data-scarce novel classes.

Article Synopsis
  • Few-shot object detection (FSOD) focuses on developing a detector that can quickly adapt to new tasks with limited data, but existing methods often struggle with efficiency, particularly in terms of speed and computational demands.
  • The authors introduce a new efficient pretrain-transfer framework (PTF) that maintains performance similar to state-of-the-art methods while not increasing computational costs, and they enhance the framework with a knowledge inheritance (KI) initializer to speed up adaptation.
  • Their approach demonstrates significant improvements in adaptation speed (1.8-100× faster) on public benchmarks like PASCAL VOC, COCO, and LVIS, marking a novel effort to address efficiency in FSOD.
