In recent years, the field of object detection has made significant progress. Most state-of-the-art object detectors owe their success to feature pyramids and carefully designed anchor boxes. However, current methods of constructing a feature pyramid usually integrate multi-scale representations blindly at each level of the feature hierarchy. Furthermore, these detectors suffer from drawbacks introduced by hand-designed anchors. To mitigate these adverse effects, we introduce a one-stage object detector named the semi-anchor-free network with enhanced feature pyramid (SAFNet). Specifically, to construct a better feature pyramid, we propose a novel enhanced feature pyramid generation paradigm, which mainly consists of two modules: an adaptive feature fusion module (AFFM) and a self-enhanced module (SEM). The paradigm adaptively integrates multi-scale representations in a non-linear manner while suppressing redundant semantic information at each pyramid level, so that a clean and enhanced feature pyramid is obtained. In addition, an adaptive anchor generator (AAG) is designed to yield fewer but more suitable anchor boxes for each input image. Benefiting from the enhanced feature pyramid, AAG can generate more accurate anchor boxes while introducing only a few priors. AAG therefore alleviates the drawbacks caused by preset anchor hyper-parameters and helps to reduce the computation cost. Extensive experiments demonstrate the effectiveness of our approach. Benefiting from the proposed modules, SAFNet significantly boosts detection performance, achieving 2 points and 2.1 points higher Average Precision (AP) than RetinaNet (our baseline) on PASCAL VOC and MS COCO, respectively. Code will be made publicly available soon.
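To make the idea of adaptive, non-linear multi-scale fusion more concrete, here is a minimal sketch in plain Python. It is not the authors' AFFM, whose design the abstract does not specify. It assumes one common form of adaptive fusion: every pyramid level is resized to a target resolution and combined with per-position softmax weights. The names `resize_nearest`, `softmax`, `fuse_levels`, and the `score_maps` input are hypothetical; in a trained detector the scores would presumably be predicted by a small learned layer.

```python
import math


def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2-D feature map given as a list of lists."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(levels, score_maps, target):
    """Fuse all pyramid levels at the resolution of levels[target].

    levels[k] and score_maps[k] are 2-D lists of the same shape.
    score_maps is a hypothetical input standing in for learned fusion scores.
    """
    out_h, out_w = len(levels[target]), len(levels[target][0])
    feats = [resize_nearest(f, out_h, out_w) for f in levels]
    scores = [resize_nearest(s, out_h, out_w) for s in score_maps]
    fused = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # The softmax makes the fusion weights a non-linear function of the scores.
            weights = softmax([s[i][j] for s in scores])
            row.append(sum(w * f[i][j] for w, f in zip(weights, feats)))
        fused.append(row)
    return fused


# Two toy pyramid levels: a 2x2 map and a coarser 1x1 map.
levels = [[[1.0, 1.0], [1.0, 1.0]], [[3.0]]]
equal_scores = [[[0.0, 0.0], [0.0, 0.0]], [[0.0]]]
fused_equal = fuse_levels(levels, equal_scores, target=0)
```

With equal scores, each level contributes half of the fused value at every position. Raising the scores of one level shifts the weights toward it, which is the sense in which the fusion adapts to its input rather than summing levels with fixed coefficients.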

Source
http://dx.doi.org/10.1109/TIP.2020.3028196
