Camouflaged object detection (COD) and salient object detection (SOD) are two distinct yet closely related computer vision tasks that have been widely studied over the past decades. Although both aim to segment an image into binary foreground and background regions, COD focuses on concealed objects hidden in the image, while SOD concentrates on the most prominent objects in the image. Building universal segmentation models is currently a hot topic in the community. Previous works achieved good performance on a specific task by stacking various hand-designed modules and multi-scale features. However, these careful task-specific designs also cost them their potential as general-purpose architectures. Therefore, we hope to build general architectures that can be applied to both tasks. In this work, we propose a simple yet effective network (SENet) based on the vision Transformer (ViT). By employing a simple asymmetric ViT-based encoder-decoder structure, we obtain competitive results on both tasks, exhibiting greater versatility than meticulously crafted models. To enhance the performance of universal architectures on both tasks, we propose several general methods targeting difficulties common to the two tasks. First, we use image reconstruction as an auxiliary task during training to increase the difficulty of training, forcing the network to develop a better perception of the image as a whole, which in turn helps segmentation. In addition, we propose a local information capture module (LICM) to compensate for the limitations of the patch-level attention mechanism in the pixel-level COD and SOD tasks, and a dynamic weighted loss (DW loss) to address the problem that samples with small targets are harder to locate and segment in both tasks. Finally, we conduct a preliminary exploration of joint training, attempting to use one model to complete both tasks simultaneously.
Extensive experiments on multiple benchmark datasets demonstrate the effectiveness of our method. The code is available at https://github.com/linuxsino/SENet.
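
To illustrate the general idea of weighting samples by target size, the following is a minimal sketch and not the DW loss defined in the paper, whose exact formulation is not given in this abstract. The weighting rule `1 + alpha * (1 - foreground_ratio)`, the parameter `alpha`, and all function names are assumptions introduced here for illustration only.

```python
import math


def foreground_ratio(mask):
    """Fraction of pixels labelled foreground in a binary mask (list of rows)."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


def sample_weight(mask, alpha=1.0):
    """Hypothetical rule: smaller targets get larger weights, in [1, 1 + alpha]."""
    return 1.0 + alpha * (1.0 - foreground_ratio(mask))


def bce(pred, mask, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and a mask."""
    losses = []
    for p_row, m_row in zip(pred, mask):
        for p, m in zip(p_row, m_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            losses.append(-(m * math.log(p) + (1 - m) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


def size_weighted_loss(preds, masks, alpha=1.0):
    """Weighted average of per-sample BCE, emphasising small-target samples."""
    weights = [sample_weight(m, alpha) for m in masks]
    losses = [bce(p, m) for p, m in zip(preds, masks)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Two toy 2x2 samples: one small target, one large target.
small_target = [[1, 0], [0, 0]]
large_target = [[1, 1], [1, 0]]
uniform_pred = [[0.5, 0.5], [0.5, 0.5]]
loss = size_weighted_loss([uniform_pred, uniform_pred],
                          [small_target, large_target])
```

Under this assumed rule, a sample whose target covers a quarter of the image receives a weight of 1.75, while one covering three quarters receives 1.25, so errors on small targets contribute more to the batch loss.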

Source: http://dx.doi.org/10.1109/TIP.2025.3528347
