Semi-supervised video object segmentation (VOS) is the task of predicting the segmentation of a target object throughout a video, given a ground-truth segmentation mask for the target in the first frame. Recently, space-time memory networks (STM) have received significant attention as a promising approach to semi-supervised VOS. However, an important point has been overlooked in applying STM to VOS: the solution (STM) is non-local, whereas the problem (VOS) is predominantly local. To resolve this mismatch, we propose new VOS networks called the kernelized memory network (KMN) and KMN with multiple kernels. Our networks perform not only Query-to-Memory matching but also Memory-to-Query matching. In Memory-to-Query matching, a kernel is employed to reduce the degree of non-localness of the STM. In addition, we present a Hide-and-Seek strategy in pre-training to handle occlusions effectively. The proposed networks surpass the state-of-the-art results on standard benchmarks by a significant margin (+4% in J on the DAVIS 2017 test-dev set). The runtimes of KMN and KMN with multiple kernels on the DAVIS 2016 validation set are 0.12 and 0.13 seconds per frame, respectively, which is similar to the computation time of STM.
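
The two matching directions described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the kernel, so a 2D Gaussian centred on each memory location's best-matching query position is assumed here, and the function name `kernelized_memory_read`, its arguments, and the default `sigma` are all hypothetical.

```python
import numpy as np


def kernelized_memory_read(key_m, val_m, key_q, height, width, sigma=7.0):
    """Read memory values for each query location (illustrative sketch).

    key_m: (P, d) memory keys, val_m: (P, c) memory values,
    key_q: (Q, d) query keys with Q == height * width.
    The Gaussian kernel and sigma are assumptions, not taken from the abstract.
    """
    d = key_m.shape[1]

    # Correlation between every memory location p and every query location q.
    corr = key_m @ key_q.T  # (P, Q)

    # Memory-to-Query matching: best-matching query position per memory location.
    best_q = corr.argmax(axis=1)  # (P,)
    best_y, best_x = np.divmod(best_q, width)

    # Assumed 2D Gaussian kernel centred on that best-matching position.
    ys, xs = np.divmod(np.arange(height * width), width)
    dist2 = (ys[None, :] - best_y[:, None]) ** 2 + (xs[None, :] - best_x[:, None]) ** 2
    kernel = np.exp(-dist2 / (2.0 * sigma ** 2))  # (P, Q)

    # Query-to-Memory matching: softmax over memory locations,
    # reweighted by the kernel so that distant matches are suppressed.
    logits = corr / np.sqrt(d)
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits) * kernel
    weights = weights / weights.sum(axis=0, keepdims=True)

    # Each query location receives a convex combination of memory values.
    return weights.T @ val_m  # (Q, c)
```

With a very large `sigma` the kernel becomes nearly flat and the read reduces to a plain softmax over all memory locations, which is the fully non-local behaviour the abstract attributes to STM; a smaller `sigma` makes the read more local.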

Source
http://dx.doi.org/10.1109/TPAMI.2022.3163375
