Video Object Segmentation Using Kernelized Memory Network With Multiple Kernels.

IEEE Trans Pattern Anal Mach Intell

Published: February 2023

Semi-supervised video object segmentation (VOS) aims to predict the segment of a target object in a video when a ground truth segmentation mask for the target is given in the first frame. Recently, space-time memory networks (STM) have received significant attention as a promising approach for semi-supervised VOS. However, an important point has been overlooked in applying STM to VOS: the solution (STM) is non-local, but the problem (VOS) is predominantly local. To resolve this mismatch between STM and VOS, we propose new VOS networks called kernelized memory network (KMN) and KMN with multiple kernels. Our networks conduct not only Query-to-Memory matching but also Memory-to-Query matching. In Memory-to-Query matching, a kernel is employed to reduce the degree of non-localness of the STM. In addition, we present a Hide-and-Seek strategy in pre-training to handle occlusions effectively. The proposed networks surpass the previous state-of-the-art results on standard benchmarks by a significant margin (+4% in J on the DAVIS 2017 test-dev set). The runtimes of the proposed KMN and KMN with multiple kernels on the DAVIS 2016 validation set are 0.12 and 0.13 seconds per frame, respectively, and the two networks have similar computation times to STM.
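The abstract does not spell out the formulation, so the following is only a minimal sketch of how a kernel could localize a memory read. It assumes dot-product key similarity, a single Gaussian kernel centred on each memory location's best-matching query position, and a softmax over memory locations. The function name, its parameters, and the Gaussian form are illustrative assumptions, not the authors' implementation.

```python
import math


def kernelized_memory_read(mem_keys, mem_vals, qry_keys, qry_pos, sigma=1.0):
    """Hypothetical kernelized memory read (illustrative only).

    mem_keys: list of P memory key vectors
    mem_vals: list of P memory value vectors
    qry_keys: list of Q query key vectors
    qry_pos:  list of Q (row, col) positions of the query locations
    sigma:    assumed Gaussian kernel width, in pixels
    Returns a list of Q read-out value vectors.
    """
    num_mem, num_qry = len(mem_keys), len(qry_keys)
    dim = len(qry_keys[0])

    # Similarity between every memory location p and query location q.
    sim = [[sum(a * b for a, b in zip(mk, qk)) for qk in qry_keys]
           for mk in mem_keys]

    # Memory-to-Query matching: best-matching query location per memory location.
    best_q = [max(range(num_qry), key=lambda q, row=row: row[q]) for row in sim]

    def kernel(p, q):
        # Gaussian weight that decays with distance from the best match of p.
        (r0, c0), (r1, c1) = qry_pos[best_q[p]], qry_pos[q]
        return math.exp(-((r1 - r0) ** 2 + (c1 - c0) ** 2) / (2 * sigma ** 2))

    outputs = []
    for q in range(num_qry):
        # Query-to-Memory matching: softmax over memory, reweighted by the kernel.
        logits = [sim[p][q] / math.sqrt(dim) for p in range(num_mem)]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - top) * kernel(p, q) for p, l in enumerate(logits)]
        total = sum(weights) or 1.0  # if every weight underflows, output zeros
        val_dim = len(mem_vals[0])
        outputs.append([
            sum(weights[p] * mem_vals[p][k] for p in range(num_mem)) / total
            for k in range(val_dim)
        ])
    return outputs
```

With a very large `sigma` the kernel is close to 1 everywhere and the read reduces to an ordinary non-local softmax read; a small `sigma` suppresses memory locations whose best match lies far from the query position, which is the localizing effect the abstract describes.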

DOI: http://dx.doi.org/10.1109/TPAMI.2022.3163375


