Text-based video segmentation aims to segment an actor in video sequences by specifying the actor and the action it performs with a textual query. Previous methods fail to explicitly align the video content with the textual query in a fine-grained manner according to the actor and its action, owing to the problem of semantic asymmetry: the two modalities contain different amounts of semantic information during multi-modal fusion. To alleviate this problem, we propose a novel actor and action modular network that localizes the actor and its action in two separate modules. Specifically, we first learn the actor-/action-related content from the video and the textual query, and then match them in a symmetrical manner to localize the target tube. The target tube, which contains the desired actor and action, is then fed into a fully convolutional network to predict segmentation masks of the actor. Our method also establishes the association of objects across multiple frames with the proposed temporal proposal aggregation mechanism, which enables it to segment the video effectively and keep predictions temporally consistent. The whole model allows joint learning of actor-action matching and segmentation, and achieves state-of-the-art performance for both single-frame segmentation and full video segmentation on the A2D Sentences and J-HMDB Sentences datasets.
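
The modular matching idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each candidate tube and the textual query have already been encoded into separate actor and action feature vectors, and it uses cosine similarity and a plain sum of the two module scores as stand-ins for the learned matching. The names `cosine`, `match_tube`, and the toy feature values are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_tube(tubes, query_actor, query_action):
    """Score each tube with separate actor and action modules, then pick the best.

    Assumption: the two module scores are simply added; the real model
    learns both the features and how the scores are combined.
    """
    scores = [
        cosine(t["actor"], query_actor) + cosine(t["action"], query_action)
        for t in tubes
    ]
    best = max(range(len(tubes)), key=scores.__getitem__)
    return best, scores


# Toy features (hypothetical): actor axis 0 = "dog", axis 1 = "person";
# action axis 0 = "jumping", axis 1 = "running".
tubes = [
    {"actor": [1.0, 0.0], "action": [0.0, 1.0]},  # dog, running
    {"actor": [1.0, 0.0], "action": [1.0, 0.0]},  # dog, jumping
    {"actor": [0.0, 1.0], "action": [1.0, 0.0]},  # person, jumping
]

# Query "dog jumping", split into an actor part and an action part.
best, scores = match_tube(tubes, query_actor=[1.0, 0.0], query_action=[1.0, 0.0])
```

Because the actor and the action are scored separately, a tube that matches only one of them (the running dog, or the jumping person) ranks below the tube that matches both.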

Source
http://dx.doi.org/10.1109/TIP.2022.3185487
