Owing to the rapid growth of surveillance and Web videos, video moment localization, an important branch of video content analysis, has attracted wide attention from both industry and academia in recent years. It is, however, a non-trivial task because of three challenges: temporal context modeling, intelligent moment candidate generation, and the efficiency and scalability required in practice. To address these challenges, we present a deep end-to-end cross-modal hashing network. Specifically, we first design a video encoder based on a bidirectional temporal convolutional network that simultaneously generates moment candidates and learns their representations. Because the video encoder characterizes temporal contextual structures over time windows at multiple scales, it yields enhanced moment representations. As a counterpart, we design an independent query encoder for understanding user intention. A cross-modal hashing module then projects these two heterogeneous representations into a shared isomorphic Hamming space for compact hash code learning. The relevance score of each "moment-query" pair can then be estimated effectively via the Hamming distance. Besides being effective, our model is far more efficient and scalable, since the hash codes of videos can be learned offline. Experimental results on real-world datasets demonstrate the superiority of our model over several state-of-the-art competitors.
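
The retrieval step described above, scoring "moment-query" pairs by Hamming distance over precomputed hash codes, can be illustrated with a minimal sketch. This is not the authors' implementation: the sign-threshold binarization, the 4-bit code length, the function names, and the toy embedding values are all assumptions made for illustration, and the learned encoders are replaced by hand-written vectors.

```python
import numpy as np


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Turn real-valued embeddings into {0, 1} codes by thresholding at zero.

    Assumption: a simple sign threshold stands in for the learned hashing module.
    """
    return (embeddings > 0).astype(np.uint8)


def hamming_distances(query_code: np.ndarray, moment_codes: np.ndarray) -> np.ndarray:
    """Count differing bits between one query code and every moment code."""
    return np.count_nonzero(moment_codes != query_code, axis=1)


def rank_moments(query_code: np.ndarray, moment_codes: np.ndarray):
    """Return moment indices sorted by increasing Hamming distance, plus the distances."""
    dists = hamming_distances(query_code, moment_codes)
    order = np.argsort(dists, kind="stable")  # stable sort keeps ties in index order
    return order, dists


# Hypothetical moment embeddings; in practice their codes would be computed offline.
moment_embeddings = np.array([
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, 0.8, -0.1],
    [0.2, -0.6, 0.1, -0.9],
])
moment_codes = binarize(moment_embeddings)

# Hypothetical query embedding, hashed at query time.
query_embedding = np.array([0.7, -0.1, 0.3, -0.4])
query_code = binarize(query_embedding)

order, dists = rank_moments(query_code, moment_codes)
print("distances:", dists.tolist())
print("ranking:", order.tolist())
```

Because only the query needs to be hashed at search time, the per-query cost in this sketch reduces to bitwise comparisons against the stored moment codes, which is the source of the efficiency the text refers to.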

Source
http://dx.doi.org/10.1109/TIP.2021.3073867
