Action Recognition by an Attention-Aware Temporal Weighted Convolutional Neural Network.

Sensors (Basel)

Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China.

Published: June 2018

Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs). However, effective and efficient methods for incorporating temporal information into CNNs are still being actively explored. Motivated by the recurrent attention models popular in natural language processing, we propose the Attention-aware Temporal Weighted CNN (ATW CNN) for action recognition in videos, which embeds a visual attention model into a temporal weighted multi-stream CNN. This attention model is implemented simply as temporal weighting, yet it effectively boosts the recognition performance of video representations. In addition, each stream in the proposed ATW CNN framework can be trained end to end, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with back-propagation. Our experimental results on the UCF-101 and HMDB-51 datasets show that the proposed attention mechanism contributes substantially to the performance gains by focusing on the more relevant, more discriminative video snippets.
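As a rough illustration of attention expressed as temporal weighting, the sketch below fuses per-snippet class scores into one video-level score vector using normalized weights. It is not the authors' implementation: the abstract does not say how the weights are normalized or at which layer fusion happens, so the softmax normalization, the fusion over class scores, and the names `temporal_weighted_scores`, `snippet_scores`, and `temporal_logits` are all assumptions made for this example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_weighted_scores(snippet_scores, temporal_logits):
    """Fuse per-snippet class scores into a video-level score vector.

    snippet_scores: one list of class scores per snippet (hypothetical input).
    temporal_logits: one unnormalized weight per snippet. In a trained model
    these would be learned by SGD; here they are plain numbers (assumption).
    """
    if len(snippet_scores) != len(temporal_logits):
        raise ValueError("need exactly one temporal logit per snippet")
    weights = softmax(temporal_logits)  # assumed normalization
    num_classes = len(snippet_scores[0])
    fused = [0.0] * num_classes
    for weight, scores in zip(weights, snippet_scores):
        for cls, score in enumerate(scores):
            fused[cls] += weight * score
    return fused, weights


# Hypothetical example: 3 snippets, 2 action classes.
snippet_scores = [[0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]

# Equal logits reduce to plain averaging over snippets.
fused_uniform, weights_uniform = temporal_weighted_scores(
    snippet_scores, [0.0, 0.0, 0.0]
)

# A larger logit shifts the fused scores toward that snippet.
fused_focused, weights_focused = temporal_weighted_scores(
    snippet_scores, [0.0, 0.0, 5.0]
)
```

With equal logits the fusion is an ordinary mean over snippets, so any gain from unequal weights comes from emphasizing some snippets over others, which is the effect the abstract attributes to the attention mechanism.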


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6069475
DOI: http://dx.doi.org/10.3390/s18071979

Publication Analysis

Top Keywords

action recognition: 12
temporal weighted: 12
attention-aware temporal: 8
convolutional neural: 8
atw cnn: 8
attention model: 8
temporal: 6
recognition attention-aware: 4
weighted convolutional: 4
neural network: 4

Similar Publications

The safety and efficiency of assembly lines are critical to manufacturing, but human supervisors cannot oversee all activities simultaneously. This study addresses this challenge by performing a comparative study to construct an initial real-time, semi-supervised temporal action recognition setup for monitoring worker actions on assembly lines. Various feature extractors and localization models were benchmarked using a new assembly dataset, with the I3D model achieving an average mAP@IoU=0.


Instruction-induced modulation of the visual stream during gesture observation.

Neuropsychologia

January 2025

Neuroscience Area, SISSA, Trieste, Italy; Dipartimento di Medicina dei Sistemi, Università di Roma-Tor Vergata, Roma, Italy.

Although gesture observation tasks are believed to invariably activate the action-observation network (AON), we investigated whether the activation of different cognitive mechanisms when processing identical stimuli with different explicit instructions modulates AON activations. Accordingly, 24 healthy right-handed individuals observed gestures and they processed both the actor's moved hand (hand laterality judgment task, HT) and the meaning of the actor's gesture (meaning task, MT). The main brain-level result was that the HT (vs MT) differentially activated the left and right precuneus, the left inferior parietal lobe, the left and right superior parietal lobe, the middle frontal gyri bilaterally and the left precentral gyrus.


The association of the pathogenesis of neurodegenerative diseases, depression, anxiety, and cognitive disorders with neurotrophin-3 deficiency determines the prospect of creating drugs with a similar mechanism of action. Since the use of full-length NT-3 is limited by unsatisfactory pharmacokinetic properties, the creation of low-molecular mimetics of neurotrophin-3 that are active when administered systemically is relevant. The Federal Research Center for Innovator and Emerging Biomedical and Pharmaceutical Technologies has created a dimeric dipeptide mimetic of the 4th loop of NT-3, hexamethylenediamide bis-(N-γ-oxybutyryl-L-glutamyl-L-asparagine) with the laboratory code GTS-302, which activates TrkC and TrkB receptors.


Visual Noise Mask for Human Point-Light Displays: A Coding-Free Approach.

NeuroSci

January 2025

Psychological Neuroscience Laboratory, Psychology Research Center, School of Psychology, University of Minho, Rua da Universidade, 4710-057 Braga, Portugal.

Human point-light displays consist of luminous dots representing human articulations, thus depicting actions without pictorial information. These stimuli are widely used in action recognition experiments. Because humans excel in decoding human motion, point-light displays (PLDs) are often masked with additional moving dots (noise masks), thereby challenging stimulus recognition.


Introduction: An effective vaccination policy must be implemented to prevent foot-and-mouth disease (FMD). However, the currently used vaccines for FMD have several limitations, including induction of humoral rather than cellular immune responses.

Methods: To overcome these shortcomings, we assessed the efficacy of levamisole, a small-molecule immunomodulator, as an adjuvant for the FMD vaccine.

