Two-stream spatio-temporal GCN-transformer networks for skeleton-based action recognition.

Sci Rep

Guangxi Key Laboratory of Functional Information Materials and Intelligent Information Processing, Nanning, 530000, China.

Published: February 2025

To achieve accurate skeleton-based action recognition, most prior approaches have adopted a serial strategy that combines Graph Convolutional Networks (GCNs) with attention-based methods. However, this strategy often treats the human skeleton as an isolated, complete structure and neglects skeletal parts that are highly correlated yet only indirectly connected, which ultimately hinders recognition accuracy. This study addresses the limitation with a novel architecture, SA-TDGFormer, that arranges GCNs and the Transformer model in parallel. The parallel structure combines the advantages of both models, extracting local and global spatio-temporal features so that motion information is encoded more accurately and recognition performance improves. The proposed model has a dual-stream structure: a spatiotemporal GCN stream and a spatiotemporal Transformer stream. The former captures the topological structure and motion representations of human skeletons, while the latter captures motion representations built from global inter-joint relationships. Because the two streams produce distinct feature representations with limited mutual understanding, the model also applies a late fusion strategy to merge their results. This fusion lets the spatiotemporal GCN and Transformer streams complement each other, enriching action features and maximizing information exchange between the two representation types. Empirical validation on three established benchmark datasets, NTU RGB + D 60, NTU RGB + D 120, and Kinetics-Skeleton, substantiates the model's effectiveness. The experimental results indicate that, compared with existing classification frameworks, the proposed method improves the accuracy of human action recognition by 1-5% on the NTU RGB + D 60 dataset.
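The abstract does not give layer definitions or the fusion formula, so the following is only a minimal sketch of the three ideas it names: local aggregation over bone-connected joints (GCN-style), global aggregation over all joints (self-attention-style), and late fusion of per-stream class scores. All function names, the row-normalized adjacency, the projection-free single-head attention, and the fusion weight `alpha` are illustrative assumptions, not the SA-TDGFormer implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def gcn_aggregate(x, adj):
    """Local mixing: each joint averages over itself and its directly
    connected neighbours (row-normalized adjacency with self-loops)."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1, keepdims=True)
    return (a_hat / deg) @ x


def attention_aggregate(x):
    """Global mixing: every joint attends to every other joint.
    Single head, no learned projections (a simplifying assumption)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    return softmax(scores, axis=-1) @ x


def late_fusion(logits_gcn, logits_tr, alpha=0.5):
    """Weighted sum of per-stream class probabilities.
    alpha is a hypothetical weight, not a value from the paper."""
    return alpha * softmax(logits_gcn) + (1.0 - alpha) * softmax(logits_tr)


# Toy 5-joint chain (0-1-2-3-4); one-hot features expose the mixing pattern.
adj = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[i, j] = adj[j, i] = 1.0
x = np.eye(5)

local = gcn_aggregate(x, adj)      # joint 0 receives nothing from joint 4
global_ = attention_aggregate(x)   # joint 0 receives a share from every joint

# Two streams that disagree on the top class; fusion yields one distribution.
fused = late_fusion(np.array([2.0, 0.0, 0.0]), np.array([0.0, 0.0, 2.0]))
```

In this toy example, `local[0, 4]` is zero because joints 0 and 4 share no bone, whereas `global_[0, 4]` is positive. That contrast is the indirectly connected but correlated case the abstract says a skeleton-only view neglects.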


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11811230
DOI: http://dx.doi.org/10.1038/s41598-025-87752-8


