Trans-SVNet: hybrid embedding aggregation Transformer for surgical workflow analysis.

Int J Comput Assist Radiol Surg

Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, HK, China.

Published: December 2022

Purpose: Real-time surgical workflow analysis has been a key component for computer-assisted intervention system to improve cognitive assistance. Most existing methods solely rely on conventional temporal models and encode features with a successive spatial-temporal arrangement. Supportive benefits of intermediate features are partially lost from both visual and temporal aspects. In this paper, we rethink feature encoding to attend and preserve the critical information for accurate workflow recognition and anticipation.

Methods: We introduce Transformer in surgical workflow analysis, to reconsider complementary effects of spatial and temporal representations. We propose a hybrid embedding aggregation Transformer, named Trans-SVNet, to effectively interact with the designed spatial and temporal embeddings, by employing spatial embedding to query temporal embedding sequence. We jointly optimized by loss objectives from both analysis tasks to leverage their high correlation.

Results: We extensively evaluate our method on three large surgical video datasets. Our method consistently outperforms the state-of-the-arts across three datasets on workflow recognition task. Jointly learning with anticipation, recognition results can gain a large improvement. Our approach also shows its effectiveness on anticipation with promising performance achieved. Our model achieves a real-time inference speed of 0.0134 second per frame.

Conclusion: Experimental results demonstrate the efficacy of our hybrid embeddings integration by rediscovering the crucial cues from complementary spatial-temporal embeddings. The better performance by multi-task learning indicates that anticipation task brings the additional knowledge to recognition task. Promising effectiveness and efficiency of our method also show its promising potential to be used in operating room.

Download full-text PDF

Source
http://dx.doi.org/10.1007/s11548-022-02743-8DOI Listing

Publication Analysis

Top Keywords

surgical workflow
12
workflow analysis
12
hybrid embedding
8
embedding aggregation
8
aggregation transformer
8
transformer surgical
8
workflow recognition
8
spatial temporal
8
recognition task
8
workflow
5

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!