AI Article Synopsis

  • Surgical scene segmentation is essential for robotic-assisted surgeries but faces challenges due to complex features, artifacts, and boundary indistinctness.
  • A newly proposed model, LSKANet, introduces two modules, the Dual-block Large Kernel Attention module (DLKA) and the Multiscale Affinity Feature Fusion module (MAFF), which use strip convolutions with a large kernel design and affinity-weighted multiscale feature fusion to handle local feature similarity and intraoperative artifacts.
  • On three surgical datasets, LSKANet outperforms previous methods, setting new state-of-the-art results with mIoU gains of 2.6%, 1.4%, and 3.4%, and it is compatible with different backbones, consistently raising their segmentation accuracy.

Article Abstract

Surgical scene segmentation is a critical task in robotic-assisted surgery. However, the complexity of the surgical scene, which mainly includes local feature similarity (e.g., between different anatomical tissues), intraoperative complex artifacts, and indistinguishable boundaries, poses significant challenges to accurate segmentation. To tackle these problems, we propose the Long Strip Kernel Attention network (LSKANet), including two well-designed modules named the Dual-block Large Kernel Attention module (DLKA) and the Multiscale Affinity Feature Fusion module (MAFF), which can implement precise segmentation of surgical images. Specifically, by introducing strip convolutions with different topologies (cascaded and parallel) in two blocks and a large kernel design, DLKA can make full use of region- and strip-like surgical features and extract both visual and structural information to reduce the false segmentation caused by local feature similarity. In MAFF, affinity matrices calculated from multiscale feature maps are applied as feature fusion weights, which helps to address the interference of artifacts by suppressing the activations of irrelevant regions. In addition, a hybrid loss with a Boundary Guided Head (BGH) is proposed to help the network segment indistinguishable boundaries effectively. We evaluate the proposed LSKANet on three datasets with different surgical scenes. The experimental results show that our method achieves new state-of-the-art results on all three datasets with improvements of 2.6%, 1.4%, and 3.4% mIoU, respectively. Furthermore, our method is compatible with different backbones and can significantly increase their segmentation accuracy. Code is available at https://github.com/YubinHan73/LSKANet.
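The abstract names two mechanisms that lend themselves to a brief illustration: strip convolutions with a large kernel design used as attention (the DLKA idea) and affinity matrices computed from multiscale feature maps used as fusion weights (the MAFF idea). The PyTorch-style sketch below is a minimal illustration of those two ideas only, not the released implementation: the module names (StripLargeKernelAttention, AffinityFusion), the kernel length k=11, the sigmoid gating, and the cosine-similarity affinity are all assumptions made for the example; the authors' actual code is at the GitHub link above.

```python
# Minimal sketch of the two ideas named in the abstract:
# (1) strip-convolution attention with a large effective kernel, and
# (2) affinity-weighted fusion of multiscale features.
# Names, kernel sizes, and wiring here are illustrative assumptions only;
# the authors' released code is at https://github.com/YubinHan73/LSKANet.
import torch
import torch.nn as nn
import torch.nn.functional as F


class StripLargeKernelAttention(nn.Module):
    """Cascaded 1xk / kx1 depthwise strip convolutions build a large receptive
    field cheaply; the result gates the input as an attention map (hypothetical layout)."""

    def __init__(self, channels: int, k: int = 11):
        super().__init__()
        pad = k // 2
        # Cascaded topology: horizontal strip followed by vertical strip.
        self.strip_h = nn.Conv2d(channels, channels, (1, k), padding=(0, pad), groups=channels)
        self.strip_v = nn.Conv2d(channels, channels, (k, 1), padding=(pad, 0), groups=channels)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.proj(self.strip_v(self.strip_h(x)))
        return x * torch.sigmoid(attn)  # gate region- and strip-like responses


class AffinityFusion(nn.Module):
    """Fuse a low-resolution feature map into a high-resolution one, weighting the
    upsampled features by their per-pixel affinity (cosine similarity) with the
    high-resolution map so that activations in irrelevant regions are suppressed."""

    def __init__(self, channels: int):
        super().__init__()
        self.out = nn.Conv2d(channels, channels, 1)

    def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        low_up = F.interpolate(low, size=high.shape[-2:], mode="bilinear", align_corners=False)
        # Per-pixel affinity between the two scales, used as a fusion weight.
        affinity = F.cosine_similarity(high, low_up, dim=1, eps=1e-6).unsqueeze(1)
        return self.out(high + affinity * low_up)


if __name__ == "__main__":
    feats_high = torch.randn(1, 64, 64, 64)  # e.g., a stride-4 feature map
    feats_low = torch.randn(1, 64, 32, 32)   # e.g., a stride-8 feature map
    attended = StripLargeKernelAttention(64)(feats_high)
    fused = AffinityFusion(64)(attended, feats_low)
    print(fused.shape)  # torch.Size([1, 64, 64, 64])
```

Depthwise 1xk and kx1 strips approximate a large k x k kernel at a fraction of the parameters, which is one plausible reason the cascaded and parallel strip topologies suit elongated, strip-like surgical structures such as instrument shafts and tissue boundaries.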

Source: http://dx.doi.org/10.1109/TMI.2023.3335406

Publication Analysis

Top Keywords

kernel attention (12), surgical scene (12), long strip (8), strip kernel (8), attention network (8), scene segmentation (8), segmentation surgical (8), local feature (8), feature similarity (8), indistinguishable boundaries (8)
