MP-FocalUNet: Multiscale parallel focal self-attention U-Net for medical image segmentation.

Comput Methods Programs Biomed

Cardiovascular Research Centre, Royal Brompton Hospital, London SW3 6NP, United Kingdom; National Heart and Lung Institute, Imperial College London, London SW7 2AZ, United Kingdom.

Published: December 2024

AI Article Synopsis

  • The proposed MP-FocalUNet uses dual-scale sub-networks on the encoder side to extract information at different scales, fuses them with a cross-scale "Feature Fusion" module, and uses focal self-attention on the decoder side to capture global dependencies.
  • On abdominal organ segmentation datasets and the automatic cardiac diagnosis challenge datasets, MP-FocalUNet achieves average Dice scores of 82.45% and 91.44%, which are 2.68% and 0.35% higher than HC-Net, respectively.

Article Abstract

Background And Objective: Medical image segmentation has improved significantly in recent years with the progress of Convolutional Neural Networks (CNNs). Because of the inherent limitations of convolutional operations, however, CNNs are poor at learning correlations between global and long-range features. Some existing solutions address this by building deep encoders and adding down-sampling operations, but such methods tend to produce redundant network structures and lose local details. Medical image segmentation therefore needs solutions that improve the modeling of global context while keeping a strong grasp of low-level details.

Methods: We propose a novel multiscale parallel branch architecture (MP-FocalUNet). On the encoder side of MP-FocalUNet, dual-scale sub-networks extract information at different scales. A cross-scale "Feature Fusion" (FF) module is proposed to exploit the dual-branch design and make full use of the feature representations at different scales. On the decoder side, focal self-attention is used in parallel with a traditional CNN for long-distance modeling, which can effectively capture global dependencies and underlying spatial details in a shallower way.
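The abstract does not spell out how focal self-attention works, so the following is only a minimal 1D sketch of the general idea: each query attends to nearby tokens at full resolution and to the whole sequence through average-pooled (coarse) tokens. It is not the MP-FocalUNet implementation. The function names, the `radius` and `pool` parameters, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(tokens: List[Vector], size: int) -> List[Vector]:
    """Average consecutive groups of `size` tokens into coarse tokens."""
    pooled = []
    for start in range(0, len(tokens), size):
        group = tokens[start:start + size]
        dim = len(group[0])
        pooled.append([sum(t[d] for t in group) / len(group) for d in range(dim)])
    return pooled


def focal_attention_1d(tokens: List[Vector], radius: int = 1, pool: int = 2) -> List[Vector]:
    """Simplified focal attention (assumption: queries = keys = values = tokens).

    Each query sees fine tokens within `radius` positions plus coarse,
    pooled summaries of the whole sequence.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    coarse = mean_pool(tokens, pool)  # coarse-grained global context
    out = []
    for i, q in enumerate(tokens):
        lo, hi = max(0, i - radius), min(len(tokens), i + radius + 1)
        keys = tokens[lo:hi] + coarse  # fine local tokens + coarse global tokens
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)])
    return out


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
out = focal_attention_1d(tokens, radius=1, pool=2)
```

With four tokens, `radius=1`, and `pool=2`, each query attends to at most five keys (three fine, two coarse) rather than mixing every token at full resolution, which is the cost saving that motivates the coarse-grained branch on longer sequences.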

Results: Our proposed method is evaluated on abdominal organ segmentation datasets and on the automatic cardiac diagnosis challenge datasets. It consistently outperforms several state-of-the-art segmentation methods, with an average Dice score of 82.45% (2.68% higher than HC-Net) on the abdominal organ datasets and 91.44% (0.35% higher than HC-Net) on the automatic cardiac diagnosis challenge datasets.
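For readers unfamiliar with the metric, the Dice score for one class is 2|P ∩ T| / (|P| + |T|), where P and T are the predicted and reference pixel sets, and an average Dice score is the mean over the evaluated classes. The sketch below computes it on flattened label masks. The function names, the toy masks, and the convention for a class absent from both masks are assumptions, and the abstract does not describe the authors' evaluation code.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for one class label over flattened masks."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    p = sum(1 for v in pred if v == label)
    t = sum(1 for v in target if v == label)
    inter = sum(1 for a, b in zip(pred, target) if a == label and b == label)
    if p + t == 0:
        # Assumed convention: a class absent from both masks counts as a perfect match.
        return 1.0
    return 2.0 * inter / (p + t)


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Average the per-class Dice scores over the given labels."""
    return sum(dice_score(pred, target, lab) for lab in labels) / len(labels)


# Toy flattened masks: 0 = background, 1 and 2 = two hypothetical organ classes.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]

print(round(dice_score(pred, target, 1), 4))      # 0.6667
print(round(dice_score(pred, target, 2), 4))      # 0.8
print(round(mean_dice(pred, target, [1, 2]), 4))  # 0.7333
```

Background is left out of the average here, a common choice in multi-organ benchmarks, though whether the reported figures follow it is not stated in the abstract.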

Conclusions: Our MP-FocalUNet is a novel encoder-decoder-based multiscale parallel branch Transformer network that addresses the insufficient long-distance modeling of CNNs and fuses image information at different scales. Extensive experiments on abdominal and cardiac medical image segmentation tasks show that MP-FocalUNet outperforms other state-of-the-art methods. Future work will focus on designing more lightweight Transformer-based models and on better learning the pixel-level intrinsic structural features generated by patch division in visual Transformers.


Source
http://dx.doi.org/10.1016/j.cmpb.2024.108562

