Deformable medical image registration is an essential preprocessing step for several clinical applications. Although existing convolutional neural network and transformer-based methods have achieved promising results, their limited long-range spatial dependence and non-uniform attention span hinder further improvement in registration performance. To address this issue, we proposed a multi-dilation spherical graph transformer (MD-SGT), in which the encoder combined the advantages of convolutional and graph transformer blocks to effectively distinguish the differences between the reference and template images at various scales. Specifically, the features of each voxel were obtained by aggregating information from its neighbors, which were sampled from spherical regions with different dilation rates. The implicit convolutional inductive bias and the long-range uniform attention span induced by this aggregation scheme made the features more representative for registration. Through qualitative and quantitative comparisons with state-of-the-art methods on two datasets, we demonstrated that combining a long-range uniform attention span with inductive bias is beneficial for image registration performance, with the Dice score, average surface distance (ASD) and 95% Hausdorff distance (HD95) improving by at least 0.5%, 2.2% and 1.1%, respectively.
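The neighbor sampling described above can be illustrated with a minimal sketch. It is not the MD-SGT implementation: the function names (`spherical_offsets`, `dilated_neighbors`, `aggregate`), the default radius and dilation rates, and the plain average used in place of learned graph attention are all assumptions made for illustration. It only shows how the same spherical offset pattern, scaled by several dilation rates, reaches both near and distant voxels evenly in every direction.

```python
from itertools import product


def spherical_offsets(radius):
    """Integer offsets (dz, dy, dx) inside a sphere of `radius`, centre excluded."""
    r = int(radius)
    return [
        (dz, dy, dx)
        for dz, dy, dx in product(range(-r, r + 1), repeat=3)
        if 0 < dz * dz + dy * dy + dx * dx <= radius * radius
    ]


def dilated_neighbors(center, shape, radius=1, dilations=(1, 2, 4)):
    """Map each dilation rate to the in-bounds neighbour coordinates of `center`.

    Hypothetical helper: the radius and dilation rates are illustrative defaults,
    not values taken from the paper.
    """
    base = spherical_offsets(radius)
    neighbors = {}
    for d in dilations:
        coords = []
        for off in base:
            # Scale the spherical offset by the dilation rate.
            p = tuple(c + d * o for c, o in zip(center, off))
            # Drop neighbours that fall outside the volume.
            if all(0 <= pi < si for pi, si in zip(p, shape)):
                coords.append(p)
        neighbors[d] = coords
    return neighbors


def aggregate(volume, center, radius=1, dilations=(1, 2, 4)):
    """Average neighbour intensity per dilation rate.

    Assumption: a plain mean stands in for the learned attention weights
    that a graph transformer block would compute.
    """
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    result = {}
    for d, coords in dilated_neighbors(center, shape, radius, dilations).items():
        values = [volume[z][y][x] for z, y, x in coords]
        result[d] = sum(values) / len(values) if values else 0.0
    return result


if __name__ == "__main__":
    # Toy 9x9x9 volume whose intensity equals the z index.
    vol = [[[float(z) for _ in range(9)] for _ in range(9)] for z in range(9)]
    print(aggregate(vol, (4, 4, 4)))
```

Because every dilation rate reuses the same spherical pattern, each scale contributes the same number of neighbors in all directions, which is one simple way to read "uniform attention span"; a real model would replace the mean with attention over learned voxel features.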
DOI: http://dx.doi.org/10.1016/j.compmedimag.2023.102281