Automatic skin lesion analysis, comprising skin lesion segmentation and disease classification, is of great importance. However, both tasks are challenging because skin lesion images of multi-ethnic populations are collected with various scanners at multiple international medical institutes. To address these challenges, most recent works adopt convolutional neural networks (CNNs) for skin lesion analysis. However, due to the intrinsic locality of the convolution operator, CNNs lack the ability to capture contextual information and long-range dependencies. To improve on the baseline performance established by CNNs, we propose a Fully Transformer Network (FTN) that learns long-range contextual information for skin lesion analysis. FTN is a hierarchical Transformer that computes features using a Spatial Pyramid Transformer (SPT). SPT has linear computational complexity because it introduces a spatial pyramid pooling (SPP) module into multi-head attention (MHA) to substantially reduce computation and memory usage. We conduct extensive skin lesion analysis experiments on the ISIC 2018 dataset to verify the effectiveness and efficiency of FTN. Our experimental results show that FTN consistently outperforms state-of-the-art CNNs in terms of computational efficiency and the number of tunable parameters, owing to our efficient SPT and hierarchical network structure. The code and models will be publicly available at: https://github.com/Novestars/Fully-Transformer-Network.
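
To make the complexity argument concrete, the following is a minimal NumPy sketch of attention whose keys and values come from spatially pyramid-pooled features. It is an illustration under stated assumptions, not the authors' implementation: the pyramid levels `(1, 2, 3, 6)`, the single attention head, the choice to pool the input before the key/value projections, and all function names are hypothetical. With N query tokens and a fixed number M of pooled tokens, the attention matrix has N x M entries rather than N x N, so the cost grows linearly in N.

```python
import numpy as np


def adaptive_avg_pool2d(x, out_size):
    """Average-pool an (H, W, C) feature map to (out_size, out_size, C)."""
    h, w, c = x.shape
    pooled = np.empty((out_size, out_size, c), dtype=x.dtype)
    for i in range(out_size):
        # Bin edges: floor for the start, ceil for the end, so bins are never empty.
        r0, r1 = (i * h) // out_size, -((-(i + 1) * h) // out_size)
        for j in range(out_size):
            c0, c1 = (j * w) // out_size, -((-(j + 1) * w) // out_size)
            pooled[i, j] = x[r0:r1, c0:c1].mean(axis=(0, 1))
    return pooled


def pyramid_tokens(x, levels=(1, 2, 3, 6)):
    """Pool at several grid sizes and concatenate into (M, C) tokens.

    The levels are an assumption for illustration; M = sum(s * s for s in levels).
    """
    c = x.shape[-1]
    return np.concatenate(
        [adaptive_avg_pool2d(x, s).reshape(-1, c) for s in levels], axis=0
    )


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def pooled_attention(x, w_q, w_k, w_v, levels=(1, 2, 3, 6)):
    """Single-head attention: queries from all N = H*W tokens, keys/values from M pooled tokens."""
    h, w, c = x.shape
    q = x.reshape(h * w, c) @ w_q                    # (N, d)
    p = pyramid_tokens(x, levels)                    # (M, C)
    k, v = p @ w_k, p @ w_v                          # (M, d) each
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (N, M) instead of (N, N)
    return (attn @ v).reshape(h, w, -1), attn


rng = np.random.default_rng(0)
H, W, C, D = 16, 16, 8, 8
x = rng.standard_normal((H, W, C))
w_q, w_k, w_v = (rng.standard_normal((C, D)) for _ in range(3))
out, attn = pooled_attention(x, w_q, w_k, w_v)
print(out.shape, attn.shape)  # (16, 16, 8) (256, 50)
```

In this sketch M = 1 + 4 + 9 + 36 = 50 regardless of image size, so doubling the image side length quadruples N and the attention cost, whereas full self-attention would grow sixteenfold.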
DOI: http://dx.doi.org/10.1016/j.media.2022.102357