Gastrointestinal (GI) disease examination presents significant challenges to doctors due to the intricate structure of the human digestive system. Colonoscopy and wireless capsule endoscopy are the most commonly used tools for GI examination. However, the large amount of data generated by these technologies requires the expertise and intervention of doctors for disease identification, making manual analysis a very time-consuming task. Thus, the development of a computer-assisted system is highly desirable to assist clinical professionals in making decisions in a low-cost and effective way. In this paper, we introduce a novel framework called InCoLoTransNet, designed for polyp segmentation. The study is based on a transformer and convolution-involution neural network, following the encoder-decoder architecture. We employed the vision transformer in the encoder section to focus on the global context, while the decoder involves a convolution-involution collaboration for resampling the polyp features. Involution enhances the model's ability to adaptively capture spatial and contextual information, while convolution focuses on local information, leading to more accurate feature extraction. The essential features captured by the transformer encoder are passed to the decoder through two skip connection pathways. The CBAM module refines the features and passes them to the convolution block, leveraging attention mechanisms to emphasize relevant information. Meanwhile, locality self-attention is employed to pass essential features to the involution block, reinforcing the model's ability to capture more global features in the polyp regions. Experiments were conducted on five public datasets: CVC-ClinicDB, CVC-ColonDB, Kvasir-SEG, Etis-LaribPolypDB, and CVC-300. The results obtained by InCoLoTransNet are optimal when compared with 15 state-of-the-art methods for polyp segmentation, achieving the highest mean dice score of 93% on CVC-ColonDB and 90% on mean intersection over union, outperforming the state-of-the-art methods. Additionally, InCoLoTransNet distinguishes itself in terms of polyp segmentation generalization performance. It achieved high scores in mean dice coefficient and mean intersection over union on unseen datasets as follows: 85% and 79% on CVC-ColonDB, 91% and 87% on CVC-300, and 79% and 70% on Etis-LaribPolypDB, respectively.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1007/s10278-025-01389-7 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!