Improving Surgical Scene Semantic Segmentation through a Deep Learning Architecture with Attention to Class Imbalance.

Claudio Urrea, Yainet Garcia-Garcia, John Kern

Biomedicines

Electrical Engineering Department, Faculty of Engineering, University of Santiago of Chile, Las Sophoras 165, Estación Central, Santiago 9170020, Chile.

Published: June 2024

This article addresses the semantic segmentation of laparoscopic surgery images, with special emphasis on structures that have a smaller number of observations. The study proposes adjustment parameters for deep neural network architectures that enable robust segmentation of all structures in the surgical scene. The U-Net architecture with five encoder-decoders (U-Net5ed), SegNet-VGG19, and DeepLabv3+ with different backbones are implemented. Three main experiments are conducted, using the Rectified Linear Unit (ReLU), Gaussian Error Linear Unit (GELU), and Swish activation functions. The loss functions applied are Cross Entropy (CE), Focal Loss (FL), Tversky Loss (TL), Dice Loss (DiL), Cross Entropy Dice Loss (CEDL), and Cross Entropy Tversky Loss (CETL). The performance of the Stochastic Gradient Descent with momentum (SGDM) and Adaptive Moment Estimation (Adam) optimizers is compared. Qualitative and quantitative results confirm that the DeepLabv3+ and U-Net5ed architectures yield the best results. The DeepLabv3+ architecture with the ResNet-50 backbone, Swish activation function, and CETL loss function reports a Mean Accuracy (MAcc) of 0.976 and a Mean Intersection over Union (MIoU) of 0.977. For structures with a smaller number of observations, such as the hepatic vein, cystic duct, liver ligament, and blood, the obtained results are very competitive and promising compared to the consulted literature. The selected parameters were validated in the YOLOv9 architecture, which showed an improvement in semantic segmentation compared to the results obtained with the original architecture.
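
To make the combined loss and the reported metric more concrete, the following is a minimal NumPy sketch of a Cross Entropy Tversky Loss and of Mean Intersection over Union. It is not the authors' implementation: the abstract does not give the exact formulation, so the function names, the equal weighting of the two terms, and the Tversky parameters `alpha` and `beta` are illustrative assumptions.

```python
import numpy as np


def cross_entropy(probs, onehot, eps=1e-7):
    """Mean pixel-wise cross entropy; probs and onehot have shape (pixels, classes)."""
    return float(-np.mean(np.sum(onehot * np.log(probs + eps), axis=1)))


def tversky_loss(probs, onehot, alpha=0.3, beta=0.7, eps=1e-7):
    """One minus the mean per-class Tversky index.

    alpha weights false positives and beta weights false negatives.
    The default values are assumptions, not taken from the article.
    """
    tp = np.sum(probs * onehot, axis=0)
    fp = np.sum(probs * (1.0 - onehot), axis=0)
    fn = np.sum((1.0 - probs) * onehot, axis=0)
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return float(1.0 - np.mean(index))


def ce_tversky_loss(probs, onehot, weight=0.5, alpha=0.3, beta=0.7):
    """Weighted sum of cross entropy and Tversky loss (weighting is hypothetical)."""
    ce = cross_entropy(probs, onehot)
    tl = tversky_loss(probs, onehot, alpha, beta)
    return weight * ce + (1.0 - weight) * tl


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in either the prediction or the target."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (target == c))
        union = np.sum((pred == c) | (target == c))
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))


# Toy example: four pixels, two classes, class 1 is the target for three of them.
target = np.array([0, 1, 1, 1])
onehot = np.eye(2)[target]
all_background = np.eye(2)[np.zeros(4, dtype=int)]  # predicts class 0 everywhere

loss_perfect = ce_tversky_loss(onehot, onehot)
loss_missed = ce_tversky_loss(all_background, onehot)
miou = mean_iou(np.array([0, 0, 1, 1]), target, num_classes=2)
```

Setting `beta` larger than `alpha` penalizes false negatives more heavily than false positives, which is one common way to counter class imbalance for structures that occupy few pixels.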

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11201157
DOI: http://dx.doi.org/10.3390/biomedicines12061309

Similar Publications

Data-Efficient Bone Segmentation Using Feature Pyramid-Based SegFormer.

Sensors (Basel)

December 2024

Master's Program in Information and Computer Science, Doshisha University, Kyoto 610-0394, Japan.

Naohiro Masuda, Keiko Ono, Daisuke Tawara, Yusuke Matsuura, Kentaro Sakabe

The semantic segmentation of bone structures demands pixel-level classification accuracy to create reliable bone models for diagnosis. While Convolutional Neural Networks (CNNs) are commonly used for segmentation, they often struggle with complex shapes due to their focus on texture features and limited ability to incorporate positional information. As orthopedic surgery increasingly requires precise automatic diagnosis, we explored SegFormer, an enhanced Vision Transformer model that better handles spatial awareness in segmentation tasks.

DSC-SeNet: Unilateral Network with Feature Enhancement and Aggregation for Real-Time Segmentation of Carbon Trace in the Oil-Immersed Transformer.

Sensors (Basel)

December 2024

State Grid Tianjin Electric Power Research Institute, Tianjin 300180, China.

Liqing Liu, Hongxin Ji, Junji Feng, Xinghua Liu, Chi Zhang

Large oil-immersed transformers have metal-enclosed shells, making it difficult to visually inspect the internal insulation condition. Visual inspection of internal defects is carried out using a self-developed micro-robot in this work. Carbon trace is the main visual characteristic of internal insulation defects.

Health & Gait: a dataset for gait-based analysis.

Sci Data

January 2025

University of Cordoba, Department of Computing and Numerical Analysis, Córdoba, 14071, Spain.

Jorge Zafra-Palma, Nuria Marín-Jiménez, José Castro-Piñero, Magdalena Cuenca-García, Rafael Muñoz-Salinas

Acquiring gait metrics and anthropometric data is crucial for evaluating an individual's physical status. Automating this assessment process alleviates the burden on healthcare professionals and accelerates patient monitoring. Current automation techniques depend on specific, expensive systems such as OptoGait or MuscleLAB, which necessitate training and physical space.

Enhanced diagnosis of pes planus and pes cavus using deep learning-based segmentation of weight-bearing lateral foot radiographs: a comparative observer study.

Biomed Eng Lett

January 2025

Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea.

Seung Min Ryu, Keewon Shin, Soo Wung Shin, Sun Ho Lee, Su Min Seo

A weight-bearing lateral radiograph (WBLR) of the foot is a gold standard for diagnosing adult-acquired flatfoot deformity. However, it is difficult to measure the major axis of bones in WBLR without using auxiliary lines. Herein, we develop semantic segmentation with a deep learning model (DLm) on the WBLR of the foot for enhanced diagnosis of pes planus and pes cavus.

EGNet: 3D Semantic Segmentation Through Point-Voxel-Mesh Data for Euclidean-Geodesic Feature Fusion.

Sensors (Basel)

December 2024

School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China.

Qi Li, Yu Song, Xiaoqian Jin, Yan Wu, Hang Zhang

With the advancement of service robot technology, the demand for higher boundary precision in indoor semantic segmentation has increased. Traditional methods of extracting Euclidean features using point cloud and voxel data often neglect geodesic information, reducing boundary accuracy for adjacent objects and consuming significant computational resources. This study proposes a novel network, the Euclidean-geodesic network (EGNet), which uses point cloud-voxel-mesh data to characterize detail, contour, and geodesic features, respectively.
