AI Article Synopsis

  • Most studies on AI for surgical activity recognition have focused on a single type of activity and on small, single-center video datasets, leaving open whether the resulting models generalize to other surgical centers.
  • This research introduces MultiBypass140, a dataset of 140 laparoscopic Roux-en-Y gastric bypass videos from two hospitals, fully annotated with phases and steps by two board-certified surgeons.
  • Findings indicate that training AI models on data from multiple centers improves their generalization, whereas models trained on a single center perform worst when evaluated on the other center.

Article Abstract

Purpose: Most studies on surgical activity recognition utilizing artificial intelligence (AI) have focused mainly on recognizing one type of activity from small and mono-centric surgical video datasets. It remains speculative whether those models would generalize to other centers.

Methods: In this work, we introduce a large multi-centric multi-activity dataset consisting of 140 surgical videos (MultiBypass140) of laparoscopic Roux-en-Y gastric bypass (LRYGB) surgeries performed at two medical centers, i.e., the University Hospital of Strasbourg, France (StrasBypass70) and Inselspital, Bern University Hospital, Switzerland (BernBypass70). The dataset has been fully annotated with phases and steps by two board-certified surgeons. Furthermore, we assess the generalizability and benchmark different deep learning models for the task of phase and step recognition in 7 experimental studies: (1) Training and evaluation on BernBypass70; (2) Training and evaluation on StrasBypass70; (3) Training and evaluation on the joint MultiBypass140 dataset; (4) Training on BernBypass70, evaluation on StrasBypass70; (5) Training on StrasBypass70, evaluation on BernBypass70; (6) Training on MultiBypass140, evaluation on BernBypass70; and (7) Training on MultiBypass140, evaluation on StrasBypass70.
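
The seven train/evaluation pairings can be written down as a small lookup table, which makes it easier to see which experiments test cross-center generalization. The sketch below is only an illustration: the dataset names come from the abstract, while the `Experiment` class, the setting labels, and the `by_setting` helper are hypothetical and are not taken from the authors' published code.

```python
from dataclasses import dataclass

# Dataset names are from the abstract; all other names are illustrative.
JOINT = "MultiBypass140"  # the union of BernBypass70 and StrasBypass70


@dataclass(frozen=True)
class Experiment:
    number: int
    train_on: str
    evaluate_on: str

    @property
    def setting(self) -> str:
        """Label the experiment by how training and evaluation data relate."""
        if self.train_on == self.evaluate_on:
            return "same-source"
        if self.train_on == JOINT:
            return "multi-centric training"
        return "cross-center"


# The seven experimental studies, numbered as in the abstract.
EXPERIMENTS = [
    Experiment(1, "BernBypass70", "BernBypass70"),
    Experiment(2, "StrasBypass70", "StrasBypass70"),
    Experiment(3, JOINT, JOINT),
    Experiment(4, "BernBypass70", "StrasBypass70"),
    Experiment(5, "StrasBypass70", "BernBypass70"),
    Experiment(6, JOINT, "BernBypass70"),
    Experiment(7, JOINT, "StrasBypass70"),
]


def by_setting(setting: str) -> list[int]:
    """Return the experiment numbers that fall under a given setting label."""
    return [e.number for e in EXPERIMENTS if e.setting == setting]


if __name__ == "__main__":
    for e in EXPERIMENTS:
        print(f"({e.number}) train={e.train_on:<15} eval={e.evaluate_on:<15} {e.setting}")
```

Under these labels, experiments (4) and (5) are the cross-center tests, and (6) and (7) evaluate a jointly trained model on each center separately.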

Results: Model performance is markedly influenced by the training data. The worst results were obtained in experiments (4) and (5), confirming the limited generalization capabilities of models trained on mono-centric data. The use of multi-centric training data, as in experiments (6) and (7), improves the generalization capabilities of the models, bringing them beyond the level of independent mono-centric training and validation (experiments (1) and (2)).

Conclusion: MultiBypass140 shows considerable variation in surgical technique and workflow of LRYGB procedures between centers. Therefore, generalization experiments demonstrate a remarkable difference in model performance. These results highlight the importance of multi-centric datasets for AI model generalization to account for variance in surgical technique and workflows. The dataset and code are publicly available at https://github.com/CAMMA-public/MultiBypass140.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11541311
DOI: http://dx.doi.org/10.1007/s11548-024-03166-3
