Purpose: In laparoscopic surgery, accurate 3D reconstruction from endoscopic video is crucial for effective image-guided techniques. Current methods for monocular depth estimation (MDE) face challenges in complex surgical scenes, including limited training data, specular reflections, and varying illumination conditions.
Methods: We propose SfMDiffusion, a novel diffusion-based self-supervised framework for MDE. Our approach combines: (1) a denoising diffusion process guided by pseudo-ground-truth depth maps, (2) knowledge distillation from a pre-trained teacher model, and (3) discriminative priors to enhance estimation robustness. Our design enables accurate depth estimation without requiring ground-truth depth data during training.
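As an illustration of how such a combined objective might look, the following is a minimal sketch (not the authors' implementation): a hypothetical loss that mixes a diffusion denoising term with a distillation term pulling the student's depth toward the teacher's pseudo-ground-truth depth. The function name, the L2/L1 choices, and the `alpha` weighting are all assumptions for clarity.

```python
import numpy as np

def combined_loss(student_depth, teacher_depth, noise_pred, noise_true, alpha=0.5):
    """Hypothetical training objective for diffusion-based self-supervised MDE.

    - denoise: mean squared error of the predicted vs. true diffusion noise
      (the standard denoising diffusion objective).
    - distill: mean absolute difference between the student's depth map and
      the teacher's pseudo-ground-truth depth (knowledge distillation).
    `alpha` balances the two terms.
    """
    denoise = np.mean((np.asarray(noise_pred) - np.asarray(noise_true)) ** 2)
    distill = np.mean(np.abs(np.asarray(student_depth) - np.asarray(teacher_depth)))
    return alpha * denoise + (1.0 - alpha) * distill
```

Because the teacher supplies the pseudo-ground-truth, no real depth labels are needed at training time, consistent with the self-supervised design described above.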
Results: Experiments on the SCARED and Hamlyn datasets demonstrate that SfMDiffusion achieves superior performance: an Absolute relative error (Abs Rel) of 0.049, a Squared relative error (Sq Rel) of 0.366, and a Root Mean Square Error (RMSE) of 4.305 on the SCARED dataset, and an Abs Rel of 0.067, a Sq Rel of 0.800, and an RMSE of 7.465 on the Hamlyn dataset.
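The three metrics reported above are the standard monocular-depth-estimation error measures. A minimal reference implementation (standard definitions, not taken from the paper's code):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard MDE evaluation metrics over valid (gt > 0) pixels.

    Abs Rel: mean of |pred - gt| / gt
    Sq Rel:  mean of (pred - gt)^2 / gt
    RMSE:    root of the mean squared error
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    mask = gt > 0  # ignore pixels without valid reference depth
    pred, gt = pred[mask], gt[mask]
    err = pred - gt
    abs_rel = np.mean(np.abs(err) / gt)
    sq_rel = np.mean(err ** 2 / gt)
    rmse = np.sqrt(np.mean(err ** 2))
    return abs_rel, sq_rel, rmse
```

Lower is better for all three; RMSE is in the same units as the depth maps (typically millimeters for SCARED).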
Conclusion: SfMDiffusion provides an innovative approach for 3D reconstruction in image-guided surgical techniques. Future work will focus on computational optimization and validation across diverse surgical scenarios. Our code is available at https://github.com/Skylanding/SfM-Diffusion.
DOI: http://dx.doi.org/10.1007/s11548-025-03333-0