Based on deep learning, monocular visual 3D reconstruction methods have been applied in various conventional fields. In the aspect of medical endoscopic imaging, due to the difficulty in obtaining real information, self-supervised deep learning has always been a focus of research. However, current research on endoscopic 3D reconstruction is mainly conducted in laboratory environments, lacking experience in dealing with complex clinical surgical environments. In this work, we use an optical flow-based neural network to address the problem of inconsistent brightness between frames. Additionally, attention modules and inter-layer losses are introduced to tackle the complexity of endoscopic scenes in clinical surgeries. The attention mechanism allows the network to better focus on pixel texture details and depth differences, while the inter-layer losses supervise the network at different scales. We have established a complete monocular endoscopic 3D reconstruction framework and conducted quantitative experiments on a clinical dataset using the cross-correlation coefficient as a metric. Compared with other self-supervised methods, our framework can better simulate the mapping relationship between adjacent frames during endoscope motion. To validate the generalization performance of our framework, we tested the model trained on the clinical dataset on the SCARED dataset and achieved equally excellent results.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10713920 | PMC |
http://dx.doi.org/10.1007/s13755-023-00262-7 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!