Structure-from-Motion (SfM) aims to recover 3D scene structure and camera poses from correspondences between input images. The ambiguity caused by duplicate structures (i.e., different structures with strong visual resemblance) therefore often leads to incorrect camera poses and 3D structures. To deal with this ambiguity, most existing studies resort to additional constraints or to implicit inference based on two-view geometries or feature points. In this paper, we propose to exploit high-level information in the scene, i.e., the spatial contextual information of local regions, to guide the reconstruction. Specifically, we propose a novel structure, the track-community, in which each community consists of a group of tracks and represents a local segment of the scene. A community detection algorithm is run on the track-graph to partition the scene into segments. Potentially ambiguous segments are then detected by analyzing the neighborhood of tracks and corrected by checking pose consistency. Finally, we perform partial reconstruction on each segment and align the partial reconstructions with a novel bidirectional consistency cost function that considers both 3D-3D correspondences and pairwise relative camera poses. Experimental results demonstrate that our approach robustly alleviates reconstruction failures caused by visually indistinguishable structures and accurately merges the partial reconstructions.
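
The partitioning step can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the track-graph is built or which community detection algorithm is used. The sketch assumes that two tracks are linked when they are observed together in at least `min_shared` images, and it swaps in a simple deterministic label propagation as the community detector. The names `build_track_graph`, `detect_communities`, and `min_shared` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_track_graph(tracks, min_shared=2):
    """Link two tracks when they co-occur in at least `min_shared` images.

    `tracks` maps a track id to the set of image ids that observe it.
    The edge weight is the number of shared images (an assumed definition).
    """
    graph = defaultdict(dict)
    for a, b in combinations(sorted(tracks), 2):
        shared = len(tracks[a] & tracks[b])
        if shared >= min_shared:
            graph[a][b] = shared
            graph[b][a] = shared
    return graph


def detect_communities(tracks, graph, max_iters=20):
    """Deterministic label propagation (a stand-in community detector).

    Each track repeatedly adopts the label with the largest total edge
    weight among its neighbours; ties go to the smallest label.
    """
    labels = {t: t for t in tracks}
    for _ in range(max_iters):
        changed = False
        for t in sorted(tracks):
            scores = defaultdict(int)
            for neighbour, weight in graph.get(t, {}).items():
                scores[labels[neighbour]] += weight
            if not scores:
                continue  # isolated track keeps its own label
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[t]:
                labels[t] = best
                changed = True
        if not changed:
            break
    groups = defaultdict(list)
    for t, lab in labels.items():
        groups[lab].append(t)
    return sorted(sorted(g) for g in groups.values())


# Toy data: tracks 0-2 are seen around images 0-3, tracks 3-5 around 10-12.
tracks = {
    0: {0, 1, 2},
    1: {0, 1, 2},
    2: {1, 2, 3},
    3: {10, 11, 12},
    4: {10, 11, 12},
    5: {11, 12, 3},
}
graph = build_track_graph(tracks, min_shared=2)
communities = detect_communities(tracks, graph)
print(communities)  # [[0, 1, 2], [3, 4, 5]]
```

Each resulting group plays the role of one scene segment in this toy setting. Detecting ambiguous segments and aligning the partial reconstructions are separate steps that the sketch does not cover.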

Source
http://dx.doi.org/10.1109/TIP.2024.3364843
