Point cloud instance segmentation has achieved huge progress with the emergence of deep learning. However, these methods are usually data-hungry, relying on expensive and time-consuming dense point cloud annotations. To alleviate the annotation cost, the use of unlabeled or weakly labeled data remains underexplored for this task. In this paper, we introduce the first semi-supervised point cloud instance segmentation framework (SPIB) using both labeled and unlabeled bounding boxes as supervision. To be specific, our SPIB architecture involves a two-stage learning procedure. In stage one, a bounding box proposal generation network is trained under a semi-supervised setting with perturbation consistency regularization (SPCR). The regularization works by enforcing invariance of the bounding box predictions over different perturbations applied to the input point clouds, which provides self-supervision for network learning. In stage two, the bounding box proposals with SPCR are grouped into subsets, and the instance masks are mined inside each subset with a novel semantic propagation module and a property consistency graph module. Moreover, we introduce a novel occupancy ratio guided refinement module to refine the instance masks. Extensive experiments on the challenging ScanNet v2 dataset demonstrate that our method achieves competitive performance compared with recent fully-supervised methods.
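The perturbation consistency idea in stage one can be sketched in a few lines. The NumPy mock-up below is illustrative only, not the paper's implementation: the `perturb` function, its jitter and rotation parameters, and the toy centroid predictor used in the usage note are all assumptions; the actual SPCR network predicts full bounding boxes and the exact perturbation set is defined in the paper.

```python
import numpy as np

def perturb(points, rng, jitter_sigma=0.01, max_angle=np.pi / 18):
    """Randomly rotate the cloud about the z-axis and add Gaussian jitter.

    Rotation and jitter are common point cloud perturbations; the paper's
    exact perturbation set is an assumption here.
    """
    theta = rng.uniform(-max_angle, max_angle)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    # Each row p becomes R @ p, then per-point Gaussian noise is added.
    return points @ rot.T + rng.normal(0.0, jitter_sigma, points.shape), rot

def consistency_loss(boxes_clean, boxes_perturbed, rot):
    """Penalize disagreement between the clean and perturbed branches.

    Box centers predicted on the perturbed cloud are mapped back into the
    original frame with the inverse rotation (R is orthogonal, so the
    inverse acts as a right-multiplication by `rot` on row vectors), so the
    loss measures only genuine prediction inconsistency.
    """
    aligned = boxes_perturbed @ rot
    return float(np.mean((boxes_clean - aligned) ** 2))
```

As a sanity check, using the point centroid as a stand-in "box center predictor" (which is equivariant to the rotation and nearly invariant to the averaged-out jitter) yields a consistency loss close to zero, which is exactly the invariance the regularizer enforces during training.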


Source

DOI: http://dx.doi.org/10.1109/TPAMI.2021.3131120


Similar Publications

Although the Transformer architecture has become the de facto standard for natural language processing tasks, it still has few applications in computer vision. In vision, attention is either used in conjunction with convolutional networks or substituted for individual components of convolutional networks while keeping the overall network design intact. Differences between the two domains, such as large variations in the scale of visual entities and the higher granularity of pixels in images compared to words in text, make it difficult to transfer Transformer from language to vision.


Cross-Modal Collaboration and Robust Feature Classifier for Open-Vocabulary 3D Object Detection.

Sensors (Basel)

January 2025

The 54th Research Institute, China Electronics Technology Group Corporation, College of Signal and Information Processing, Shijiazhuang 050081, China.

The multi-sensor fusion, such as LiDAR and camera-based 3D object detection, is a key technology in autonomous driving and robotics. However, traditional 3D detection models are limited to recognizing predefined categories and struggle with unknown or novel objects. Given the complexity of real-world environments, research into open-vocabulary 3D object detection is essential.


Segment Any Leaf 3D: A Zero-Shot 3D Leaf Instance Segmentation Method Based on Multi-View Images.

Sensors (Basel)

January 2025

School of Electronic and Communication Engineering, Sun Yat-sen University, Shenzhen 518000, China.

Exploring the relationships between plant phenotypes and genetic information requires advanced phenotypic analysis techniques for precise characterization. However, the diversity and variability of plant morphology challenge existing methods, which often fail to generalize across species and require extensive annotated data, especially for 3D datasets. This paper proposes a zero-shot 3D leaf instance segmentation method using RGB sensors.


Topography estimation is essential for autonomous off-road navigation. Common methods rely on point cloud data from, e.g.


Terrestrial laser scanners (TLS) are portable dimensional measurement instruments used to obtain 3D point clouds of objects in a scene. While TLSs do not require the use of cooperative targets, they are sometimes placed in a scene to fuse or compare data from different instruments or data from the same instrument but from different positions. A contrast target is an example of such a target; it consists of alternating black/white squares that can be printed using a laser printer.

