Graph Neural Networks (GNNs) are neural networks that learn the representation of nodes and associated edges that connect it to every other node while maintaining graph representation. Graph Convolutional Neural Networks (GCNs), as a representative method in GNNs, in the context of computer vision, utilize conventional Convolutional Neural Networks (CNNs) to process data supported by graphs. This paper proposes a one-stage GCN approach for 3D object detection and poses estimation by structuring non-linearly distributed points of a graph. Our network provides the required details to analyze, generate and estimate bounding boxes by spatially structuring the input data into graphs. Our method proposes a keypoint attention mechanism that aggregates the relative features between each point to estimate the category and pose of the object to which the vertices of the graph belong, and also designs nine degrees of freedom of multi-object pose estimation. In addition, to avoid gimbal lock in 3D space, we use quaternion rotation, instead of Euler angle. Experimental results showed that memory usage and efficiency could be improved by aggregating point features from the point cloud and their neighbors in a graph structure. Overall, the system achieved comparable performance against state-of-the-art systems.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9656959 | PMC |
http://dx.doi.org/10.3390/s22218166 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!