3-D lane detection is a challenging task due to the diversity of lanes, occlusion, dazzling light, and other factors. Traditional methods usually rely on highly specialized handcrafted features and carefully designed postprocessing to detect lanes. However, these methods rest on strong assumptions and a single modality, so they scale poorly and deliver limited performance. In this article, a multimodal fusion network (MFNet) is proposed that uses multihead nonlocal attention and a feature pyramid for 3-D lane detection. It includes three parts: a multihead deformable transformation (MDT) module, a multidirectional attention feature pyramid fusion (MA-FPF) module, and a top-view lane prediction (TLP) module. First, MDT is presented to learn and mine multimodal features from RGB images, depth maps, and point cloud data (PCD) for optimal lane feature extraction. Then, MA-FPF is designed to fuse multiscale features and prevent lane features from vanishing as the network deepens. Finally, TLP is developed to estimate 3-D lanes and predict their positions. Experimental results on the 3-D lane synthetic and ONCE-3DLanes datasets demonstrate that the proposed MFNet outperforms state-of-the-art methods in both qualitative and quantitative analyses as well as visual comparisons.
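
The abstract only names the three stages, so the following PyTorch sketch illustrates one plausible way they could compose: per-modality encoders fused by multihead nonlocal attention (MDT), a top-down pyramid fusion that keeps shallow lane cues alive (MA-FPF), and an anchor-based 3-D regression head (TLP). Every internal detail here, including channel widths, the range-image projection of the PCD, and the anchor/point output format, is an illustrative assumption rather than the authors' implementation.

```python
# Hypothetical sketch of the MFNet three-stage pipeline described above.
# Module names follow the abstract; all layer choices and tensor shapes
# are assumptions, not the published architecture.
import torch
import torch.nn as nn

class MDT(nn.Module):
    """Multihead transformation (assumed): encode each modality, sum the
    features, then apply multihead nonlocal attention over positions."""
    def __init__(self, channels=64, heads=4):
        super().__init__()
        # One lightweight encoder per modality (PCD is assumed to be
        # projected to a 3-channel range image beforehand).
        self.rgb_enc = nn.Conv2d(3, channels, 3, stride=2, padding=1)
        self.depth_enc = nn.Conv2d(1, channels, 3, stride=2, padding=1)
        self.pcd_enc = nn.Conv2d(3, channels, 3, stride=2, padding=1)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, rgb, depth, pcd):
        feats = self.rgb_enc(rgb) + self.depth_enc(depth) + self.pcd_enc(pcd)
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)     # (B, H*W, C)
        fused, _ = self.attn(tokens, tokens, tokens)  # nonlocal attention
        return fused.transpose(1, 2).reshape(b, c, h, w)

class MAFPF(nn.Module):
    """Feature pyramid fusion (assumed): top-down upsampling merges a
    coarser level back in, so lane cues survive as the network deepens."""
    def __init__(self, channels=64):
        super().__init__()
        self.down = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear",
                              align_corners=False)
        self.smooth = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        deep = self.down(x)                    # coarser pyramid level
        return self.smooth(x + self.up(deep))  # fuse multiscale features

class TLP(nn.Module):
    """Top-view lane prediction (assumed): regress (x, y, z) offsets for a
    fixed set of lane anchors from the fused feature map."""
    def __init__(self, channels=64, anchors=16, points=10):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, anchors * points * 3))
        self.anchors, self.points = anchors, points

    def forward(self, x):
        return self.head(x).view(-1, self.anchors, self.points, 3)

# Toy usage with random inputs in place of real sensor data.
rgb = torch.randn(1, 3, 128, 256)
depth = torch.randn(1, 1, 128, 256)
pcd = torch.randn(1, 3, 128, 256)
lanes = TLP()(MAFPF()(MDT()(rgb, depth, pcd)))
print(lanes.shape)  # torch.Size([1, 16, 10, 3]): anchors x points x (x,y,z)
```

The chained call at the end mirrors the First/Then/Finally ordering in the abstract: modality fusion, multiscale fusion, then 3-D lane regression.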

Source: http://dx.doi.org/10.1109/TNNLS.2024.3398654
