Semantic segmentation is a key step in scene understanding for autonomous driving. Although deep learning has significantly improved segmentation accuracy, current high-quality models such as PSPNet and DeepLabV3 are inefficient, given their complex architectures and reliance on multi-scale inputs; this makes them difficult to apply in real-time or practical settings. On the other hand, existing real-time methods cannot yet produce satisfactory results on small objects such as traffic lights, which are imperative to safe autonomous driving. In this paper, we improve real-time semantic segmentation from two perspectives: methodology and data. Specifically, we propose a real-time segmentation model coined Narrow Deep Network (NDNet) and build a synthetic dataset by inserting additional small objects into the training images. The proposed method achieves 65.7% mean intersection over union (mIoU) on the Cityscapes test set with only 8.4G floating-point operations (FLOPs) on 1024×2048 inputs. Furthermore, by re-training the existing PSPNet and DeepLabV3 models on our synthetic dataset, we obtain an average 2% mIoU improvement on small objects.
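The data-side idea described here, inserting small objects into training images, is essentially copy-paste augmentation: a small-object crop (e.g., a traffic light) is pasted into another image together with its label mask. The NumPy sketch below illustrates that idea only; the function name, argument layout, and random-placement policy are illustrative assumptions, not the authors' exact procedure.

    import numpy as np

    def paste_small_object(image, label, obj_rgb, obj_mask, obj_class_id, rng=None):
        # Illustrative copy-paste augmentation (assumed layout, not the paper's code).
        # image: (H, W, 3) array, label: (H, W) integer class map.
        # obj_rgb: (h, w, 3) object crop; obj_mask: (h, w) binary mask of the object.
        rng = rng if rng is not None else np.random.default_rng()
        H, W = label.shape
        h, w = obj_mask.shape
        # Pick a random top-left corner that keeps the crop fully inside the image.
        y = int(rng.integers(0, H - h + 1))
        x = int(rng.integers(0, W - w + 1))
        m = obj_mask.astype(bool)
        image[y:y+h, x:x+w][m] = obj_rgb[m]    # paste object pixels
        label[y:y+h, x:x+w][m] = obj_class_id  # write matching labels
        return image, label

A real pipeline would additionally constrain paste locations to plausible regions (e.g., traffic lights above the road surface); the sketch does not attempt that. The reported metric, mIoU, is the per-class intersection over union TP / (TP + FP + FN) averaged over all classes.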


Source
http://dx.doi.org/10.1109/TIP.2020.2976856

Publication Analysis

Top Keywords

semantic segmentation: 12
small objects: 12
real-time semantic: 8
autonomous driving: 8
pspnet deeplabv3: 8
synthetic dataset: 8
real-time: 5
segmentation: 5
small: 4
small object: 4

Similar Publications

Robust perception systems allow farm robots to recognize weeds and vegetation, enabling the selective application of fertilizers and herbicides to mitigate the environmental impact of traditional agricultural practices. Today's perception systems typically rely on deep learning to interpret sensor data for tasks such as distinguishing soil, crops, and weeds. These approaches usually require substantial amounts of manually labeled training data, which is often time-consuming and requires domain expertise.


Accurate segmentation of organs or lesions from medical images is essential for reliable disease diagnosis and organ morphometrics. Previously, most researchers improved the segmentation accuracy of medical images mainly by adding feature extraction modules to the U-Net network and simply aggregating the semantic features. However, these improved U-Net networks ignore the semantic differences between organs in medical images and lack fusion of high-level and low-level semantic features, which leads to blurred or missing boundaries between similar organs and diseased areas.
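For context, the high-/low-level fusion this passage says is missing is typically done by upsampling deep, semantically rich features to the resolution of shallow, detail-rich features and mixing the two. The PyTorch block below is a generic sketch of such a fusion step, not the module of any specific network discussed here.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FuseBlock(nn.Module):
        # Generic U-Net-style decoder step: fuse a low-resolution high-level
        # feature map ("high") with a high-resolution low-level one ("low").
        def __init__(self, high_ch, low_ch, out_ch):
            super().__init__()
            self.proj = nn.Conv2d(high_ch + low_ch, out_ch, kernel_size=1)

        def forward(self, high, low):
            # Upsample the semantic features to the spatial size of the
            # detail features, then mix the concatenation with a 1x1 conv.
            high = F.interpolate(high, size=low.shape[-2:],
                                 mode="bilinear", align_corners=False)
            return self.proj(torch.cat([high, low], dim=1))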


SAMP-Net: a medical image segmentation network with split attention and multi-layer perceptron.

Med Biol Eng Comput

March 2025

School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, No.15, Yongyuan Road, Huangcun Town, Daxing District, Beijing, 102616, China.

Convolutional neural networks (CNNs) have achieved remarkable success in computer vision, particularly in medical image segmentation. U-Net, a prominent architecture, marked a major breakthrough and remains widely used in practice. However, its uniform downsampling strategy and simple stacking of convolutional layers in the encoder limit its ability to capture rich features at multiple depths, reducing its efficiency for rapid image processing.
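To make the criticized pattern concrete, the sketch below shows a vanilla U-Net encoder stage: two stacked 3x3 convolutions followed by a uniform 2x max-pool downsample, repeated at every depth. It is a baseline illustration of the design the passage refers to, not SAMP-Net's encoder.

    import torch.nn as nn

    def unet_encoder_stage(in_ch, out_ch):
        # One vanilla U-Net encoder stage; stacking these with the same
        # uniform 2x downsample at every depth is the pattern said to
        # limit multi-depth feature capture.
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )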


At an abstract temporospatial level, object-directed actions can be described as sequences of touchings and untouchings of objects, hands, and the ground. These sparse action codes can effectively guide automated systems such as robots in recognizing and responding to human actions without the need for object identification. The aim of the current study was to investigate whether the neural processing of actions and their behavioral classification rely on the action categorization derived from this touching-untouching structure.
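Such a sparse action code can be represented as an ordered list of contact-change events. The Python sketch below is one hypothetical encoding; the class, field names, and example action are illustrative, not the study's actual coding scheme.

    from dataclasses import dataclass

    @dataclass
    class TouchEvent:
        # One element of a sparse action code: at time t, two entities
        # (object, hand, or ground) either make or break contact.
        t: float
        a: str
        b: str
        touching: bool  # True = touching begins, False = untouching

    # Hypothetical code for "pick up a cup from the table":
    pickup_cup = [
        TouchEvent(0.0, "hand", "cup", True),     # hand grasps cup
        TouchEvent(0.4, "cup", "ground", False),  # cup leaves the surface
    ]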


Automatic cleaning of carbon blocks based on machine vision is currently an important intelligent industrial application. Recognizing carbon block types and localizing their center points are the core of this task, but existing instance segmentation algorithms perform poorly at it. This paper proposes an industrial carbon block instance segmentation algorithm based on an improved YOLOv8 (YOLOv8-HDSA), which achieves highly accurate recognition of carbon block types and edge segmentation.
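Given an instance mask, a center point can be obtained generically as the centroid of the mask's foreground pixels, as sketched below. This is a standard baseline, not necessarily how YOLOv8-HDSA derives its center points.

    import numpy as np

    def mask_center(mask):
        # Center of one instance mask as the centroid of its foreground
        # pixels; assumes the mask is non-empty. Returns (row, col).
        ys, xs = np.nonzero(mask)
        return float(ys.mean()), float(xs.mean())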
