Semantic segmentation is a key step in scene understanding for autonomous driving. Although deep learning has significantly improved segmentation accuracy, current high-quality models such as PSPNet and DeepLabV3 are inefficient given their complex architectures and reliance on multi-scale inputs. Thus, it is difficult to apply them to real-time or practical applications. On the other hand, existing real-time methods cannot yet produce satisfactory results on small objects such as traffic lights, which are imperative to safe autonomous driving. In this paper, we improve the performance of real-time semantic segmentation from two perspectives: methodology and data. Specifically, we propose a real-time segmentation model coined Narrow Deep Network (NDNet) and build a synthetic dataset by inserting additional small objects into the training images. The proposed method achieves 65.7% mean intersection over union (mIoU) on the Cityscapes test set with only 8.4G floating-point operations (FLOPs) on 1024×2048 inputs. Furthermore, by re-training the existing PSPNet and DeepLabV3 models on our synthetic dataset, we obtain an average 2% mIoU improvement on small objects.
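The two quantitative elements of this abstract, the copy-paste style synthetic data and the mIoU metric, are easy to make concrete. Below is a minimal NumPy sketch; the function names and array conventions are illustrative assumptions, not code from the paper.

```python
import numpy as np

def paste_small_object(image, label, obj_rgb, obj_mask, obj_class, top, left):
    """Copy-paste a small-object crop (e.g., a traffic light) into a
    training image and its ground-truth label map at (top, left).

    image:    H x W x 3 uint8 array
    label:    H x W     int   array of per-pixel class ids
    obj_rgb:  h x w x 3 uint8 crop of the object
    obj_mask: h x w     bool  foreground mask of the crop
    """
    h, w = obj_mask.shape
    img_roi = image[top:top + h, left:left + w]
    lbl_roi = label[top:top + h, left:left + w]
    img_roi[obj_mask] = obj_rgb[obj_mask]  # overwrite only masked pixels
    lbl_roi[obj_mask] = obj_class          # keep the label map consistent
    return image, label

def mean_iou(pred, gt, num_classes):
    """Mean intersection over union, the metric the abstract reports."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```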
DOI: http://dx.doi.org/10.1109/TIP.2020.2976856
Front Robot AI
February 2025
Center for Robotics, University of Bonn, Bonn, Germany.
Robust perception systems allow farm robots to recognize weeds and vegetation, enabling the selective application of fertilizers and herbicides to mitigate the environmental impact of traditional agricultural practices. Today's perception systems typically rely on deep learning to interpret sensor data for tasks such as distinguishing soil, crops, and weeds. These approaches usually require substantial amounts of manually labeled training data, which is time-consuming to produce and requires domain expertise.
Sci Rep
March 2025
College of Pharmacy, Qiqihar Medical University, Qiqihar, 161003, China.
Accurate segmentation of organs or lesions from medical images is essential for reliable disease diagnosis and organ morphometrics. Previously, most researchers improved segmentation accuracy mainly by adding feature extraction modules to the U-Net network and simply aggregating semantic features. However, these improved U-Net networks ignore the semantic differences between organs in medical images and lack fusion of high-level and low-level semantic features, which leads to blurred or missing boundaries between similar organs and diseased areas.
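The high-/low-level fusion this abstract says is missing is commonly implemented as a decoder step that upsamples the deep semantic features and concatenates them with the encoder's skip features. A generic PyTorch sketch of such a block follows; it illustrates the idea, not the authors' module.

```python
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Generic U-Net decoder step: upsample the high-level (deep) feature
    map and fuse it with the low-level (skip) feature map from the encoder."""
    def __init__(self, high_ch, low_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(high_ch, out_ch, kernel_size=2, stride=2)
        self.fuse = nn.Sequential(
            nn.Conv2d(out_ch + low_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, high, low):
        high = self.up(high)               # recover spatial resolution
        x = torch.cat([high, low], dim=1)  # fuse semantics with fine detail
        return self.fuse(x)

# e.g., fuse a 1/16-resolution deep map with a 1/8-resolution skip map:
# out = FusionBlock(high_ch=256, low_ch=128, out_ch=128)(deep_feat, skip_feat)
```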
Med Biol Eng Comput
March 2025
School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, No.15, Yongyuan Road, Huangcun Town, Daxing District, Beijing, 102616, China.
Convolutional neural networks (CNNs) have achieved remarkable success in computer vision, particularly in medical image segmentation. U-Net, a prominent architecture, marked a major breakthrough and remains widely used in practice. However, its uniform downsampling strategy and simple stacking of convolutional layers in the encoder limit its ability to capture rich features at multiple depths, reducing its efficiency for rapid image processing.
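To make the critique concrete, the sketch below traces feature-map shapes through a plain U-Net-style encoder in PyTorch: every stage applies the same 2× pooling, so fine detail is discarded at a fixed rate regardless of depth. This is a generic illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

def stage(c_in, c_out):
    """One plain encoder stage: stacked 3x3 convs + uniform 2x max-pool."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(2),                  # same downsampling at every depth
    )

x = torch.randn(1, 3, 256, 256)
for c_in, c_out in [(3, 64), (64, 128), (128, 256), (256, 512)]:
    x = stage(c_in, c_out)(x)
    print(tuple(x.shape))
# (1, 64, 128, 128) -> (1, 128, 64, 64) -> (1, 256, 32, 32) -> (1, 512, 16, 16):
# spatial detail halves at a fixed rate, whatever the feature depth.
```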
Neuroimage
March 2025
Department of Psychology, University of Münster, Germany; Otto Creutzfeldt Center for Cognitive and Behavioral Neuroscience, University of Münster, Germany.
At an abstract temporospatial level, object-directed actions can be described as sequences of touchings and untouchings of objects, hands, and the ground. These sparse action codes can effectively guide automated systems like robots in recognizing and responding to human actions without the need for object identification. The aim of the current study was to investigate whether the neural processing of actions and their behavioral classification rely on the action categorization derived from the touching-untouching structure.
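The "sparse action codes" lend themselves to a compact data structure: a sequence of contact events between anonymized entities. A hypothetical Python sketch follows; the entity and function names are illustrative, not from the study.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ContactEvent:
    kind: str   # "touch" or "untouch"
    a: str      # first entity (object, hand, or ground)
    b: str      # second entity

# "Picking up a cup" as a sequence of touchings and untouchings:
pick_up = [
    ContactEvent("touch",   "hand", "cup"),    # hand grasps the cup
    ContactEvent("untouch", "cup",  "ground"), # cup leaves the surface
]

def anonymized(seq: List[ContactEvent]) -> List[Tuple[str, int, int]]:
    """Replace entity names with ids in order of first appearance, so two
    actions can be compared structurally without identifying any object."""
    ids = {}
    out = []
    for e in seq:
        for name in (e.a, e.b):
            ids.setdefault(name, len(ids))
        out.append((e.kind, ids[e.a], ids[e.b]))
    return out

def same_action_category(s1, s2) -> bool:
    return anonymized(s1) == anonymized(s2)
```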
Sci Rep
March 2025
Chinese Academy of Fishery Sciences, Yellow Sea Fisheries Research Institute, Qingdao, 266071, China.
Automatic cleaning of carbon blocks based on machine vision is an important aspect of intelligent industrial applications. Recognizing carbon block types and localizing their center points are the core of this task, but existing instance segmentation algorithms perform poorly on it. This paper proposes an industrial carbon block instance segmentation algorithm based on an improved YOLOv8 (YOLOv8-HDSA), which achieves highly accurate recognition of carbon block types and edge segmentation.
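Center-point localization from an instance mask is typically a small post-processing step on top of the segmentation output. A minimal NumPy sketch is below; this is generic post-processing, not the paper's algorithm.

```python
import numpy as np

def mask_centroid(mask):
    """Center-point localization from a binary instance mask, e.g., for
    positioning a cleaning tool over a carbon block. `mask` is an H x W
    boolean array produced by an instance-segmentation model."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None                              # empty mask: no detection
    return float(xs.mean()), float(ys.mean())    # (x, y) in pixel coordinates
```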