Optimal Configuration of Multi-Task Learning for Autonomous Driving.

Woomin Jun, Minjun Son, Jisang Yoo, Sungjin Lee

Sensors (Basel)

Electronic Engineering, Dong Seoul University, Seongnam 13117, Republic of Korea.

Published: December 2023

For autonomous driving, it is imperative to perform various high-computation image recognition tasks with high accuracy, utilizing diverse sensors to perceive the surrounding environment. Specifically, cameras are used to perform lane detection, object detection, and segmentation, and, in the absence of lidar, tasks extend to inferring 3D information through depth estimation, 3D object detection, 3D reconstruction, and SLAM. However, accurately processing all these image recognition operations in real time for autonomous driving under constrained hardware conditions is practically infeasible. In this study, considering the characteristics of image recognition tasks performed by these sensors and the given hardware conditions, we investigated MTL (multi-task learning), which enables parallel execution of various image recognition tasks to maximize their processing speed, accuracy, and memory efficiency. In particular, this study analyzes the combinations of image recognition tasks for autonomous driving and proposes the MDO (multi-task decision and optimization) algorithm, consisting of three steps, as a means for optimization. In the first step, an MTS (multi-task set) is selected to minimize overall latency while meeting minimum accuracy requirements. Subsequently, additional training of the shared backbone and individual subnets is conducted to enhance accuracy with the predefined MTS. Finally, both the shared backbone and each subnet undergo compression while maintaining the already secured accuracy and latency performance. The experimental results indicate that integrated accuracy performance is critically important in the configuration and optimization of MTL, and this integrated accuracy is determined by the ITC (inter-task correlation). The MDO algorithm was designed to consider these characteristics and construct multi-task sets with tasks that exhibit high ITC. Furthermore, the implementation of the proposed MDO algorithm, coupled with additional SSL (semi-supervised learning) based training, resulted in a significant performance enhancement: an approximately 12% increase in object detection mAP, a 15% improvement in lane detection accuracy, and a 27% reduction in latency, surpassing the results of previous three-task learning techniques such as YOLOP and HybridNet.
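
The first MDO step (choosing a multi-task set that minimizes latency subject to per-task accuracy floors) can be illustrated with a small brute-force search. The sketch below is not the authors' implementation, since the abstract does not specify their cost or accuracy model. It assumes a toy model in which each group of tasks pays for one shared backbone plus one head per task, and a task loses accuracy in proportion to how weakly it correlates with the other tasks in its group. All names (`select_mts`, `task_accuracy`, `penalty`) and all numbers are hypothetical.

```python
def partitions(items):
    """Yield every way to split `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Either `first` forms its own group ...
        yield [[first]] + part
        # ... or it joins one of the existing groups.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def task_accuracy(task, group, base_acc, itc, penalty):
    """Toy model (assumption): accuracy drops as mean ITC with group mates falls."""
    others = [t for t in group if t != task]
    if not others:
        return base_acc[task]
    mean_itc = sum(itc[frozenset((task, o))] for o in others) / len(others)
    return base_acc[task] - penalty * (1.0 - mean_itc)


def select_mts(tasks, base_acc, min_acc, itc, backbone_ms, head_ms, penalty=0.2):
    """Return (latency_ms, groups) for the lowest-latency feasible grouping, or None."""
    best = None
    for part in partitions(list(tasks)):
        feasible = all(
            task_accuracy(t, g, base_acc, itc, penalty) >= min_acc[t]
            for g in part
            for t in g
        )
        if not feasible:
            continue
        # One shared backbone per group, plus one head per task in the group.
        latency = sum(backbone_ms + sum(head_ms[t] for t in g) for g in part)
        if best is None or latency < best[0]:
            best = (latency, part)
    return best


# Hypothetical inputs, chosen only to exercise the search.
tasks = ["object", "lane", "depth"]
base_acc = {"object": 0.80, "lane": 0.75, "depth": 0.70}
min_acc = {"object": 0.75, "lane": 0.70, "depth": 0.68}
itc = {
    frozenset(("object", "lane")): 0.9,
    frozenset(("object", "depth")): 0.3,
    frozenset(("lane", "depth")): 0.4,
}
head_ms = {"object": 3, "lane": 2, "depth": 4}

latency, groups = select_mts(tasks, base_acc, min_acc, itc, backbone_ms=10, head_ms=head_ms)
print(latency, groups)
```

With these made-up numbers, putting all three tasks on one backbone violates the accuracy floor for object detection, so the search settles on sharing a backbone between the two highly correlated tasks (object and lane detection) and running depth estimation separately. This mirrors the abstract's point that task sets should be built from tasks with high ITC. Exhaustive enumeration is only practical for a handful of tasks; a larger task pool would need a heuristic search.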

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10747906
DOI: http://dx.doi.org/10.3390/s23249729

Top keywords: image recognition, autonomous driving, recognition tasks, object detection, multi-task learning, accuracy, lane detection, hardware conditions, shared backbone, integrated accuracy
