IV-YOLO: A Lightweight Dual-Branch Object Detection Network.

Dan Tian, Xin Yan, Dong Zhou, Chen Wang, Wenshuai Zhang

Sensors (Basel)

Institute of Electronic Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.

Published: September 2024

With the rapid growth in demand for security surveillance, assisted driving, and remote sensing, object detection networks with robust environmental perception and high detection accuracy have become a research focus. However, single-modality image detection adapts poorly to changing environments: lighting conditions, fog, rain, and obstacles such as vegetation cause information loss and reduce detection accuracy. To address these challenges, we propose IV-YOLO, an object detection network that integrates features from visible light and infrared images. The network is based on YOLOv8 (You Only Look Once v8) and employs a dual-branch fusion structure that leverages the complementary features of infrared and visible light images for target detection. We designed a Bidirectional Pyramid Feature Fusion structure (Bi-Fusion) to effectively integrate multimodal features, reducing errors from feature redundancy and extracting fine-grained features for small object detection. Additionally, we developed a Shuffle-SPP structure that combines channel and spatial attention to enhance the focus on deep features and extract richer information through upsampling. For model optimization, we designed a loss function tailored for multi-scale object detection, which accelerates convergence during training. Compared with the current state-of-the-art Dual-YOLO model, IV-YOLO achieves mAP improvements of 2.8%, 1.1%, and 2.2% on the Drone Vehicle, FLIR, and KAIST datasets, respectively. On the Drone Vehicle and FLIR datasets, IV-YOLO has a parameter count of 4.31 M and achieves a frame rate of 203.2 fps, significantly outperforming YOLOv8n (5.92 M parameters, 188.6 fps on the Drone Vehicle dataset) and YOLO-FIR (7.1 M parameters, 83.3 fps on the FLIR dataset), which had previously achieved the best performance on these datasets. This demonstrates that IV-YOLO achieves higher real-time detection performance while maintaining lower parameter complexity, making it highly promising for applications in autonomous driving, public safety, and beyond.
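
The efficiency comparison in the abstract can be made concrete with a little arithmetic. The sketch below uses only the parameter counts and frame rates quoted above. The helper names `compare` and `frame_latency_ms` are illustrative and do not come from the paper, and the per-frame latency assumes the reported fps is a plain per-image throughput, which the abstract does not specify.

```python
# Figures quoted in the abstract: (parameters in millions, frames per second).
IV_YOLO = (4.31, 203.2)
BASELINES = {
    "YOLOv8n (Drone Vehicle)": (5.92, 188.6),
    "YOLO-FIR (FLIR)": (7.1, 83.3),
}


def compare(model, baseline):
    """Return (parameter reduction in %, fps speedup factor) of model vs. baseline."""
    params, fps = model
    base_params, base_fps = baseline
    param_reduction = 100.0 * (base_params - params) / base_params
    speedup = fps / base_fps
    return param_reduction, speedup


def frame_latency_ms(fps):
    """Average time per frame in milliseconds at a given frame rate (assumed per-image)."""
    return 1000.0 / fps


if __name__ == "__main__":
    for name, baseline in BASELINES.items():
        reduction, speedup = compare(IV_YOLO, baseline)
        print(f"vs {name}: {reduction:.1f}% fewer parameters, {speedup:.2f}x fps")
    print(f"IV-YOLO latency: {frame_latency_ms(IV_YOLO[1]):.2f} ms per frame")
```

On these figures, IV-YOLO has roughly 27% fewer parameters than YOLOv8n at about 1.08 times the frame rate, and roughly 39% fewer parameters than YOLO-FIR at about 2.4 times the frame rate. The two baselines were measured on different datasets, so the ratios should be read per dataset rather than compared with each other.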

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11478367
DOI: http://dx.doi.org/10.3390/s24196181

Similar Publications

A small object detection model in aerial images based on CPDD-YOLOv8.

Sci Rep

January 2025

School of Cyberspace Security, Hebei University of Engineering Science, Shijiazhuang, 050091, China.

Jingyang Wang, Jiayao Gao, Bo Zhang

Aerial images can cover a wide area and capture rich scene information. These images are often taken from a high altitude and contain many small objects. It is difficult to detect small objects accurately because their features are not obvious and are susceptible to background interference.

Pointer meters recognition method in the wild based on innovative deep learning techniques.

Sci Rep

January 2025

College of Computer and Data Science, Minjiang University, Fuzhou, 350018, China.

Jiajun Feng, Haibo Luo, Rui Ming

This study presents a novel approach to identifying meters and their pointers in modern industrial scenarios using deep learning. We developed a neural network model that can detect gauges and one or more of their pointers on low-quality images. We use an encoder network, jump connections, and a modified Convolutional Block Attention Module (CBAM) to detect gauge panels and pointer keypoints in images.

Authentication of glass beads from Cultural Heritage: An interdisciplinary and multi-analytical approach.

Talanta

January 2025

Instituto de Historia (IH-CCHS), CSIC, C/ Albasanz 26-28, 28037, Madrid, Spain.

L Maestro-Guijarro, A Pinilla, P M Carmona-Quiroga, F Agua, M Castillejo

Analysis of glass-based artworks is important for authentication purposes. In recent years, there have been rapid advancements and improvements in the characterization of glass objects using different analytical approaches. The present study presents an interdisciplinary and multi-analytical authentication approach that provides useful tools and markers to unmask possible imitations, counterfeiting, and forgeries in Cultural Heritage glass beads by comparing the composition of historical and modern glass beads.

MPIC: Exploring alternative approach to standard convolution in deep neural networks.

Neural Netw

December 2024

Institute of Automation, Chinese Academy of Sciences, MAIS, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 101408, China.

Jie Jiang, Yi Zhong, Ruoli Yang, Weize Quan, Dong-Ming Yan

In the rapidly evolving field of deep learning, Convolutional Neural Networks (CNNs) retain their unique strengths and applicability in processing grid-structured data such as images, despite the surge of Transformer architectures. This paper explores alternatives to the standard convolution, with the objective of augmenting its feature extraction prowess while maintaining a similar parameter count. We propose innovative solutions targeting depthwise separable convolution and standard convolution, culminating in our Multi-scale Progressive Inference Convolution (MPIC).

Effect of zoledronic acid on biological characteristics of cervical cancer cells.

Afr J Reprod Health

November 2024

Department of Obstetrics and Gynecology, Wuxi No.2 People's Hospital, Wuxi 214002, Jiangsu Province, China.

Ling Qin, Xuqun Ding

Cervical cancer (CC) is a malignant tumor in females characterized by high incidence and mortality rates, often resulting in a poor prognosis for patients. Zoledronic acid (ZA), a third-generation bisphosphonate, exhibits anti-tumor properties across various types of tumors. To further understand the effect of ZA in the treatment of CC, this study used two types of human CC cells (CCCs) as research subjects and examined the impact of varying levels of ZA on the cells' biological properties.
