IV-YOLO: A Lightweight Dual-Branch Object Detection Network.

Sensors (Basel)

Institute of Electronic Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China.

Published: September 2024

With the rapid growth in demand for security surveillance, assisted driving, and remote sensing, object detection networks with robust environmental perception and high detection accuracy have become a research focus. However, single-modality image detection technologies face limitations in environmental adaptability: lighting conditions, fog, rain, and obstacles such as vegetation often cause information loss and reduce detection accuracy. To address these challenges, we propose IV-YOLO, an object detection network that integrates features from visible light and infrared images. The network is based on YOLOv8 (You Only Look Once v8) and employs a dual-branch fusion structure that leverages the complementary features of infrared and visible light images for target detection. We designed a Bidirectional Pyramid Feature Fusion structure (Bi-Fusion) to effectively integrate multimodal features, reducing errors from feature redundancy and extracting fine-grained features for small object detection. Additionally, we developed a Shuffle-SPP structure that combines channel and spatial attention to enhance the focus on deep features and extract richer information through upsampling. For model optimization, we designed a loss function tailored for multi-scale object detection, which accelerates network convergence during training. Compared with the current state-of-the-art Dual-YOLO model, IV-YOLO achieves mAP improvements of 2.8%, 1.1%, and 2.2% on the Drone Vehicle, FLIR, and KAIST datasets, respectively. On the Drone Vehicle and FLIR datasets, IV-YOLO has a parameter count of 4.31 M and achieves a frame rate of 203.2 fps, significantly outperforming YOLOv8n (5.92 M parameters, 188.6 fps on the Drone Vehicle dataset) and YOLO-FIR (7.1 M parameters, 83.3 fps on the FLIR dataset), which had previously achieved the best performance on these datasets.
This demonstrates that IV-YOLO achieves higher real-time detection performance while maintaining lower parameter complexity, making it highly promising for applications in autonomous driving, public safety, and beyond.
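
To put the efficiency figures quoted in the abstract into relative terms, the short Python sketch below computes the parameter reduction and frame-rate ratio of IV-YOLO against each baseline. The numbers are taken directly from the abstract; the function and variable names are illustrative assumptions, not part of the paper.

```python
# Figures quoted in the abstract (parameters in millions, speed in fps).
# All names below are hypothetical helpers for illustration only.
IV_YOLO = {"params_m": 4.31, "fps": 203.2}
BASELINES = {
    "YOLOv8n (Drone Vehicle)": {"params_m": 5.92, "fps": 188.6},
    "YOLO-FIR (FLIR)": {"params_m": 7.1, "fps": 83.3},
}


def param_reduction_pct(ours: float, baseline: float) -> float:
    """Percentage by which `ours` is smaller than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


def speedup(ours: float, baseline: float) -> float:
    """Ratio of our frame rate to the baseline frame rate."""
    return ours / baseline


def compare(ours=IV_YOLO, baselines=BASELINES):
    """Return {baseline name: (parameter reduction in %, fps ratio)}."""
    rows = {}
    for name, base in baselines.items():
        rows[name] = (
            param_reduction_pct(ours["params_m"], base["params_m"]),
            speedup(ours["fps"], base["fps"]),
        )
    return rows


if __name__ == "__main__":
    for name, (reduction, ratio) in compare().items():
        print(f"{name}: {reduction:.1f}% fewer parameters, {ratio:.2f}x frame rate")
```

Under these figures, IV-YOLO has roughly 27% fewer parameters than YOLOv8n and roughly 39% fewer than YOLO-FIR, with frame rates about 1.08 and 2.44 times higher, respectively.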

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11478367
DOI: http://dx.doi.org/10.3390/s24196181

Similar Publications

Aerial images can cover a wide area and capture rich scene information. These images are often taken from a high altitude and contain many small objects. It is difficult to detect small objects accurately because their features are not obvious and are susceptible to background interference.

This study presents a novel approach to identifying meters and their pointers in modern industrial scenarios using deep learning. We developed a neural network model that can detect gauges and one or more of their pointers in low-quality images. We use an encoder network, skip connections, and a modified Convolutional Block Attention Module (CBAM) to detect gauge panels and pointer keypoints in images.

Authentication of glass beads from Cultural Heritage: An interdisciplinary and multi-analytical approach.

Talanta

January 2025

Instituto de Historia (IH-CCHS), CSIC, C/ Albasanz 26-28, 28037, Madrid, Spain.

Analysis of glass-based artworks is important for authentication purposes. In recent years, there have been rapid advancements and improvements in the characterization of glass objects using different analytical approaches. The present study presents an interdisciplinary and multi-analytical authentication approach that provides useful tools and markers to unmask possible imitations, counterfeiting, and forgeries in Cultural Heritage glass beads by comparing the composition of historical and modern glass beads.

MPIC: Exploring alternative approach to standard convolution in deep neural networks.

Neural Netw

December 2024

Institute of Automation, Chinese Academy of Sciences, MAIS, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 101408, China.

In the rapidly evolving field of deep learning, Convolutional Neural Networks (CNNs) retain their unique strengths and applicability in processing grid-structured data such as images, despite the surge of Transformer architectures. This paper explores alternatives to the standard convolution, with the objective of augmenting its feature extraction prowess while maintaining a similar parameter count. We propose innovative solutions targeting depthwise separable convolution and standard convolution, culminating in our Multi-scale Progressive Inference Convolution (MPIC).
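
For context on the parameter-count comparison mentioned above, the sketch below applies the standard textbook formulas for the weights of a standard convolution and a depthwise separable convolution (bias terms ignored). It does not reproduce MPIC itself, and the function names and example layer sizes are assumptions chosen for illustration.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution (no bias)."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
    k, c_in, c_out = 3, 64, 128
    std = standard_conv_params(k, c_in, c_out)
    dws = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, depthwise separable: {dws}, ratio: {dws / std:.3f}")
```

For this hypothetical layer the depthwise separable variant uses about 12% of the weights of the standard convolution, which illustrates why the parameter budget matters when proposing a replacement for either operator.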

Effect of zoledronic acid on biological characteristics of cervical cancer cells.

Afr J Reprod Health

November 2024

Department of Obstetrics and Gynecology, Wuxi No.2 People's Hospital, Wuxi 214002, Jiangsu Province, China.

Cervical cancer (CC) is a malignant tumor in females characterized by high incidence and mortality rates, often resulting in a poor prognosis for patients. Zoledronic acid (ZA), a third-generation bisphosphonate, exhibits anti-tumor properties across various types of tumors. To further understand the effect of ZA in the treatment of CC, this study used two types of human CC cells (CCCs) as research subjects and examined the impact of varying levels of ZA on the cells' biological properties.
