Channel Exchanging for RGB-T Tracking.

Sensors (Basel)

College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China.

Published: August 2021

All-weather visual object tracking in open environments is difficult to achieve with single-modality input alone. Because RGB and thermal infrared (TIR) data are complementary across a variety of complex environments, a more robust object tracking framework can be built from video of both modalities. The method used to fuse RGB and TIR data largely determines the performance of an RGB-T tracker, and existing RGB-T trackers have not solved this problem well. To address the low utilization of information within a single modality in aggregation-based methods, and between the two modalities in alignment-based methods, we used DiMP as the baseline tracker and designed channel exchanging DiMP (CEDiMP), an RGB-T object tracking framework based on channel exchanging. During feature fusion, CEDiMP dynamically exchanges channels between the sub-networks of the two modalities while adding almost no parameters, and the deep features produced by this channel-exchanging fusion are more expressive. In addition, to address the poor generalization and weak long-term tracking ability of existing RGB-T trackers, we further train CEDiMP on the synthetic dataset LaSOT-RGBT. Extensive experiments demonstrate the effectiveness of the proposed model: CEDiMP achieves the best performance on two RGB-T object tracking benchmarks, GTOT and RGBT234, and performs outstandingly in generalization testing.
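A minimal sketch of the channel-exchanging idea described above (not the paper's exact implementation): in channel-exchanging networks, a channel whose batch-norm scaling factor is close to zero is treated as uninformative and is replaced by the corresponding channel from the other modality's sub-network. The function name, array shapes, and threshold below are illustrative assumptions.

```python
import numpy as np

def channel_exchange(feat_rgb, feat_tir, bn_scale_rgb, bn_scale_tir, threshold=0.02):
    """Exchange low-importance channels between two modality sub-networks.

    feat_rgb, feat_tir : (C, H, W) feature maps from the RGB and TIR branches.
    bn_scale_rgb, bn_scale_tir : (C,) batch-norm gamma values; a small gamma
        marks a channel as carrying little information for that modality.
    """
    out_rgb = feat_rgb.copy()
    out_tir = feat_tir.copy()
    # Channels each branch borrows from the other modality.
    swap_rgb = bn_scale_rgb < threshold
    swap_tir = bn_scale_tir < threshold
    out_rgb[swap_rgb] = feat_tir[swap_rgb]
    out_tir[swap_tir] = feat_rgb[swap_tir]
    return out_rgb, out_tir
```

Because the exchange is a parameter-free copy gated by existing batch-norm scales, it adds essentially no parameters to the fusion step, which matches the abstract's claim.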

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8434326
DOI: http://dx.doi.org/10.3390/s21175800

Publication Analysis

Top Keywords

object tracking (28); channel exchanging (20); rgb-t object (16); tracking (8); single modality (8); tir data (8); tracking framework (8); performance rgb-t (8); existing rgb-t (8); order solve (8)

Similar Publications

Real-time motion trajectory training and prediction using reservoir computing for intelligent sensing equipment.

Rev Sci Instrum

January 2025

Shanxi Key Laboratory of Intelligent Detection Technology and Equipment, School of Information and Communication Engineering, North University of China, Taiyuan 030051, Shanxi, China.

Real-time moving target trajectory prediction is highly valuable in applications such as automatic driving, target tracking, and motion prediction. This paper examines the projection of the three-dimensional random motion of an object in space onto a sensing plane as an illustrative example. Historical running trajectory data are used to train a reservoir network.
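A minimal sketch of the reservoir-computing approach mentioned above, under the usual echo-state-network formulation: a fixed random recurrent reservoir is driven by the trajectory samples, and only a linear readout is trained (here by ridge regression) to predict the next value. All names and hyperparameters are illustrative, not taken from the paper.

```python
import numpy as np

def esn_predict(series, n_reservoir=100, spectral_radius=0.9, ridge=1e-6, seed=0):
    """Train a linear readout on echo-state reservoir states and return a
    one-step-ahead forecast for the point following the end of `series`."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, (n_reservoir, 1))       # fixed input weights
    W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
    # Rescale the recurrent weights for the echo-state property.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    # Drive the reservoir with the input series and record its states.
    x = np.zeros(n_reservoir)
    states = []
    for u in series:
        x = np.tanh(W_in[:, 0] * u + W @ x)
        states.append(x.copy())
    X = np.array(states[:-1])            # states after inputs series[:-1]
    y = np.asarray(series[1:])           # next-step targets
    # Ridge-regression readout: the only trained component.
    w_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_reservoir), X.T @ y)
    return states[-1] @ w_out            # forecast one step past the series
```

Because the reservoir itself is never trained, fitting reduces to a single linear solve, which is what makes this family of models attractive for real-time use on sensing equipment.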


East Asians are more likely than North Americans to attend to visual scenes holistically, focusing on the relations between objects and their background rather than isolating components. This cultural difference in context sensitivity (greater attentional allocation to the background of an image or scene) has been attributed to socialization, yet it is unknown how early in development it appears, and whether it is moderated by social information. We employed eye-tracking to investigate context sensitivity in 15-month-olds in Japan (n = 45) and the United States (n = 52).


This paper proposes a new strategy for analysing and detecting abnormal passenger behavior and abnormal objects on buses. First, a library of abnormal passenger behaviors and objects on buses is established. Then, a new mask detection and abnormal object detection and analysis (MD-AODA) algorithm is proposed.


The Segment Anything model (SAM) is a powerful vision foundation model that is revolutionizing the traditional paradigm of segmentation. Despite this, a reliance on prompting each frame and large computational cost limit its usage in robotically assisted surgery. Applications, such as augmented reality guidance, require little user intervention along with efficient inference to be usable clinically.


MEVDT: Multi-modal event-based vehicle detection and tracking dataset.

Data Brief

February 2025

Department of Electrical and Computer Engineering, University of Michigan-Dearborn, 4901 Evergreen Rd, Dearborn, 48128 MI, USA.

In this data article, we introduce the Multi-Modal Event-based Vehicle Detection and Tracking (MEVDT) dataset. This dataset provides a synchronized stream of event data and grayscale images of traffic scenes, captured using the Dynamic and Active-Pixel Vision Sensor (DAVIS) 240c hybrid event-based camera. MEVDT comprises 63 multi-modal sequences with approximately 13k images, 5M events, 10k object labels, and 85 unique object tracking trajectories.

