Visual Tracking Using Sparse Coding and Earth Mover's Distance.

Front Robot AI

Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, United States.

Published: August 2018

An efficient iterative Earth Mover's Distance (iEMD) algorithm for visual tracking is proposed in this paper. The Earth Mover's Distance (EMD) is used as the similarity measure to search for the optimal template candidates in feature-spatial space in a video sequence. The local sparse representation is used as the appearance model for the iEMD tracker. The maximum-alignment-pooling method is used for constructing a sparse coding histogram which reduces the computational complexity of the EMD optimization. The template update algorithm based on the EMD is also presented. When the camera is mounted on a moving robot, e.g., a flying quadcopter, the camera could experience a sudden and rapid motion leading to large inter-frame movements. To ensure that the tracking algorithm converges, a gyro-aided extension of the iEMD tracker is presented, where synchronized gyroscope information is utilized to compensate for the rotation of the camera. The iEMD algorithm's performance is evaluated using eight publicly available videos from the CVPR 2013 dataset. The performance of the iEMD algorithm is compared with eight state-of-the-art tracking algorithms based on relative percentage overlap. Experimental results show that the iEMD algorithm performs robustly in the presence of illumination variation and deformation. The robustness of this algorithm for large inter-frame displacements is also illustrated.
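As a rough illustration of why the EMD is a useful similarity measure for histogram-based appearance models, the sketch below computes it for two one-dimensional histograms defined on the same bins. It is not the iEMD algorithm from the paper, which optimizes over feature-spatial space with sparse coding histograms. The function name `emd_1d`, the unit spacing between bins, and the toy histograms are assumptions made only for this example.

```python
from itertools import accumulate


def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms.

    Assumes both histograms share the same bins and that adjacent
    bins are one unit apart (an assumption of this sketch).
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")

    # Normalize so that both histograms carry unit mass.
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]

    # In 1-D with unit bin spacing, the EMD equals the L1 distance
    # between the two cumulative distributions.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Toy histograms: all mass in one bin, shifted by one or two bins.
template = [0.0, 1.0, 0.0, 0.0]
candidate_near = [0.0, 0.0, 1.0, 0.0]
candidate_far = [0.0, 0.0, 0.0, 1.0]

print(emd_1d(template, candidate_near))  # mass moved by one bin
print(emd_1d(template, candidate_far))   # mass moved by two bins
```

Unlike a bin-by-bin comparison, which would score both candidates as equally different from the template, the EMD grows with how far the mass has to move, so the nearer candidate is ranked as more similar.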


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805675
DOI: http://dx.doi.org/10.3389/frobt.2018.00095

Publication Analysis

Top Keywords

earth mover's: 12
mover's distance: 12
iemd algorithm: 12
visual tracking: 8
sparse coding: 8
iemd tracker: 8
large inter-frame: 8
iemd: 6
algorithm: 6
tracking sparse: 4

Similar Publications

Infrared array sensor-based fall detection and activity recognition systems have gained momentum as promising solutions for enhancing healthcare monitoring and safety in various environments. Unlike camera-based systems, which can be privacy-intrusive, IR array sensors offer a non-invasive, reliable approach for fall detection and activity recognition while preserving privacy. This work proposes a novel method to distinguish between normal motion and fall incidents by analyzing thermal patterns captured by infrared array sensors.


Objective: Functional magnetic resonance imaging data pose significant challenges due to their inherently noisy and complex nature, making traditional statistical models less effective in capturing predictive features. While deep learning models offer superior performance through their non-linear capabilities, they often lack transparency, reducing trust in their predictions. This study introduces the Time Reversal (TR) pretraining method to address these challenges.


Grade Inflation in Generative Models.

ArXiv

January 2025

Department of Pathology and the Division of Clinical Informatics, Department of Medicine, BIDMC and with Harvard Medical School, Boston, MA 02215.

Generative models hold great potential, but only if one can trust the evaluation of the data they generate. We show that many commonly used quality scores for comparing two-dimensional distributions of synthetic vs. ground-truth data give better results than they should, a phenomenon we call the "grade inflation problem."


Introduction: Musical instrument recognition is a critical component of music information retrieval (MIR), aimed at identifying and classifying instruments from audio recordings. This task poses significant challenges due to the complexity and variability of musical signals.

Methods: In this study, we employed convolutional neural networks (CNNs) to analyze the contributions of various spectrogram representations (STFT, Log-Mel, MFCC, Chroma, Spectral Contrast, and Tonnetz) to the classification of ten different musical instruments.


Clustering Cu-S based compounds using periodic table representation and compositional Wasserstein distance.

Sci Rep

December 2024

Key Laboratory of Computing Power Network and Information Security, Shandong Computer Science Center (National Supercomputing Center in Jinan), Ministry of Education, Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250013, Shandong, P. R. China.

Crystal structure similarity is useful for the chemical analysis of today's large materials databases and for data mining of new materials. Here we propose to use the two-dimensional Wasserstein distance (earth mover's distance) to measure the compositional similarity between different compounds, based on the periodic table representation of compositions. To demonstrate the effectiveness of our approach, 1586 Cu-S based compounds are taken from the Inorganic Crystal Structure Database (ICSD) to form a validation dataset.

