This paper describes compact image descriptors enabling accurate object categorization with linear classification models, which offer the advantage of being efficient to both train and test. The shared property of our descriptors is the use of classifiers to produce the features of each image. Intuitively, these classifiers evaluate the presence of a set of basis classes inside the image. We first propose to train the basis classifiers as recognizers of a hand-selected set of object classes. We then demonstrate that better accuracy can be achieved by learning the basis classes as "abstract categories" collectively optimized as features for linear classification. Finally, we describe several strategies to aggregate the outputs of basis classifiers evaluated on multiple subwindows of the image in order to handle cases when the photo contains multiple objects and large amounts of clutter. We test our descriptors on challenging benchmarks of object categorization and detection, using a simple linear SVM as classifier. Our results are on par with those achieved by the best systems in these fields but are produced at orders of magnitude lower computational costs and using an image representation that is general and not specifically tuned for a predefined set of test classes.
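The pipeline in the abstract (basis classifiers produce the features, outputs are aggregated over subwindows, a linear SVM classifies the result) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the basis classifiers are modeled as plain linear scorers with hypothetical weights `W` and `b`, max pooling stands in for one of the aggregation strategies the abstract mentions without naming, and a small hinge-loss subgradient loop stands in for a linear SVM solver.

```python
import numpy as np

def basis_scores(window_features, W, b):
    """Evaluate C basis classifiers on each subwindow.

    window_features: (n_windows, d) low-level features, one row per subwindow.
    W: (C, d) and b: (C,) are hypothetical linear basis-classifier parameters.
    Returns an (n_windows, C) array of classifier outputs.
    """
    return window_features @ W.T + b

def aggregate_max(scores):
    """Assumed aggregation: keep the strongest response of each basis classifier."""
    return scores.max(axis=0)

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the L2-regularised hinge loss; y must be in {-1, +1}."""
    w = np.zeros(X.shape[1])
    bias = 0.0
    n = len(y)
    for _ in range(epochs):
        active = y * (X @ w + bias) < 1          # samples violating the margin
        grad_w = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        bias -= lr * grad_b
    return w, bias

# Toy image with two subwindows and 2-D low-level features (made-up numbers).
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # three basis classifiers
b = np.zeros(3)
windows = np.array([[1.0, 2.0], [3.0, 0.0]])
descriptor = aggregate_max(basis_scores(windows, W, b))

# Toy training set of descriptors for one target class versus the rest.
X_train = np.array([[3.0, 2.0, 3.0], [2.0, 3.0, 3.0],
                    [-3.0, -2.0, -3.0], [-2.0, -3.0, -3.0]])
y_train = np.array([1.0, 1.0, -1.0, -1.0])
w, bias = train_linear_svm(X_train, y_train)

train_accuracy = np.mean(np.sign(X_train @ w + bias) == y_train)
prediction = np.sign(descriptor @ w + bias)
```

Because the descriptor has only as many entries as there are basis classifiers, both training and prediction reduce to a few dot products, which is the efficiency argument the abstract makes for linear models.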
Source (DOI): http://dx.doi.org/10.1109/TPAMI.2014.2313111
Sensors (Basel)
January 2025
NUS-ISS, National University of Singapore, Singapore 119615, Singapore.
Recognizing the action of plastic bag taking from CCTV video footage represents a highly specialized and niche challenge within the broader domain of action video classification. To address this challenge, our paper introduces a novel benchmark video dataset specifically curated for the task of identifying the action of grabbing a plastic bag. Additionally, we propose and evaluate three distinct baseline approaches.
Sensors (Basel)
January 2025
School of Mechanical and Electrical Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China.
Unsupervised Domain Adaptation for Object Detection (UDA-OD) aims to adapt a model trained on a labeled source domain to an unlabeled target domain, addressing challenges posed by domain shifts. However, existing methods often face significant challenges, particularly in detecting small objects and over-relying on classification confidence for pseudo-label selection, which often leads to inaccurate bounding box localization. To address these issues, we propose a novel UDA-OD framework that leverages scale consistency (SC) and Temporal Ensemble Pseudo-Label Selection (TEPLS) to enhance cross-domain robustness and detection performance.
Sensors (Basel)
January 2025
Departamento de Geografía, Facultad de Ciencias, Universidad de la República, Montevideo 4225, Uruguay.
Recent advancements in Earth Observation sensors, improved accessibility to imagery, and the development of corresponding processing tools have significantly empowered researchers to extract insights from Multisource Remote Sensing. This study aims to use these technologies for mapping summer and winter Land Use/Land Cover features in Cuenca de la Laguna Merín, Uruguay, while comparing the performance of Random Forests, Support Vector Machines, and Gradient-Boosting Tree classifiers. The materials include Sentinel-2, Sentinel-1 and Shuttle Radar Topography Mission imagery, Google Earth Engine, training and validation datasets, and the classifiers listed above.
Sensors (Basel)
January 2025
Cognitive Systems Lab, University of Bremen, 28359 Bremen, Germany.
In recent years, automated Human Activity Recognition (HAR) has attracted the attention of many researchers because of its widespread application in surveillance systems, healthcare environments, and other settings. This has led researchers to develop coherent and robust systems that perform HAR efficiently. Although many efficient systems have been developed to date, many issues remain to be addressed.
Sensors (Basel)
December 2024
Department of Core Informatics, Graduate School of Informatics, Osaka Metropolitan University, Osaka 558-8585, Japan.
Recently, the application of deep neural networks to anomaly detection in medical images has been hampered by noisy labels, including overlapping objects and similar classes. This study aims to address this challenge by proposing an attention module that helps deep neural networks focus on important object features in noisy medical image conditions. The module integrates global context modeling, which creates long-range dependencies, with local interactions that enable channel attention through 1D convolution. It performs well with noisy labels and consumes significantly fewer resources, without any dimensionality reduction.