We present an efficient foveal framework for object detection. A scale normalized image pyramid (SNIP) is generated that, like human vision, attends only to objects within a fixed size range at different scales. Restricting object sizes in this way during training affords better learning of object-sensitive filters and therefore results in better accuracy. However, the use of an image pyramid increases the computational cost. Hence, we propose an efficient spatial sub-sampling scheme that operates only on fixed-size sub-regions likely to contain objects (as object locations are known during training). The resulting approach, referred to as Scale Normalized Image Pyramid with Efficient Resampling, or SNIPER, yields up to a 3× speed-up during training. Unfortunately, as object locations are unknown during inference, the entire image pyramid still needs to be processed. To address this, we adopt a coarse-to-fine approach and predict the locations and extent of object-like regions, which are then processed at successive scales of the image pyramid. Intuitively, this is akin to active human vision, which first skims over the field of view to spot interesting regions for further processing and only recognizes objects at the right resolution. The resulting algorithm is referred to as AutoFocus and yields a 2.5-5× speed-up during inference when used with SNIP. Code: https://github.com/mahyarnajibi/SNIPER.
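
The scale-normalization idea in the abstract (each pyramid level attends only to objects within a fixed size range) can be illustrated with a minimal sketch. This is not the authors' implementation: the pyramid scales, the valid size ranges, and the function name `valid_boxes_per_scale` are hypothetical values chosen for illustration, and object size is assumed to be measured as the square root of the box area after resizing.

```python
from math import sqrt

# Hypothetical pyramid: (scale factor, (min_size, max_size)) where sizes are in
# pixels of the *resized* image. These numbers are illustrative assumptions,
# not values taken from the paper.
PYRAMID = [
    (0.5, (64.0, float("inf"))),  # coarse level: keep only large objects
    (1.0, (32.0, 128.0)),         # middle level: keep medium objects
    (2.0, (0.0, 64.0)),           # fine level: keep only small objects
]


def valid_boxes_per_scale(boxes, pyramid=PYRAMID):
    """Map each scale to the boxes whose resized size falls in its valid range.

    boxes: list of (x1, y1, x2, y2) tuples in original-image pixels.
    Size is measured as sqrt(width * height) after applying the scale factor.
    """
    result = {}
    for scale, (lo, hi) in pyramid:
        kept = []
        for (x1, y1, x2, y2) in boxes:
            size = sqrt((x2 - x1) * (y2 - y1)) * scale
            if lo <= size <= hi:
                kept.append((x1, y1, x2, y2))
        result[scale] = kept
    return result


if __name__ == "__main__":
    boxes = [(0, 0, 20, 20), (0, 0, 100, 100), (0, 0, 400, 400)]
    for scale, kept in valid_boxes_per_scale(boxes).items():
        print(scale, kept)
```

With these assumed ranges, a small box is used only at the upsampled level and a large box only at the downsampled level, so every level sees objects of roughly comparable apparent size.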

Source
http://dx.doi.org/10.1109/TPAMI.2021.3058945

Publication Analysis

Top Keywords

image pyramid: 20
scale normalized: 12
normalized image: 12
object detection: 8
object locations: 8
image: 6
pyramid: 5
image pyramids: 4
pyramids autofocus: 4
object: 4

Similar Publications

Fine-grained restoration of Mongolian patterns based on a multi-stage deep learning network.

Sci Rep

December 2024

College of Computer and Information Engineering, Inner Mongolia Agricultural University, Huhhot, 010000, Inner Mongolia, China.

Mongolian patterns are easily damaged by various factors in the process of inheritance and preservation, and traditional manual restoration methods are time-consuming, laborious, and costly. Despite the development of deep learning technology and the rapid growth of the image restoration field, existing image restoration methods are mostly aimed at natural scene images. They do not apply to Mongolian patterns, which have complex line texture structures and rich, highly saturated colors.

Innovative modified-net architecture: enhanced segmentation of deep vein thrombosis.

Sci Rep

December 2024

School of Electronics Engineering, Vellore Institute of Technology, Vellore, 632014, Tamilnadu, India.

A new era for diagnosing and treating Deep Vein Thrombosis (DVT) relies on precise segmentation from medical images. Our research introduces a novel algorithm, the Modified-Net architecture, which integrates a broad spectrum of architectural components tailored to detect the intricate patterns and variances in DVT imaging data. Our work integrates advanced components such as dilated convolutions for larger receptive fields, spatial pyramid pooling for context, residual and inception blocks for multiscale feature extraction, and attention mechanisms for highlighting key features.

Walnuts possess significant nutritional and economic value. Fast and accurate sorting of shells and kernels will enhance the efficiency of automated production. Therefore, we propose a FastQAFPN-YOLOv8s object detection network to achieve rapid and precise detection of unsorted materials.

In the maritime environment, the instance segmentation of small ships is crucial. Small ships are characterized by their limited appearance, small size, and distant locations in marine scenes. However, existing instance segmentation algorithms fail to detect and segment them, resulting in inaccurate ship segmentation.

The development of a waste management and classification system based on deep learning and Internet of Things.

Environ Monit Assess

December 2024

Chongqing Key Laboratory of Non-Linear Circuit and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, 400715, China.

Waste sorting is a key part of sustainable development. To maximize the recovery of resources and reduce labor costs, a waste management and classification system is established. In the system, we use Internet of Things (IoT) and edge computing to implement waste sorting and the systematic long-distance information transmission and monitoring.
