This article deals with the detection of prominent objects in images. As opposed to the standard approaches based on sliding windows, we study a fundamentally different solution by formulating the supervised prediction of a bounding box as an image retrieval task. Given a global image descriptor of a query image, we find the most similar images in an annotated dataset and transfer their object bounding boxes. We refer to this approach as data-driven detection (DDD). Our key novelty is to design or learn image similarities that explicitly optimize some aspect of the transfer, unlike previous work, which uses generic representations and unsupervised similarities. In a first variant, we explicitly learn to transfer by adapting a metric learning approach to work with image and bounding box pairs. In a second variant, we represent images as object probability maps computed from low-level patch classifiers. Experiments show that these two contributions in some cases yield results comparable to or better than those of standard sliding-window detectors, despite the conceptual simplicity and run-time efficiency of DDD. Our third contribution is an application of prominent object detection, where we improve fine-grained categorization by pre-cropping images with the proposed approach. Finally, we also extend the proposed approach to detect multiple parts of rigid objects.
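The retrieve-and-transfer idea described above can be illustrated with a minimal sketch. It is not the authors' method: it uses plain cosine similarity on hand-made descriptors (the generic, unsupervised kind of similarity the abstract contrasts with its learned ones), and it assumes boxes are stored as normalized `[x_min, y_min, x_max, y_max]` coordinates and combined by a similarity-weighted average. The names `cosine`, `transfer_box`, and the parameter `k` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two descriptors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def transfer_box(query, descriptors, boxes, k=2):
    """Predict a box for `query` from its k most similar annotated images.

    Assumption: each box is [x_min, y_min, x_max, y_max], normalized to
    [0, 1] by image width and height, so boxes from images of different
    sizes can be averaged directly.
    """
    sims = [cosine(query, d) for d in descriptors]
    # Indices of the k annotated images most similar to the query.
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    # Similarity-weighted average; negative similarities get zero weight.
    weights = [max(sims[i], 0.0) for i in top]
    total = sum(weights)
    if total == 0:
        # Fall back to a plain average if no neighbour is positively similar.
        weights = [1.0] * len(top)
        total = float(len(top))
    return [
        sum(w * boxes[i][c] for w, i in zip(weights, top)) / total
        for c in range(4)
    ]


# Toy annotated dataset: three descriptors with one box each.
descriptors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
boxes = [[0.1, 0.1, 0.5, 0.5], [0.6, 0.6, 0.9, 0.9], [0.2, 0.2, 0.6, 0.6]]

predicted = transfer_box([1.0, 0.0, 0.0], descriptors, boxes, k=2)
print(predicted)
```

In this toy setup the two nearest neighbours are the first and third images, so the predicted box falls between their boxes. Replacing `cosine` with a similarity learned for the transfer task is the direction the abstract describes; the sketch does not attempt that.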
DOI: http://dx.doi.org/10.1109/TPAMI.2015.2509988