Reading between the lines: object localization using implicit cues from image tags.

IEEE Trans Pattern Anal Mach Intell

Department of Computer Science, University of Texas at Austin, 1616 Guadalupe, Suite 2.408, Austin, TX 78701, USA.

Published: June 2012

Current uses of tagged images typically exploit only the most explicit information: the link between the nouns named and the objects present somewhere in the image. We propose to leverage "unspoken" cues that rest within an ordered list of image tags so as to improve object localization. We define three novel implicit features from an image's tags: the relative prominence of each object as signified by its order of mention, the scale constraints implied by unnamed objects, and the loose spatial links hinted at by the proximity of names on the list. By learning a conditional density over the localization parameters (position and scale) given these cues, we show how to improve both accuracy and efficiency when detecting the tagged objects. Furthermore, we show how the localization density can be learned in a semantic space shared by the visual and tag-based features, which makes the technique applicable for detection in untagged input images. We validate our approach on the PASCAL VOC, LabelMe, and Flickr image data sets, and demonstrate its effectiveness relative to both traditional sliding windows and a visual context baseline. Our algorithm improves on state-of-the-art methods, successfully translating insights about human viewing behavior (such as attention, perceived importance, or gaze) into enhanced object detection.
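The idea of conditioning a localization prior on tag order can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which density model is used, so the sketch swaps in a simple kernel-weighted Gaussian over (x, y, scale). It encodes only the order-of-mention cue, with absent vocabulary words marked as zero, and does not model tag proximity. All function names, the vocabulary, and the toy training pairs are hypothetical.

```python
import math

def tag_features(tags, vocabulary):
    """Encode an ordered tag list as one value per vocabulary word.

    A word mentioned first gets 1.0, later mentions get smaller values,
    and words that are not tagged at all get 0.0 (assumed encoding).
    """
    n = len(tags)
    return [1.0 - tags.index(w) / n if w in tags else 0.0 for w in vocabulary]

def localization_density(query, training, bandwidth=0.5):
    """Estimate a Gaussian over (x, y, scale) given tag features.

    training is a list of (features, (x, y, scale)) pairs. Each pair is
    weighted by an RBF kernel on its feature distance to the query.
    Returns the weighted mean and per-dimension variance.
    """
    weights = []
    for feats, _ in training:
        d2 = sum((a - b) ** 2 for a, b in zip(query, feats))
        weights.append(math.exp(-d2 / (2 * bandwidth ** 2)))
    total = sum(weights)
    params = [p for _, p in training]
    mean = [sum(w * p[i] for w, p in zip(weights, params)) / total
            for i in range(3)]
    # A small floor keeps the variance strictly positive.
    var = [sum(w * (p[i] - mean[i]) ** 2 for w, p in zip(weights, params)) / total
           + 1e-3 for i in range(3)]
    return mean, var

def log_density(window, mean, var):
    """Log of an axis-aligned Gaussian evaluated at a candidate window."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
               for x, m, v in zip(window, mean, var))

def rank_windows(windows, mean, var):
    """Order candidate windows so the most probable ones are scored first."""
    return sorted(windows, key=lambda w: log_density(w, mean, var), reverse=True)

# Toy data: normalized (x, y, scale) of the "dog" in two tagged training images.
vocabulary = ["dog", "car", "person"]
training = [
    (tag_features(["dog", "person"], vocabulary), (0.5, 0.6, 0.5)),
    (tag_features(["person", "car", "dog"], vocabulary), (0.8, 0.8, 0.1)),
]
query = tag_features(["dog", "car"], vocabulary)
mean, var = localization_density(query, training)
candidates = [(0.8, 0.8, 0.1), (0.5, 0.6, 0.5)]
ordered = rank_windows(candidates, mean, var)
```

Running a detector on the windows in `ordered` from the front, and stopping early, is one way such a prior could trade a full sliding-window scan for a prioritized search.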

DOI: http://dx.doi.org/10.1109/TPAMI.2011.190


