Existing localization methods commonly rely on vision to perceive the scene and localize in GNSS-denied areas, yet they often struggle in environments with complex lighting conditions, dynamic objects, or privacy-sensitive areas. Humans can describe scenes in natural language, and others can infer the location by recognizing or recalling the rich semantic information in these descriptions. Language therefore offers a potential route to robust localization. This study introduces a new task, Language-driven Localization, and proposes a novel localization framework, LangLoc, which determines the user's position and orientation from textual descriptions. Because natural language descriptions are highly diverse, we first design a Spatial Description Generator (SDG), the foundation of LangLoc, which extracts and combines the position and attribute information of objects within a scene to generate uniformly formatted textual descriptions. SDG reduces the ambiguity of language by detailing the spatial layout and object relations of the scene, providing a reliable basis for localization. With the generated descriptions, LangLoc achieves language-only localization using a text encoder and a pose regressor. Furthermore, LangLoc can take one image in addition to the text input, achieving mutual optimization and adaptive feature fusion across modalities through two modality-specific encoders, cross-modal fusion, and multimodal joint learning strategies. This improves the framework's ability to handle complex scenes and yields more accurate localization. Extensive experiments on the Oxford RobotCar, 4-Seasons, and Virtual Gallery datasets demonstrate LangLoc's effectiveness in both language-only and visual-language localization across various outdoor and indoor scenarios. Notably, LangLoc achieves noticeable performance gains when using both text and image inputs in challenging conditions such as overexposure, low lighting, and occlusions, demonstrating its robustness.
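The idea behind uniformly formatted descriptions can be illustrated with a minimal sketch. It is not the SDG from the paper: the `SceneObject` fields, the left/center/right and near/far regions, the thresholds, and the sentence template are all assumptions chosen for illustration. It only shows how object positions and attributes might be combined into text that is identical for the same scene regardless of the order in which objects are detected.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical detected object; field names are assumptions."""
    name: str        # object class, e.g. "car"
    attribute: str   # one attribute, e.g. "red"
    x: float         # lateral offset in metres, positive = right of the viewer
    z: float         # forward distance in metres, positive = ahead of the viewer


def horizontal_region(obj: SceneObject, margin: float = 1.0) -> str:
    """Bucket the lateral offset into left / center / right (assumed scheme)."""
    if obj.x < -margin:
        return "left"
    if obj.x > margin:
        return "right"
    return "center"


def depth_region(obj: SceneObject, near: float = 10.0) -> str:
    """Bucket the forward distance into near / far (assumed threshold)."""
    return "near" if obj.z <= near else "far"


def describe(objects: List[SceneObject]) -> str:
    """Render one fixed-template sentence per object, in a deterministic order."""
    # Sorting removes any dependence on detection order, so the same
    # scene always yields the same text.
    ordered = sorted(objects, key=lambda o: (o.x, o.z, o.name))
    sentences = [
        f"The {o.attribute} {o.name} is in the "
        f"{horizontal_region(o)}-{depth_region(o)} region."
        for o in ordered
    ]
    return " ".join(sentences)


if __name__ == "__main__":
    scene = [
        SceneObject("tree", "green", 4.0, 25.0),
        SceneObject("car", "red", -3.0, 6.0),
    ]
    print(describe(scene))
```

In a pipeline like the one the abstract describes, text of this kind would then be passed to a text encoder and a pose regressor; those components are outside the scope of this sketch.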

Source
http://dx.doi.org/10.1109/TIP.2025.3546853
