AI Article Synopsis

  • Learning to adapt to different goals in visual navigation is difficult, so we propose a model-embedded actor-critic architecture for multigoal navigation.
  • We introduce two new designs: the inverse dynamics model (InvDM) to connect states with goals and improve learning efficiency, and multigoal colearning (MgCl) to enhance sample efficiency and utilize unintended positive experiences.
  • Our enhanced navigation model includes two self-supervised tasks: path closed-loop detection to recognize experienced states, and state-target matching to assess state-goal differences, resulting in faster convergence and better generalization than existing methods.

Article Abstract

Learning to adapt to a series of different goals in visual navigation is challenging. In this work, we present a model-embedded actor-critic architecture for the multigoal visual navigation task. To enhance task cooperation in multigoal learning, we introduce two new designs to the reinforcement learning scheme: an inverse dynamics model (InvDM) and multigoal colearning (MgCl). Specifically, InvDM captures the navigation-relevant association between state and goal and provides additional training signals that relieve the sparse reward issue. MgCl improves sample efficiency and enables the agent to learn from unintended positive experiences. In addition, to further improve the scene generalization capability of the agent, we present an enhanced navigation model that consists of two self-supervised auxiliary task modules. The first module, path closed-loop detection, helps determine whether a state has been experienced before. The second, the state-target matching module, estimates the difference between state and goal. Extensive results on the interactive platform AI2-THOR demonstrate that the agent trained with the proposed method converges faster than state-of-the-art methods while retaining good generalization capability. The video demonstration is available at https://vsislab.github.io/mgvn.
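The idea behind an inverse dynamics model is to recover the action that connects two consecutive states, giving the agent one dense self-supervised label per step even when the navigation reward is sparse. As a rough illustration only (not the paper's implementation; the grid world and names such as `infer_action` and `ACTIONS` are invented for this sketch):

```python
# Toy inverse-dynamics signal on a 2-D grid world (illustrative names).
ACTIONS = {
    (0, 1): "right",
    (0, -1): "left",
    (1, 0): "down",
    (-1, 0): "up",
}

def infer_action(state, next_state):
    """Inverse dynamics: recover the action that moved state -> next_state."""
    delta = (next_state[0] - state[0], next_state[1] - state[1])
    return ACTIONS.get(delta)  # None if the transition is not a unit move

# A trajectory yields one dense self-supervised label per step,
# regardless of whether the sparse navigation reward ever fires.
trajectory = [(0, 0), (0, 1), (1, 1), (1, 2)]
labels = [infer_action(s, s2) for s, s2 in zip(trajectory, trajectory[1:])]
print(labels)  # ['right', 'down', 'right']
```

In the actual method these labels would supervise a learned network over visual states rather than a lookup table; the sketch only shows where the extra training signal comes from.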


Source
http://dx.doi.org/10.1109/TNNLS.2021.3057424

Publication Analysis

Top Keywords

visual navigation (12)
reinforcement learning (8)
state goal (8)
generalization capability (8)
navigation multiple (4)
multiple goals (4)
goals based (4)
based deep (4)
deep reinforcement (4)
learning (4)

Similar Publications

The mobility of people with severe visual impairment is limited, affecting their comfort and productivity. According to a 2024 World Health Organization report, about 45 million people are blind, and the annual global cost in lost productivity is estimated at USD 411 billion. Deploying technology can enhance the contribution of people who are visually impaired to the gross domestic product (GDP).


PALMS: Plane-based Accessible Indoor Localization Using Mobile Smartphones.

Int Conf Indoor Position Indoor Navig

October 2024

Computer Science and Engineering, University of California, Santa Cruz, Santa Cruz, United States.

In this paper, we present PALMS, an innovative indoor global localization and relocalization system for mobile smartphones that utilizes publicly available floor plans. Unlike most vision-based methods that require constant visual input, our system adopts a dynamic form of localization that considers a single instantaneous observation and odometry data. The core contribution of this work is the introduction of a particle filter initialization method that leverages the Certainly Empty Space (CES) constraint along with principal orientation matching.
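Restricting the initial particle set to space a floor plan marks as traversable is the intuition behind the CES-style initialization described above. As a minimal sketch under assumed details (the ASCII floor plan, `free_cells`, and `init_particles` are all hypothetical, not from PALMS):

```python
import random

# Illustrative floor plan: '#' = wall, '.' = certainly empty (free) space.
FLOOR_PLAN = [
    "#####",
    "#...#",
    "#.#.#",
    "#...#",
    "#####",
]

def free_cells(plan):
    """Enumerate cells the floor plan marks as free."""
    return [(r, c) for r, row in enumerate(plan)
            for c, ch in enumerate(row) if ch == "."]

def init_particles(plan, n, seed=0):
    """Initialize n particles, constrained to free cells only."""
    rng = random.Random(seed)
    cells = free_cells(plan)
    return [rng.choice(cells) for _ in range(n)]

particles = init_particles(FLOOR_PLAN, 100)
# Every particle lies in free space, never inside a wall.
assert all(FLOOR_PLAN[r][c] == "." for r, c in particles)
```

A real system would also sample particle headings and weight them against observations; the sketch only shows the spatial constraint, which shrinks the initial hypothesis set compared with sampling over the whole map.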


Robust Indoor Pedestrian Backtracking Using Magnetic Signatures and Inertial Data.

Int Conf Indoor Position Indoor Navig

October 2024

Department of Computer Science & Engineering, University of California, Santa Cruz, Santa Cruz, USA.

Navigating unfamiliar environments can be challenging for visually impaired individuals due to difficulties in recognizing distant landmarks or visual cues. This work focuses on a particular form of wayfinding, specifically backtracking a previously taken path, which can be useful for blind pedestrians. We propose a hands-free indoor navigation solution using a smartphone without relying on pre-existing maps or external infrastructure.


Reading, face recognition, and navigation are supported by visuospatial computations in category-selective regions across ventral, lateral, and dorsal visual streams. However, the nature of visuospatial computations across streams and their development in adolescence remain unknown. Using fMRI and population receptive field (pRF) modeling in adolescents and adults, we estimate pRFs in high-level visual cortex and determine their development.


Background: Pain of a chronic nature remains the foremost concern in tertiary spine clinics, yet its elusive nature and quantification challenges persist. Despite extensive research and education on low back pain (LBP), the realm of diagnostic practices lacks a unified approach. Clinically, LBP exhibits a multifaceted character, encompassing conventional assessments of severity and disability, alongside nuanced attributes like pain characterization, duration, and patient expectations.

