New techniques are needed to realize the potential of autonomous underwater robots to scale up our observational capacity in the ocean. Fleets of autonomous robots could be used to study complex marine systems and animals, either with new imaging configurations or by tracking tagged animals to study their behavior. These activities can then inform and create new policies for community conservation. The role of connectivity through the active movement of animals represents a major knowledge gap in understanding the distribution of deep-ocean populations. Tracking underwater targets represents a major challenge for observing biological processes in situ, and methods that respond robustly to a changing environment during monitoring missions are needed. Analytical techniques for optimal sensor placement and path planning to locate underwater targets are not straightforward in such cases. The aim of this study was to investigate reinforcement learning, whose promising capabilities have been demonstrated in terrestrial scenarios, as a tool for optimizing range-only underwater target tracking. To evaluate its usefulness, a reinforcement learning method was implemented as a path planning system for an autonomous surface vehicle tracking an underwater mobile target. A complete description of an open-source model, performance metrics in simulated environments, and an evaluation of the algorithms based on more than 15 hours of at-sea field experiments are presented. These efforts demonstrate that deep reinforcement learning is a powerful approach that enhances the abilities of autonomous robots in the ocean, and they encourage the deployment of such algorithms for monitoring marine biological systems in the future.
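
To make the range-only setting concrete, the sketch below estimates a target's horizontal position from ranges measured at several vehicle positions, using a linearized least-squares fit. It is an illustration only and not the method of the study, which uses reinforcement learning to choose the vehicle's path. The function name `estimate_target_xy`, the 2D simplification (depth assumed known and removed from the ranges), and the noise-free example values are all assumptions.

```python
import math


def estimate_target_xy(vehicle_positions, ranges):
    """Estimate a static target's (x, y) from range-only measurements.

    Hypothetical helper: subtracts the first range equation from the
    others to obtain linear equations in (x, y), then solves the 2x2
    normal equations of the resulting least-squares problem.
    """
    if len(vehicle_positions) != len(ranges) or len(ranges) < 3:
        raise ValueError("need at least three position/range pairs")

    (x0, y0), r0 = vehicle_positions[0], ranges[0]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (xi, yi), ri in zip(vehicle_positions[1:], ranges[1:]):
        # Row of the linear system: a * x + b * y = c
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = r0**2 - ri**2 + xi**2 + yi**2 - x0**2 - y0**2
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c

    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-9:
        # Measurements taken along a straight line cannot fix the target.
        raise ValueError("vehicle positions are collinear; fix is ambiguous")

    x = (s_bb * s_ac - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y


# Assumed example: a target at (30, -40) m observed from four waypoints.
target = (30.0, -40.0)
waypoints = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
measured = [math.hypot(target[0] - wx, target[1] - wy) for wx, wy in waypoints]
estimate = estimate_target_xy(waypoints, measured)
```

The collinearity check hints at why path planning matters in this setting: how well a range-only fix is conditioned depends on where the vehicle takes its measurements, which is the kind of decision the study delegates to a learned policy.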

Source: http://dx.doi.org/10.1126/scirobotics.ade7811
