Humans can learn under a wide variety of feedback conditions. Reinforcement learning (RL), in which a series of rewarded decisions must be made, is a particularly important type of learning. Computational and behavioral studies of RL have focused mainly on Markovian decision processes, where the next state depends only on the current state and action. Little is known about non-Markovian decision making, where the next state depends on more than the current state and action. Learning is non-Markovian, for example, when there is no unique mapping between actions and feedback. We have produced a model based on spiking neurons that can handle these non-Markovian conditions by performing policy gradient descent [1]. Here, we examine the model's performance and compare it with human learning and a Bayes optimal reference, which provides an upper bound on performance. We find that in all cases, our population-of-spiking-neurons model describes human performance well.
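
The distinction between Markovian and non-Markovian conditions drawn above can be illustrated with a minimal Python sketch. The toy dynamics and the names `markov_step`, `non_markov_step`, and `is_markov` are hypothetical and chosen only for illustration; they are not the task or the spiking-neuron model used in the study.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dynamics with four states (0..3) and integer actions.
N_STATES = 4


def markov_step(state: int, action: int) -> int:
    """Markovian: the next state depends only on the current state and action."""
    return (state + action) % N_STATES


def non_markov_step(history: List[Tuple[int, int]]) -> int:
    """Non-Markovian: the next state also depends on the previous action.

    `history` is a list of (state, action) pairs, most recent last.
    """
    state, action = history[-1]
    prev_action = history[-2][1] if len(history) > 1 else 0
    return (state + action + prev_action) % N_STATES


def is_markov(transitions: List[Tuple[int, int, int]]) -> bool:
    """Return True if every (state, action) pair maps to a single next state."""
    seen: Dict[Tuple[int, int], int] = {}
    for state, action, next_state in transitions:
        # setdefault stores the first outcome and returns it on later visits.
        if seen.setdefault((state, action), next_state) != next_state:
            return False
    return True


# Two histories that end in the same (state, action) pair, (1, 0),
# but differ in the action taken one step earlier.
history_a = [(0, 0), (1, 0)]
history_b = [(0, 1), (1, 0)]
next_a = non_markov_step(history_a)
next_b = non_markov_step(history_b)

# The same (state, action) pair leads to two different outcomes, so an
# observer who sees only the current state and action finds no unique mapping.
observed = [(1, 0, next_a), (1, 0, next_b)]
print(is_markov(observed))
```

In this sketch, a learner that conditions only on the current state and action cannot predict the outcome of `non_markov_step`, which is the sense in which there is no unique mapping between actions and feedback.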


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4405578
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0123105


Similar Publications

Passive and active suppression of transduced noise in silicon spin qubits.

Nat Commun

January 2025

Department of Physics and Astronomy, and Institute of Applied Physics, Seoul National University, Seoul, Korea.

Addressing and mitigating decoherence sources plays an essential role in the development of a scalable quantum computing system, which requires low gate errors to be consistently maintained throughout the circuit execution. While nuclear spin-free materials, such as isotopically purified silicon, exhibit intrinsically promising coherence properties for electron spin qubits, the omnipresent charge noise, when converted to magnetic noise under a strong magnetic field gradient, often hinders stable qubit operation within a time frame comparable to the data acquisition time. Here, we demonstrate both open- and closed-loop suppression techniques for the transduced noise in silicon spin qubits, resulting in a more than two-fold (ten-fold) improvement of the inhomogeneous coherence time (Rabi oscillation quality) that leads to a single-qubit gate fidelity of over 99.


A language model of problem solving in humans and macaque monkeys.

Curr Biol

January 2025

Institute of Neuroscience, Key Laboratory of Brain Cognition and Brain-inspired Intelligence Technology, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China.

Human intelligence is characterized by the remarkable ability to solve complex problems by planning a sequence of actions that takes us from an initial state to a desired goal state. Quantifying and comparing problem-solving capabilities across species and finding their evolutionary roots are critical for understanding how the brain carries out this intricate process. We introduce the Language of Problem Solving (LoPS) model as a novel quantitative framework that investigates the structure of problem-solving behavior through a language model.

Article Synopsis
  • This study introduces a machine learning-based solver for accurately simulating dissipative quantum dynamics in open quantum systems.
  • The proposed neural quantum propagator model leverages neural networks to efficiently evolve quantum states without the need for lengthy iterations.
  • The model's effectiveness is showcased through simulations of population dynamics and spectra for the Fenna-Matthews-Olson complex.

Nonadiabatic dynamics is key for understanding solar energy conversion and photochemical processes in condensed phases. This often involves the non-Markovian dynamics of the reduced density matrix in open quantum systems, where knowledge of the system's prior states is necessary to predict its future behavior. In this study, we explore time-series machine learning methods for predicting long-time nonadiabatic dynamics based on short-time input data, comparing these methods with the physics-based transfer tensor method (TTM).


Self-interacting random walks (SIRWs) show long-range memory effects that result from the interaction of the random walker at time t with the territory already visited at earlier times t'
