Reinforcement learning (RL)-based brain-machine interfaces (BMIs) learn the mapping from neural signals to a subject's intention using a reward signal. In existing RL-based BMI frameworks, either external rewards (water or food) or internal rewards extracted from neural activity are used to update the decoder parameters. For complex tasks, however, designing an external reward can be difficult, and such a reward may not fully reflect the subject's own internal evaluation. It is therefore important to obtain an internal reward model from neural activity that captures the subject's internal evaluation while the subject performs the task through trial and error. In this paper, we propose to use an inverse reinforcement learning (IRL) method to estimate the internal reward function interpreted from the brain and to use it to assist the update of the decoder. Specifically, the inverse Q-learning (IQL) algorithm is applied to extract internal reward information from real data recorded in the medial prefrontal cortex (mPFC) while a rat was learning a two-lever-press discrimination task. The extracted internal reward information is validated by checking whether it can guide the training of an RL decoder to complete the movement task. Compared with an RL decoder trained with the external reward, our approach achieves similar decoding performance. This preliminary result validates the effectiveness of using IRL to obtain an internal reward model and reveals the potential of internal reward estimation to improve the design of autonomous-learning BMIs.
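For intuition only, the sketch below illustrates the general idea of recovering a reward function from observed behaviour under a Boltzmann-rationality assumption, in the spirit of inverse Q-learning. The function name `estimate_internal_reward`, the discretised state/action representation, the count-based policy and transition estimates, and the fixed-point iteration are all illustrative assumptions; they are not the paper's actual pipeline, which operates on mPFC neural recordings.

```python
import numpy as np

def estimate_internal_reward(transitions, n_states, n_actions,
                             gamma=0.9, n_outer=50, n_inner=100):
    """Recover a reward table r[s, a] from observed (state, action, next_state)
    triples, assuming the demonstrator acts Boltzmann-rationally with respect
    to some unknown reward. Illustrative sketch, not the paper's algorithm."""
    # Empirical behaviour policy and transition model (with small smoothing).
    sa = np.full((n_states, n_actions), 1e-3)
    sas = np.full((n_states, n_actions, n_states), 1e-3)
    for s, a, s_next in transitions:
        sa[s, a] += 1.0
        sas[s, a, s_next] += 1.0
    pi = sa / sa.sum(axis=1, keepdims=True)        # pi(a|s) estimated from data
    P = sas / sas.sum(axis=2, keepdims=True)       # P(s'|s, a) estimated from data

    r = np.zeros((n_states, n_actions))
    for _ in range(n_outer):
        # Soft value iteration under the current reward estimate.
        V = np.zeros(n_states)
        for _ in range(n_inner):
            Q = r + gamma * (P @ V)                # Q(s, a), shape (S, A)
            V = np.log(np.exp(Q).sum(axis=1))      # soft maximum over actions
        # Boltzmann rationality implies Q(s,a) - Q(s,b) = log pi(a|s) - log pi(b|s),
        # so the reward is determined, up to a per-state offset, by
        #   eta(s, a) = log pi(a|s) - gamma * E_{s' ~ P(.|s,a)}[V(s')].
        eta = np.log(pi) - gamma * (P @ V)
        r = eta - eta.mean(axis=1, keepdims=True)  # fix the offset: zero mean per state
    return r
```

A reward table estimated this way could then replace the external food/water reward when training an RL decoder, which mirrors the validation strategy described in the abstract.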
DOI: http://dx.doi.org/10.1109/EMBC48229.2022.9871194