IEEE Trans Pattern Anal Mach Intell
November 2023
Reinforcement Learning (RL) can be considered as a sequence modeling task, where an agent employs a sequence of past state-action-reward experiences to predict a sequence of future actions. In this work, we propose State-Action-Reward Transformer (StARformer), a Transformer architecture for robot learning with image inputs, which explicitly models short-term state-action-reward representations (StAR-representations), essentially introducing a Markovian-like inductive bias to improve long-term modeling. StARformer first extracts StAR-representations using self-attending patches of image states, action, and reward tokens within a short temporal window.
Early visual cortex exhibits widespread hemodynamic responses in the absence of visual stimulation, which are entrained to the timing of a task and not predicted by local spiking or local field potential. Such task-related responses (TRRs) covary with reward magnitude and physiological signatures of arousal. It is unknown, however, if TRRs change on a trial-to-trial basis according to behavioral performance and task difficulty.
Target Oncol
September 2021
Background: The results of the pivotal RESORCE trial led to the approval of the tyrosine kinase inhibitor regorafenib as second-line treatment in advanced hepatocellular carcinoma (HCC) after sorafenib failure. Data about prognostic factors in a second-line HCC setting are scarce.
Objective: The aim of the present study was to investigate prognostic factors in a cohort of patients with advanced HCC treated with regorafenib after progressing on sorafenib.
The brain exhibits widespread endogenous responses in the absence of visual stimuli, even at the earliest stages of visual cortical processing. Such responses have been studied in monkeys using optical imaging with a limited field of view over visual cortex. Here, we used functional MRI (fMRI) in human participants to study the link between arousal and endogenous responses in visual cortex.
Decisions often involve the consideration of multiple cues, each of which may inform selection on the basis of learned probabilities. Our ability to use probabilistic inference for decisions is bounded by uncertainty and constraints such as time pressure. Previous work showed that when humans choose between visual objects in a multiple-cue, probabilistic task, they cope with time pressure by discounting the least informative cues, an example of satisficing or "good enough" decision-making.