Cogn Affect Behav Neurosci
February 2018
Balancing exploration and exploitation is a fundamental problem in reinforcement learning. Previous neuroimaging studies of the exploration-exploitation dilemma could not completely disentangle these two processes, making it difficult to unambiguously identify their neural signatures. We overcome this problem using a task in which subjects can either observe (pure exploration) or bet (pure exploitation).
Classification of neurons into clusters based on their response properties is an important tool for gaining insight into neural computations. However, it remains unclear to what extent neurons fall naturally into discrete functional categories. We developed a Bayesian method that models the tuning properties of neural populations as a mixture of multiple types of task-relevant response patterns.
Introduction: The transition from pre-dialysis chronic kidney disease (CKD) to the period after dialysis start is a critical time associated with high patient mortality and increased hospital admissions. Little is known about the trends of key clinical and laboratory parameters through this transition to dialysis.
Methods: De-identified data including demographics, vital signs, lab results, and eGFR from the Fresenius Medical Care-CKD Registry were analyzed to determine trends in clinical and laboratory parameters through the time of transition from 12 months pre-dialysis start to 12 months post-dialysis start.
When we evaluate an option, how is the neural representation of its value linked to information that identifies it, such as its position in space? We hypothesized that value information and identity cues are not bound together at a particular point but are represented together at the single unit level throughout the entirety of the choice process. We examined neuronal responses in two-option gambling tasks with lateralized and asynchronous presentation of offers in five reward regions: orbitofrontal cortex (OFC, area 13), ventromedial prefrontal cortex (vmPFC, area 14), ventral striatum (VS), dorsal anterior cingulate cortex (dACC), and subgenual anterior cingulate cortex (sgACC, area 25). Neuronal responses in all areas are sensitive to the positions of both offers and of choices.
We frequently need to commit to a choice to achieve our goals; however, the neural processes that keep us motivated in pursuit of delayed goals remain obscure. We examined ensemble responses of neurons in macaque dorsal anterior cingulate cortex (dACC), an area previously implicated in self-control and persistence, in a task that requires commitment to a choice to obtain a reward. After reward receipt, dACC neurons signaled reward amount with characteristic ensemble firing rate patterns; during the delay in anticipation of the reward, ensemble activity smoothly and gradually came to resemble the postreward pattern.
Studies of animal impulsivity generally find steep subjective devaluation, or discounting, of delayed rewards, often on the order of a 50% reduction in value in a few seconds. Because such steep discounting is highly disfavored in evolutionary models of time preference, we hypothesize that discounting tasks provide a poor measure of animals' true time preferences. One prediction of this hypothesis is that estimates of time preferences based on these tasks will lack external validity, i.
Decision makers are curious and consequently value advance information about future events. We made use of this fact to test competing theories of value representation in area 13 of orbitofrontal cortex (OFC). In a new task, we found that monkeys reliably sacrificed primary reward (water) to view advance information about gamble outcomes.
Human decision-makers often exhibit the hot-hand phenomenon, a tendency to perceive positive serial autocorrelations in independent sequential events. The term is named after the observation that basketball fans and players tend to perceive streaks of high-accuracy shooting when they are demonstrably absent. That is, both observing fans and participating players tend to hold the belief that a player's chances of hitting a shot are greater following a hit than following a miss.
Recent theories suggest that reward-based choice reflects competition between value signals in the ventromedial prefrontal cortex (vmPFC). We tested this idea by recording vmPFC neurons while macaques performed a gambling task with asynchronous offer presentation. We found that neuronal activity shows four patterns consistent with selection via mutual inhibition: (1) correlated tuning for probability and reward size, suggesting that vmPFC carries an integrated value signal; (2) anti-correlated tuning curves for the two options, suggesting mutual inhibition; (3) neurons rapidly come to signal the value of the chosen offer, suggesting the circuit serves to produce a choice; and (4) after regressing out the effects of option values, firing rates could still predict choice (a choice probability signal).
The dorsal anterior cingulate cortex (dACC) is a key hub of the brain's executive control system. Although a great deal is known about its role in outcome monitoring and behavioral adjustment, whether and how it contributes to the decision process remain unclear. Some theories suggest that dACC neurons track decision variables (e.
Movies, vacations, and meals are all examples of events composed of a sequence of smaller events. How do we go from our evaluations of each scene in a movie to an evaluation of the sequence as a whole? In theory, we should simply average the values of the individual events. In practice, however, we are biased towards sequences where each element tends to be better than the previous, where the last value is large, and we overweight the best (or worst) part of the sequence.
Proc Natl Acad Sci U S A
September 2013
Intertemporal choice tasks, which pit smaller/sooner rewards against larger/later ones, are frequently used to study time preferences and, by extension, impulsivity and self-control. When used in animals, many trials are strung together in sequence and an adjusting buffer is added after the smaller/sooner option to hold the total duration of each trial constant. Choices of the smaller/sooner option are not reward maximizing and so are taken to indicate that the animal is discounting future rewards.
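The reasoning above, that a constant trial duration makes the smaller/sooner option non-maximizing, can be illustrated with a minimal sketch. All amounts, delays, and the trial duration below are hypothetical values chosen for illustration, and the sketch assumes for simplicity that both options are padded out to the same fixed trial duration; it is not the task code or the parameters used in the study.

```python
def reward_rate(amount, delay, trial_duration):
    """Reward per second when every trial lasts trial_duration seconds.

    Hypothetical helper: the adjusting buffer pads the time after reward
    delivery so that delay + buffer always equals trial_duration.
    """
    if delay > trial_duration:
        raise ValueError("delay cannot exceed the fixed trial duration")
    buffer = trial_duration - delay  # adjusting buffer after the reward
    return amount / (delay + buffer)


# Hypothetical task parameters (not taken from the source).
TRIAL_DURATION = 10.0  # seconds, held constant across trials

smaller_sooner = reward_rate(amount=1.0, delay=1.0, trial_duration=TRIAL_DURATION)
larger_later = reward_rate(amount=3.0, delay=6.0, trial_duration=TRIAL_DURATION)

# Because trial duration is fixed, choosing the sooner option does not
# shorten the trial, so the larger reward always gives the higher rate.
print(f"smaller/sooner: {smaller_sooner:.2f} per s")
print(f"larger/later:   {larger_later:.2f} per s")
```

Under these assumptions the reward rate depends only on the reward amount, so any preference for the smaller/sooner option cannot be explained by rate maximization.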