A computational model of integration between reinforcement learning and task monitoring in the prefrontal cortex

Mehdi Khamassi, René Quilodran, Pierre Enel, Emmanuel Procyk, Peter F. Dominey

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Taking inspiration from neural principles of decision-making is of particular interest for improving the adaptivity of artificial systems. Research at the crossroads of neuroscience and artificial intelligence in the last decade has helped us understand how the brain organizes reinforcement learning (RL) processes (the adaptation of decisions based on feedback from the environment). The current challenge is to understand how the brain flexibly regulates RL parameters, such as the exploration rate, based on the task structure, a process called meta-learning (Doya, 2002) [1]. Here, we propose a computational mechanism of exploration regulation based on real neurophysiological and behavioral data recorded in the monkey prefrontal cortex during a visuo-motor task involving a clear distinction between exploratory and exploitative actions. We first fit trial-by-trial choices made by the monkeys with an analytical reinforcement learning model. We find that the model with the highest likelihood of predicting the monkeys' choices reveals different exploration rates at different task phases. In addition, the optimized model has a very high learning rate and a reset of the action values associated with a cue used in the task to signal condition changes. Beyond classical RL mechanisms, these results suggest that the monkey brain extracts task regularities to tune learning parameters in a task-appropriate way. We finally use these principles to develop a neural network model extending a previous cortico-striatal loop model. In our prefrontal cortex component, prediction error signals are extracted to produce feedback categorization signals. The latter are used to boost exploration after errors and to attenuate it during exploitation, ensuring a lock on the currently rewarded choice. This model performs the task like the monkeys do, and it provides a set of experimental predictions to be tested by future neurophysiological recordings.
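The mechanism described in the abstract can be illustrated with a minimal sketch: tabular Q-learning with a softmax policy whose inverse temperature is switched by feedback categorization (errors restore a low, exploratory temperature; rewards set a high, exploitative one), plus a value reset when a cue signals a condition change. All parameter values and names here (`alpha`, `beta_explore`, `beta_exploit`, the 4-action task, the midpoint condition change) are illustrative assumptions, not the fitted values from the paper.

```python
import math
import random

def softmax_choice(q, beta, rng):
    """Sample an action index from Q-values with inverse temperature beta."""
    exps = [math.exp(beta * v) for v in q]
    r = rng.random() * sum(exps)
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e
        if r < acc:
            return i
    return len(q) - 1

def run_task(n_trials=200, alpha=0.9, beta_explore=1.0, beta_exploit=8.0, seed=0):
    """Q-learning with feedback-dependent exploration regulation (illustrative).

    - High learning rate alpha, as suggested by the model fit.
    - Feedback categorization: an error lowers beta (boosts exploration),
      a reward raises beta (locks onto the currently rewarded choice).
    - A 'cue' at the midpoint signals a condition change and resets action values.
    """
    rng = random.Random(seed)
    n_actions = 4
    target = rng.randrange(n_actions)      # currently rewarded action
    q = [0.0] * n_actions
    beta = beta_explore
    total_reward = 0.0
    for t in range(n_trials):
        a = softmax_choice(q, beta, rng)
        r = 1.0 if a == target else 0.0
        q[a] += alpha * (r - q[a])         # prediction-error update
        beta = beta_exploit if r > 0 else beta_explore  # feedback categorization
        total_reward += r
        if t == n_trials // 2:             # cue: condition change
            q = [0.0] * n_actions          # reset of action values
            target = (target + 1) % n_actions
            beta = beta_explore
    return total_reward

print(run_task())
```

With these settings the agent explores briefly after each condition change, then exploits almost deterministically, so total reward stays far above the chance level of `n_trials / n_actions`.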

Original language: English
Title of host publication: From Animals to Animats 11 - 11th International Conference on Simulation of Adaptive Behavior, SAB 2010, Proceedings
Pages: 424-434
Number of pages: 11
DOIs
State: Published - 2010
Externally published: Yes
Event: 11th International Conference on the Simulation of Adaptive Behavior, SAB 2010 - Paris-Clos Luce, France
Duration: 25 Aug 2010 - 28 Aug 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6226 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th International Conference on the Simulation of Adaptive Behavior, SAB 2010
Country/Territory: France
City: Paris-Clos Luce
Period: 25/08/10 - 28/08/10
