Abstract
The progressive ratio (PR) lever-press task serves as a benchmark for assessing goal-oriented motivation. However, a well-recognized limitation of the PR task is that only a single data point, known as the breakpoint, is obtained from an entire session as an index of motivation. Because the breakpoint is defined as the final ratio of responses achieved in a PR session, variations in choice behavior during the PR task cannot be captured. We addressed this limitation by constructing four reinforcement learning models: a Simple Q-learning model, an Asymmetric model with two learning rates, a Perseverance model with choice traces, and a Perseverance model without learning. These models incorporated three behavioral choices: reinforced and non-reinforced lever presses and void magazine nosepokes (MNPs), because we observed that mice perform frequent MNPs during PR tasks. The best model was the Perseverance model, which predicted a gradual reduction in the amplitudes of reward prediction errors (RPEs) upon void MNPs. We confirmed this prediction experimentally with fiber photometry of extracellular dopamine (DA) dynamics in the ventral striatum of mice, using the genetically encoded GPCR activation-based fluorescent DA sensor GRABDA2m. We further tested the model's utility with an acute intraperitoneal injection of low-dose methamphetamine (METH) before a PR task, which increased the frequency of MNPs during the PR session without changing the breakpoint. The Perseverance model captured this behavioral modulation as an increase in initial action values, which are customarily set to zero and disregarded in reinforcement learning analyses. Our findings suggest that the Perseverance model can reveal effects of psychoactive drugs on choice behaviors during PR tasks.
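To make the modeling approach concrete, the sketch below shows one conventional way a Q-learning agent with a choice-trace (perseveration) term can be written for the three actions described above. It is a minimal illustration only: the function names, parameter values, reward schedule, and update rules are assumptions for exposition, not the authors' fitted model.

```python
import numpy as np

# Illustrative sketch (not the authors' implementation) of a Q-learning agent
# with a choice-trace term over three actions: reinforced lever press,
# non-reinforced lever press, and void magazine nosepoke (MNP).

ACTIONS = ["reinforced_press", "nonreinforced_press", "void_mnp"]

def softmax(x):
    z = x - np.max(x)
    p = np.exp(z)
    return p / p.sum()

def simulate_session(n_trials=200, alpha=0.1, beta=3.0, phi=1.0, tau=0.2,
                     q_init=0.0, seed=0):
    """Simulate choices and return the per-trial RPE of the chosen action.

    alpha : learning rate for action values
    beta  : inverse temperature of the softmax policy
    phi   : weight of the choice trace (perseveration strength)
    tau   : update rate of the choice trace
    q_init: initial action values (hypothetically raised by a drug such as METH)
    """
    rng = np.random.default_rng(seed)
    q = np.full(3, q_init, dtype=float)   # action values
    c = np.zeros(3)                       # choice trace
    rpes = []
    for _ in range(n_trials):
        # Policy combines action values and the perseveration term.
        p = softmax(beta * q + phi * c)
        a = rng.choice(3, p=p)
        # Toy reward rule: only the reinforced press sometimes yields reward;
        # a real PR schedule would require an escalating response count.
        r = 1.0 if (a == 0 and rng.random() < 0.3) else 0.0
        rpe = r - q[a]                    # reward prediction error
        q[a] += alpha * rpe
        # Choice trace moves toward the chosen action and decays for others.
        c += tau * (np.eye(3)[a] - c)
        rpes.append((ACTIONS[a], rpe))
    return rpes
```

In such a formulation, each unrewarded void MNP yields an RPE whose magnitude shrinks as the value of the MNP action converges, qualitatively mirroring the gradual reduction in RPE amplitude described above; raising q_init biases early choices toward all actions, including MNPs, without requiring any change in the learned asymptote.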
Publisher: Cold Spring Harbor Laboratory