Affiliation:
1. Department of Psychology, Duke University, Durham, N. Carolina, U.S.A. 27706
2. Department of Psychology, Duke University, Durham, N. Carolina, U.S.A. 27706
Abstract
Pigeons were rewarded with food for pecking keys in various forms of the two-armed bandit situation over an extended series of daily sessions in two experiments. The average daily preference (S = R/[R+L]) is very well fit by a Markovian linear model in which predicted preference today is a weighted average of predicted preference yesterday and reinforcement conditions today: s(N+1) = a·s(N) + (1-a)·A(N+1), where A(N+1) is set equal to 1 when all rewards are for the Right response and 0 when all are for the Left, and a is a long-term memory parameter. This linear model explains some apparent paradoxes in earlier reports of memory effects in two-armed bandit experiments. Nevertheless, closer examination of the details of preference changes within each experimental session showed several kinds of non-Markovian effects. The most important was a regression at the beginning of each experimental session towards a preference characteristic of earlier sessions (spontaneous recovery). This effect, but not a smaller, less reliable non-Markovian reminiscence effect, is consistent with a very simple rule, namely that the effect on preference of each individual reward for a Right or Left response is inversely related to how long ago the reward occurred. Thus, animals learn to prefer the rewarded side each day because these rewards are recent, but they regress to earlier preferences overnight because the most recent rewards become relatively less recent with the lapse of time.
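The linear model in the abstract can be illustrated with a minimal sketch that iterates s(N+1) = a·s(N) + (1-a)·A(N+1) over a sequence of daily reinforcement conditions. The function name `predict_preference`, the starting preference of 0.5, the value a = 0.5, and the five-day schedule are all hypothetical choices made for illustration; none of them come from the paper.

```python
def predict_preference(s0, conditions, a):
    """Iterate s(N+1) = a*s(N) + (1-a)*A(N+1) and return each day's prediction.

    s0         -- starting predicted preference for Right (assumed value)
    conditions -- daily A values: 1 if all rewards are for Right, 0 if all for Left
    a          -- long-term memory parameter, between 0 and 1 (assumed value)
    """
    preferences = []
    s = s0
    for A in conditions:
        # Today's prediction is a weighted average of yesterday's prediction
        # and today's reinforcement condition.
        s = a * s + (1 - a) * A
        preferences.append(s)
    return preferences


# Hypothetical schedule: three days rewarding Right, then two days rewarding Left.
trajectory = predict_preference(s0=0.5, conditions=[1, 1, 1, 0, 0], a=0.5)
print(trajectory)
```

With a closer to 1, yesterday's prediction carries more weight and preference shifts more slowly after the rewarded side changes. With a closer to 0, preference tracks the current day's conditions almost immediately.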
Subject
Behavioral Neuroscience, Animal Science and Zoology
Cited by
12 articles.