Abstract
Obsessive-compulsive disorder (OCD) has been suggested to be associated with impairment of model-based behavioral control. Meanwhile, recent work suggested a shorter memory trace for negative than for positive prediction errors (PEs) in OCD. The relation between these two suggestions remains unclear. We addressed this issue through computational modeling. Based on the properties of cortico-basal ganglia pathways, we modeled a human as an agent combining a successor representation (SR)-based system that enables model-based-like control and an individual representation (IR)-based system that hosts only model-free control, with the two systems potentially learning from positive and negative PEs at different rates. We simulated the agent’s behavior in the environmental model used in the recent work, which describes the potential development of an obsession-compulsion cycle. We found that the dual-system agent could develop an enhanced obsession-compulsion cycle, similarly to the agent having a shorter memory trace for negative PEs in the recent work, if the SR- and IR-based systems learned mainly from positive and negative PEs, respectively. We then simulated the behavior of such an opponent SR+IR agent in the two-stage decision task, in comparison with an agent having only SR-based control. Fitting the agents’ behavior with the model weighing model-based and model-free control that was developed in the original two-stage task study resulted in smaller weights of model-based control for the opponent SR+IR agent than for the SR-only agent. These results reconcile the previous suggestions about OCD, i.e., impaired model-based control and a shorter memory trace for negative PEs, raising the novel possibility that opponent learning in model(SR)-based and model-free controllers underlies obsession-compulsion. As a limitation, our model cannot explain the behavioral patterns of OCD patients in punishment, rather than reward, contexts. However, we argue that this could be resolved if the opponent SR+IR learning also operates in the recently revealed non-canonical cortico-basal ganglia-dopamine circuit for reinforcement learning of threat/aversiveness rather than reward.
Author summary
Obsessive-compulsive disorder (OCD) is one of the major psychiatric disorders, diagnosed in 2.5%-3% of the population, and is characterized by an enhanced cycle of obsessive thought, e.g., whether the door was locked, and compulsive action, e.g., checking the door lock. It remains elusive why such apparently maladaptive behavior could be enhanced. A prevailing theory proposes that humans use two control systems, a flexible yet costly goal-directed system and an inflexible yet costless habitual system, and that impairment of the goal-directed system leads to OCD. On the other hand, recent work proposed a new theory that a shorter memory trace for credit assignment of negative, than of positive, prediction errors can induce OCD. The relation between these two theories remains unclear. We show that opponent learning from positive versus negative prediction errors, by a particular type of goal-directed(-like) system suggested to be implemented in the brain and by the habitual system, could exhibit an (apparent) overall decrease in goal-directed control and also develop an enhanced obsession-compulsion cycle similar to the one developed by memory-trace imbalance, thereby bridging the two theories. Such opponent learning of the two systems was in fact suggested to be advantageous in certain dynamic environments, and could thus be evolutionarily selected at the cost of the possible development of OCD.
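The idea of learning from positive and negative prediction errors at different rates can be illustrated with a minimal sketch. This is not the authors' model: it omits the successor representation, the obsession-compulsion environment, and the two-stage task, and every name and number in it (`asymmetric_update`, `run`, the learning rates, the alternating reward sequence) is a hypothetical choice made for illustration only. It shows only that a value learner weighting positive PEs more heavily settles on an optimistic estimate, and one weighting negative PEs more heavily settles on a pessimistic estimate, under the same rewards.

```python
def asymmetric_update(value, prediction_error, alpha_pos, alpha_neg):
    """Update a value estimate with separate learning rates for
    positive and negative prediction errors (illustrative rule)."""
    alpha = alpha_pos if prediction_error >= 0 else alpha_neg
    return value + alpha * prediction_error


def run(alpha_pos, alpha_neg, n_steps=1000):
    """Learn the value of a single state whose reward alternates
    between +1 and -1 (assumed toy environment, mean reward 0)."""
    value = 0.0
    for step in range(n_steps):
        reward = 1.0 if step % 2 == 0 else -1.0
        prediction_error = reward - value  # no successor state in this toy case
        value = asymmetric_update(value, prediction_error, alpha_pos, alpha_neg)
    return value


# Hypothetical rate pairs: one learner driven mainly by positive PEs,
# the other mainly by negative PEs.
mainly_positive = run(alpha_pos=0.09, alpha_neg=0.01)
mainly_negative = run(alpha_pos=0.01, alpha_neg=0.09)
print(mainly_positive > 0, mainly_negative < 0)
```

In the setting described in the abstract, the SR-based and IR-based systems would each apply some such rule with their own pair of rates; how their outputs are combined is not specified here and is left out of the sketch.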
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.