Reinforcement Learning of Two-Joint Virtual Arm Reaching in a Computer Model of Sensorimotor Cortex

Author:

Neymotin Samuel A. (1), Chadderdon George L. (2), Kerr Cliff C. (3), Francis Joseph T. (4), Lytton William W. (5)

Affiliation:

1. Department of Neurobiology, Yale University School of Medicine, New Haven, CT 06510, U.S.A., and Department of Physiology and Pharmacology, SUNY Downstate, Brooklyn, NY 11203, U.S.A.

2. Department of Physiology and Pharmacology, SUNY Downstate, Brooklyn, NY 11203, U.S.A.

3. Department of Physiology and Pharmacology, SUNY Downstate, Brooklyn, NY 11203, U.S.A., and School of Physics, University of Sydney, Sydney 2050, Australia

4. Department of Physiology and Pharmacology, Program in Neural and Behavioral Science, and Robert F. Furchgott Center for Neural and Behavioral Science, SUNY Downstate, Brooklyn, NY 11203, U.S.A., and Joint Program in Biomedical Engineering, NYU Poly and SUNY Downstate, Brooklyn, NY 11203, U.S.A.

5. Department of Physiology and Pharmacology, Department of Neurology, Program in Neural and Behavioral Science, and Robert F. Furchgott Center for Neural and Behavioral Science, SUNY Downstate, Brooklyn, NY 11203, U.S.A.; Joint Program in Biomedical Engineering, NYU Poly and SUNY Downstate, Brooklyn, NY 11203, U.S.A.; and Department of Neurology, Kings County Hospital, Brooklyn, NY 11203, U.S.A.

Abstract

Neocortical mechanisms of learning sensorimotor control involve a complex series of interactions at multiple levels, from synaptic mechanisms to cellular dynamics to network connectomics. We developed a model of sensory and motor neocortex consisting of 704 spiking model neurons. Sensory and motor populations included excitatory cells and two types of interneurons. Neurons were interconnected with AMPA/NMDA and GABA-A synapses. We trained our model using spike-timing-dependent reinforcement learning to control a two-joint virtual arm to reach to a fixed target. For each of 125 trained networks, we used 200 training sessions, each involving 15 s reaches to the target from 16 starting positions. Learning altered network dynamics, with enhancements to neuronal synchrony and behaviorally relevant information flow between neurons. After learning, networks demonstrated retention of behaviorally relevant memories by using proprioceptive information to perform reach-to-target from multiple starting positions. Networks dynamically controlled which joint rotations to use to reach a target, depending on current arm position. Learning-dependent network reorganization was evident in both sensory and motor populations: learned synaptic weights showed target-specific patterning optimized for particular reach movements. Our model embodies an integrative hypothesis of sensorimotor cortical learning that could be used to interpret future electrophysiological data recorded in vivo from sensorimotor learning experiments. We used our model to make the following predictions: learning enhances synchrony in neuronal populations and behaviorally relevant information flow across neuronal populations; enhanced sensory processing aids task-relevant motor performance; and the relative ease of a particular movement in vivo depends on the amount of sensory information required to complete the movement.
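The abstract names its ingredients (a two-joint arm, a fixed target, and spike-timing-dependent reinforcement learning) without giving equations, so the following is only a minimal illustrative sketch of how such a loop might fit together. It is not the authors' implementation. The unit segment lengths, the rule that reward is +1 when the hand moves closer to the target and -1 when it moves farther, the learning rate, the weight bound, and all function names are assumptions introduced here for illustration.

```python
import math


def hand_position(shoulder, elbow, l1=1.0, l2=1.0):
    """Forward kinematics of a planar two-joint arm (angles in radians).

    Segment lengths l1 and l2 are assumed values, not taken from the paper.
    """
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y


def reward_signal(prev_angles, new_angles, target_angles):
    """Assumed reward rule: +1 if the hand moved closer to the target,
    -1 if it moved farther away, 0 if the distance is unchanged."""
    tx, ty = hand_position(*target_angles)

    def distance(angles):
        x, y = hand_position(*angles)
        return math.hypot(x - tx, y - ty)

    d_prev, d_new = distance(prev_angles), distance(new_angles)
    if d_new < d_prev:
        return 1
    if d_new > d_prev:
        return -1
    return 0


def update_weight(weight, eligibility, reward, rate=0.01, w_max=5.0):
    """Scale a spike-timing eligibility tag by the reward and clip the
    result to [0, w_max]. rate and w_max are hypothetical parameters."""
    return min(max(weight + rate * reward * eligibility, 0.0), w_max)


if __name__ == "__main__":
    # One illustrative step: the shoulder rotates toward the target posture.
    r = reward_signal((0.0, 0.0), (0.5, 0.0), (1.0, 0.0))
    print("reward:", r)
    print("new weight:", update_weight(1.0, 0.5, r))
```

In this sketch, a synapse whose pre- and postsynaptic spikes were recently paired carries a positive eligibility value, and the weight change is applied only when the reward arrives, which is the general idea behind reward-modulated spike-timing-dependent plasticity.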

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Cited by 31 articles.
