Affiliation:
1. Imperial College London
Abstract
Deep Q-learning (DQN) is a recently proposed reinforcement learning algorithm in which a neural network serves as a non-linear approximator of the value function. The exploration-exploitation mechanism allows the neural network's training and prediction to execute simultaneously while the agent interacts with the environment. Agents often act independently on battery power, so both training and prediction must occur within the agent and on a limited power budget. In this work, we propose an FPGA acceleration system design for Neural Network Q-learning (NNQL). Our proposed system is highly flexible because it supports run-time network parameterization, which allows neuroevolution algorithms to dynamically restructure the network to achieve better learning results. Additionally, the power consumption of our proposed system adapts to the network size thanks to a new processing element design. Based on our test cases on networks with hidden-layer sizes ranging from 32 to 16384, our proposed system achieves 7x to 346x speedup over a GPU implementation and 22x to 77x speedup over a hand-coded CPU counterpart.
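The core computation the abstract describes — a neural network approximating the Q-value function, trained while the agent explores — can be sketched as follows. This is a minimal illustrative sketch only, not the paper's implementation: the single-hidden-layer network, the epsilon-greedy policy, and all names (`n_states`, `n_actions`, `hidden`, etc.) are assumptions for the example.

```python
# Minimal sketch of neural-network Q-learning (NNQL):
# a one-hidden-layer network approximates Q(s, a), trained online
# with a semi-gradient TD update while an epsilon-greedy policy
# interleaves exploration and exploitation.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, hidden = 4, 2, 32   # illustrative sizes
W1 = rng.normal(0.0, 0.1, (hidden, n_states))
W2 = rng.normal(0.0, 0.1, (n_actions, hidden))

def q_values(s):
    """Forward pass: returns Q-values for all actions and the hidden activations."""
    h = np.maximum(0.0, W1 @ s)          # ReLU hidden layer
    return W2 @ h, h

def act(s, eps=0.1):
    """Epsilon-greedy action selection (exploration vs. exploitation)."""
    if rng.random() < eps:
        return int(rng.integers(n_actions))   # explore
    return int(q_values(s)[0].argmax())       # exploit

def train_step(s, a, r, s_next, gamma=0.99, lr=0.01):
    """One semi-gradient Q-learning update on the transition (s, a, r, s_next)."""
    global W1, W2
    q, h = q_values(s)
    q_next, _ = q_values(s_next)
    target = r + gamma * q_next.max()    # TD target (held constant in the gradient)
    td_err = target - q[a]
    # Backprop 0.5 * td_err^2 through the two layers.
    grad_q = np.zeros(n_actions)
    grad_q[a] = -td_err
    dh = (W2.T @ grad_q) * (h > 0)       # compute before W2 is updated
    W2 -= lr * np.outer(grad_q, h)
    W1 -= lr * np.outer(dh, s)
    return td_err
```

Because prediction (`act`) and training (`train_step`) run in the same loop, both must fit the agent's compute and power budget, which is the motivation for the FPGA design in the paper.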
Publisher
Association for Computing Machinery (ACM)
References (8 articles)
1. A. Karpathy et al. ConvNetJS deep Q-learning demo. http://cs.stanford.edu/people/karpathy/convnetjs/.
2. A highly scalable Restricted Boltzmann Machine FPGA implementation
Cited by
26 articles.