Towards Better Interpretability in Deep Q-Networks

Published: 2019-07-17 Issue: Volume: 33 Page: 4561-4569
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
language:
Short-container-title:AAAI

Author:

Annasamy Raghuram Mandyam, Sycara Katia

Abstract

Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. While training algorithms continue to improve at a brisk pace, theoretical or empirical studies of what these networks actually learn lag far behind. In this paper, we propose an interpretable neural network architecture for Q-learning that provides a global explanation of the model’s behavior using key-value memories, attention, and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to state-of-the-art deep Q-learning models. However, results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing on out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.
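
The abstract names key-value memories and attention but does not give the architecture's details. As a rough illustration only, the sketch below shows one generic way an attention readout over a per-action key-value memory could yield Q-value estimates together with inspectable attention weights. The function names, the dot-product scoring, the scalar values, and the toy numbers are all assumptions made for this example, not the paper's method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Return (attention weights, weighted sum of values) for one action's memory."""
    # Dot-product similarity between the state embedding and each key (assumed scoring rule).
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    readout = sum(w * v for w, v in zip(weights, values))
    return weights, readout

def q_values(query, memory):
    """memory maps action name -> (keys, scalar values); returns action -> Q estimate."""
    return {a: attend(query, ks, vs)[1] for a, (ks, vs) in memory.items()}

# Toy memory: two actions, two keys each (all numbers are made up).
memory = {
    "left": ([[1.0, 0.0], [0.0, 1.0]], [1.0, -1.0]),
    "right": ([[1.0, 0.0], [0.0, 1.0]], [-1.0, 1.0]),
}
state_embedding = [2.0, 0.0]  # hypothetical stand-in for an encoder output
q = q_values(state_embedding, memory)
best_action = max(q, key=q.get)
```

In a design like this, the attention weights returned by `attend` show which stored keys drove each Q estimate, which is the kind of signal an interpretability analysis could inspect.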

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 19 articles.

1. Customer Acquisition via Explainable Deep Reinforcement Learning;Information Systems Research;2024-05-21

2. A survey on interpretable reinforcement learning;Machine Learning;2024-04-19

3. Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning;IEEE Transactions on Systems, Man, and Cybernetics: Systems;2024-02

4. Customer Acquisition Via Explainable Deep Reinforcement Learning;SSRN Electronic Journal;2024

5. FLARE: Fingerprinting Deep Reinforcement Learning Agents using Universal Adversarial Masks;Annual Computer Security Applications Conference;2023-12-04