Authors:
Peer Nagy, Jan-Peter Calliess, Stefan Zohren
Abstract
We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and use it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximize its trading return in this environment, we use Deep Dueling Double Q-learning with the APEX (asynchronous prioritized experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently of any concrete forecasting algorithm, we study the performance of our approach using synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. We find that the RL agent learns an effective trading strategy for inventory management and order placement that outperforms a heuristic benchmark trading strategy with access to the same signal.
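The synthetic alpha signals described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the price series, forecast horizon, and the choice of Gaussian noise scaled to the signal's own standard deviation are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mid-price path; in the paper, prices come from historic
# NASDAQ order book messages replayed in the ABIDES simulator.
prices = 100.0 + np.cumsum(rng.normal(0.0, 0.01, size=1_000))

horizon = 10  # assumed forecast horizon, in simulation steps
# Forward-looking log returns over the horizon (the "true" signal).
fwd_returns = np.log(prices[horizon:] / prices[:-horizon])

def synthetic_alpha(fwd_returns, noise_level, rng):
    """Perturb true forward returns with Gaussian noise scaled to the
    signal's standard deviation. noise_level=0 gives a perfect signal;
    larger values progressively degrade its predictive power."""
    noise = rng.normal(0.0, noise_level * fwd_returns.std(),
                       size=fwd_returns.shape)
    return fwd_returns + noise

# Inspect how signal quality decays with the noise level.
for noise_level in (0.0, 1.0, 5.0):
    alpha = synthetic_alpha(fwd_returns, noise_level, rng)
    corr = np.corrcoef(alpha, fwd_returns)[0, 1]
    print(f"noise={noise_level:>3}: corr with true returns = {corr:.2f}")
```

Varying `noise_level` lets one measure how the agent's performance depends on forecast quality alone, holding the rest of the environment fixed.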