An online adaptive learning algorithm for optimal trade execution in high-frequency markets
Date
2016
Authors
Hendricks, Dieter
Abstract
Automated algorithmic trade execution is a central problem in modern financial markets,
but finding and navigating optimal trajectories in this system is a non-trivial
task. Many authors have developed exact analytical solutions by making simplifying
assumptions regarding the governing dynamics. For practical feasibility and robustness, however,
a more dynamic approach is needed to capture the spatial and temporal system
complexity and adapt as intraday regimes change.
This thesis aims to consolidate four key ideas: 1) the financial market as a complex
adaptive system, where purposeful agents with varying system visibility collectively and
simultaneously create and perceive their environment as they interact with it; 2) spin
glass models as a tractable formalism to model phenomena in this complex system; 3) the
multivariate Hawkes process as a candidate governing process for limit order book events;
and 4) reinforcement learning as a framework for online, adaptive learning. Combined
with the data and computational challenges of developing an efficient, machine-scale
trading algorithm, we present a feasible scheme which systematically encodes these ideas.
We first determine the efficacy of the proposed learning framework, under the conjecture
of approximate Markovian dynamics in the equity market. We find that a simple lookup
table Q-learning algorithm, with discrete state attributes and discrete actions, is able
to improve post-trade implementation shortfall by adapting a typical static arrival-price
volume trajectory with respect to prevailing market microstructure features streaming
from the limit order book.
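As a rough illustration of the lookup-table Q-learning idea described above, the following minimal sketch shows a single tabular update and an epsilon-greedy action choice. It is not the thesis implementation: the state encoding (bucketed spread and volume), the action set (multipliers applied to the baseline trajectory volume), the reward (negative shortfall for one decision interval) and all parameter values are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

# Hypothetical discrete actions: multipliers applied to the volume that a
# static arrival-price trajectory would trade in the current interval.
ACTIONS = (0.5, 1.0, 1.5)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=1.0):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the greedy action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


# Unvisited (state, action) pairs default to a value of 0.0.
q_table = defaultdict(float)

# Hypothetical states: (spread bucket, volume bucket) read from the order book.
state, next_state = (0, 1), (1, 1)

# Assumed reward: negative shortfall (in basis points) for this interval.
q_update(q_table, state, 1.0, reward=-2.0, next_state=next_state, alpha=0.5)
greedy_action = epsilon_greedy(q_table, state, epsilon=0.0, rng=random.Random(0))
```

A dictionary keyed on (state, action) pairs keeps the table sparse, which suits a small discrete state space of the kind the abstract describes.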
To enumerate a scale-specific state space whilst avoiding the curse of dimensionality, we
propose a novel approach to detect the intraday temporal financial market state at each
decision point in the Q-learning algorithm, inspired by the complex adaptive system
paradigm. A physical analogy to the ferromagnetic Potts model at thermal equilibrium
is used to develop a high-speed maximum likelihood clustering algorithm, appropriate
for measuring critical or near-critical temporal states in the financial system. State
features are studied to extract time-scale-specific state signature vectors, which serve as
low-dimensional state descriptors and enable online state detection.
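The abstract does not state which likelihood the clustering algorithm maximises. One objective commonly paired with the Potts-model analogy for correlation-based clustering is the Giada-Marsili log-likelihood, and the sketch below evaluates it for a given cluster assignment, on the assumption that something of this form is intended. For each cluster s with n_s members and internal correlation sum c_s, the contribution is 0.5 * [ln(n_s / c_s) + (n_s - 1) * ln((n_s^2 - n_s) / (n_s^2 - c_s))]. The function name, the toy correlation matrix and the treatment of degenerate clusters are illustrative choices, and the search over assignments is omitted.

```python
import math


def cluster_log_likelihood(corr, labels):
    """Giada-Marsili log-likelihood of a cluster assignment.

    corr   : square correlation matrix as a list of lists
    labels : labels[i] is the cluster id assigned to object i
    """
    members = {}
    for index, label in enumerate(labels):
        members.setdefault(label, []).append(index)

    total = 0.0
    for indices in members.values():
        n = len(indices)
        if n < 2:
            continue  # singletons contribute nothing
        # Sum of all pairwise correlations inside the cluster (diagonal included).
        c = sum(corr[i][j] for i in indices for j in indices)
        if c <= n:
            continue  # no excess intra-cluster correlation (assumed convention)
        if c >= n * n:
            raise ValueError("perfectly correlated cluster: likelihood diverges")
        total += 0.5 * (
            math.log(n / c) + (n - 1) * math.log((n * n - n) / (n * n - c))
        )
    return total


# Toy example: objects 0 and 1 are correlated, object 2 is independent.
# In a temporal setting the objects would be time periods, not stocks.
toy_corr = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
grouped = cluster_log_likelihood(toy_corr, [0, 0, 1])
singletons = cluster_log_likelihood(toy_corr, [0, 1, 2])
one_cluster = cluster_log_likelihood(toy_corr, [0, 0, 0])
```

In this toy case the assignment that groups only the two correlated objects scores highest, which is the behaviour a maximum likelihood search over assignments would exploit.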
To assess the impact of agent interactions on the system, a multivariate Hawkes process is
used to measure the resiliency of the limit order book with respect to liquidity-demand
events of varying size. By studying the branching ratios associated with key quote
replenishment intensities following trades, we ensure that the limit order book is expected
to be resilient with respect to the maximum permissible trade executed by the agent.
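To make the role of branching ratios concrete, the sketch below assumes exponential excitation kernels, alpha * exp(-beta * t), for which the branching ratio from event type j to event type i is alpha[i][j] / beta[i][j]. A multivariate Hawkes process with such kernels is stationary when the spectral radius of the branching matrix is below one. The kernel form, the two event types and the parameter values are assumptions for illustration; the abstract does not specify them.

```python
def branching_matrix(alpha, beta):
    """Branching ratios alpha[i][j] / beta[i][j] for exponential kernels."""
    return [
        [a / b for a, b in zip(alpha_row, beta_row)]
        for alpha_row, beta_row in zip(alpha, beta)
    ]


def spectral_radius(matrix, iterations=500):
    """Estimate the spectral radius of a non-negative square matrix.

    Power iteration is run on (I + matrix). For a non-negative matrix this
    shifts the dominant eigenvalue by exactly one and avoids the oscillation
    that plain power iteration can show on periodic matrices.
    """
    size = len(matrix)
    vec = [1.0] * size
    growth = 1.0
    for _ in range(iterations):
        nxt = [
            vec[i] + sum(matrix[i][j] * vec[j] for j in range(size))
            for i in range(size)
        ]
        growth = max(nxt)
        vec = [x / growth for x in nxt]
    return growth - 1.0


# Hypothetical two-type system: index 0 = trades, index 1 = quote replenishment.
alpha = [[0.5, 0.2], [0.1, 0.4]]
beta = [[1.0, 1.0], [1.0, 1.0]]

ratios = branching_matrix(alpha, beta)
rho = spectral_radius(ratios)
is_stationary = rho < 1.0
```

The fixed iteration count is a simplification; a production estimate would normally use a linear algebra library and a convergence check.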
Finally, we present a feasible scheme for unsupervised state discovery, state detection
and online learning for high-frequency quantitative trading agents faced with a multi-featured,
asynchronous market data feed. We provide a technique for enumerating the
state space at the scale at which the agent interacts with the system, incorporating the
effects of a live trading agent on limit order book dynamics into the market data feed,
and hence the perceived state evolution.
Description
A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy
in the Faculty of Science, School of Computer Science and Applied Mathematics,
University of the Witwatersrand. October 2016.
Citation
Hendricks, Dieter (2016) An online adaptive learning algorithm for optimal trade execution in high-frequency markets, University of the Witwatersrand, Johannesburg, <http://wiredspace.wits.ac.za/handle/10539/21710>