Trading Strategies via Book Imbalance
Umberto Pesavento joint work with Alexander Lipton and Michael G. Sotiropoulos Algorithmic Trading Quantitative Research Bank of America Merrill Lynch
Financial Engineering Workshops, Cass Business School City University London, 8 October 2014
U. Pesavento, Bank of America Merrill Lynch
1 of 26
8 October 2014
Contents
• Limit order books and algorithmic trading
• Empirical observations
• Modeling the bid and ask queues
• Adding trade arrival dynamics
• Calibration
• Conclusions and current work
• Appendix
• References
Limit order books and algorithmic trading: a top down approach
[Figure: snapshots of a limit order book at successive times t0, ..., t8 (axes: time, volume), listing bid and ask price levels with their queue sizes; a buy trade and a sell trade are marked between snapshots ti and ti+1.]
A divide and conquer approach based on 3 phases:
• calculating a time-volume schedule;
• limit order placement: optimal trading of the allocated shares within the given horizon;
• venue allocation.
Empirical observations: stopping times and book averaging
A separation of time scales:
• queue length updates (≈ 3 s);
• best bid-ask updates (≈ 8 s);
• trade arrivals (≈ 12 s).
[Figure: book update times t0, t1, t2, ... grouped into stopping-time intervals t̃0, t̃1, t̃2, ...; the bid and ask prices (p^b, p^a) and queue sizes (q^b, q^a) recorded at each update are summarized per interval by the averaged quantities p̃, q̃, s̃.]
Empirical observations: mid price movements conditional on book imbalance
Non-martingale properties of prices at small time scales:
• future price variations can be predicted from the imbalance between the bid and ask queues of the order book, $I = (q^b - q^a)/(q^b + q^a)$;
• the effect is statistically significant (about $10^7$ data points for the plot);
• the effect is not large enough to lead to a straightforward arbitrage, but it is significant enough to yield savings in execution costs.
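As a minimal sketch of the imbalance definition, the function below computes I = (q^b − q^a)/(q^b + q^a) from the sizes of the best bid and ask queues. The function name and the sample queue sizes are illustrative assumptions, not taken from the talk.

```python
def book_imbalance(q_bid: float, q_ask: float) -> float:
    """Return I = (q_bid - q_ask) / (q_bid + q_ask), a value in [-1, 1].

    q_bid and q_ask are the sizes of the best bid and best ask queues.
    The name and the error handling are illustrative choices.
    """
    total = q_bid + q_ask
    if total <= 0:
        raise ValueError("at least one queue must be non-empty")
    return (q_bid - q_ask) / total


# Hypothetical queue sizes: a heavier bid queue gives a positive imbalance.
print(book_imbalance(1500, 500))
```

A positive value means the bid queue is the larger one; how that maps to expected price moves is the empirical question the slide addresses.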
Empirical observations: trade arrivals and related stopping times
Changing the stopping time:
• the overall trend in the expected price movements as a function of the book imbalance is the same;
• conditioning on the arrival of trades on a particular side of the book breaks the symmetry in the expected waiting time and price movements.
Modeling the bid and ask queues: replenishment processes and the constant spread approximation
[Figure: bid and ask price levels with their queue sizes for the initial queues and the replenished queues, illustrating depletion and replenishment; labels: ask, bid, ask queue (down-tick), bid queue (up-tick).]
Simple models for queues and price dynamics:
• two correlated diffusion processes represent the bid and ask queues, $(q^b, q^a) = (W^b, W^a)$;
• price moves are associated with queue depletion;
• queues are replenished by drawing from a stationary distribution;
• spreads are assumed constant (not a bad approximation for liquid stocks).
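The queue model above can be explored with a small Monte Carlo sketch. It assumes unit volatilities, an Euler time discretization and hypothetical function and parameter names. It estimates the fraction of paths in which the second queue reaches zero before the first one.

```python
import math
import random


def simulate_first_depletion(x0, y0, rho, n_paths=2000, dt=0.01,
                             max_steps=100_000, seed=0):
    """Estimate the probability that queue y is depleted before queue x.

    The two queues follow driftless Brownian motions with unit volatility
    and correlation rho (an illustrative assumption), discretized with an
    Euler step of size dt. Paths still alive after max_steps are discarded.
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    ortho = math.sqrt(1.0 - rho * rho)
    y_first = 0
    finished = 0
    for _ in range(n_paths):
        x, y = x0, y0
        for _ in range(max_steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rng.gauss(0.0, 1.0)
            x += sqrt_dt * z1
            y += sqrt_dt * (rho * z1 + ortho * z2)  # correlated increment
            if x <= 0.0 or y <= 0.0:
                finished += 1
                if y <= x:  # y is the queue that hit zero (or went lower)
                    y_first += 1
                break
    if finished == 0:
        raise RuntimeError("no path reached a boundary; increase max_steps")
    return y_first / finished


# Symmetric start: by symmetry the estimate should be close to one half.
print(simulate_first_depletion(1.0, 1.0, rho=-0.5))
```

The time step introduces a small discretization bias, so the estimate only approximates the continuous-time hitting probability.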
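Modeling the bid and ask queues: time dependent dynamics
$$P_t + \frac{1}{2} P_{xx} + \frac{1}{2} P_{yy} + \rho_{xy} P_{xy} = 0, \tag{1}$$
where $\rho_{xy}$ is the correlation between the processes governing the depletion and replenishment of the bid and ask queues, which typically takes a negative value in a normal market. A first change of coordinates eliminates the correlation term,
$$\alpha(x, y) = x, \qquad \beta(x, y) = -\frac{\rho_{xy} x - y}{\sqrt{1 - \rho_{xy}^2}}, \tag{2}$$
yielding equation:
$$P_t + \frac{1}{2} P_{\alpha\alpha} + \frac{1}{2} P_{\beta\beta} = 0. \tag{3}$$
And the second to cast the problem in polar coordinates:
$$\begin{cases} \alpha = -r \sin(\varphi - \varpi) \\ \beta = r \cos(\varphi - \varpi) \end{cases} \longleftrightarrow \begin{cases} r = \sqrt{\alpha^2 + \beta^2} \\ \varphi = \varpi + \arctan\left(-\dfrac{\alpha}{\beta}\right) \end{cases} \tag{4}$$
where $\cos \varpi = -\rho_{xy}$, so as to yield the following equation for the hitting probabilities:
$$P_t + \frac{1}{2} \left( P_{rr} + \frac{1}{r} P_r + \frac{1}{r^2} P_{\varphi\varphi} \right) = 0, \tag{5}$$
with the final condition $P(T, T, r, \varphi) = 0$ and boundary conditions:
$$P(t, T, 0, \varphi) = 0, \qquad P(t, T, \infty, \varphi) = 0, \qquad P(t, T, r, 0) = P_0, \qquad P(t, T, r, \varpi) = P_1.$$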
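The two coordinate changes can be checked numerically. The sketch below maps queue sizes (x, y) to the polar coordinates (r, ϕ) of the wedge, assuming β = (y − ρ x)/√(1 − ρ²) and cos ϖ = −ρ. The function name is hypothetical, and `math.atan2` is used in place of the arctangent so that the quadrant is resolved automatically, which is our choice rather than something stated on the slide.

```python
import math


def to_wedge_coordinates(x, y, rho):
    """Map queue sizes (x, y) to polar coordinates (r, phi) and the wedge angle.

    Step 1 removes the correlation: alpha = x, beta = (y - rho*x)/sqrt(1-rho^2).
    Step 2 inverts alpha = -r sin(phi - varpi), beta = r cos(phi - varpi),
    with cos(varpi) = -rho.
    """
    alpha = x
    beta = (y - rho * x) / math.sqrt(1.0 - rho * rho)
    varpi = math.acos(-rho)
    r = math.hypot(alpha, beta)
    phi = varpi + math.atan2(-alpha, beta)
    return r, phi, varpi


# The axis y = 0 maps to phi = 0 and the axis x = 0 maps to phi = varpi.
print(to_wedge_coordinates(1.0, 0.0, -0.5))
print(to_wedge_coordinates(0.0, 1.0, -0.5))
```

With this map the positive quadrant becomes the wedge 0 < ϕ < ϖ, which is where the boundary values P0 and P1 are imposed.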
Modeling the bid and ask queues: Green's function formulation
We seek the Green's function of equation (5) by separating its radial and angular components:
$$G(\tau, r', \varphi') = g(\tau, r') f(\varphi'). \tag{6}$$
This leads to two equations coupled by the positive constant $\Lambda^2$:
$$g_\tau = \frac{1}{2} \left( g_{r'r'} + \frac{1}{r'} g_{r'} - \frac{\Lambda^2}{r'^2} g \right), \tag{7}$$
$$f_{\varphi'\varphi'} = -\Lambda^2 f. \tag{8}$$
The radial part is solved by:
$$g(\tau, r') = \frac{e^{-\frac{r'^2 + r_0^2}{2\tau}}}{\tau} I_\Lambda\left( \frac{r' r_0}{\tau} \right), \tag{9}$$
where $I_\Lambda(\xi)$ is the modified Bessel function of the first kind corresponding to $\Lambda$. After applying the boundary conditions on the angular part of the equation, the final formula for the Green's function is:
$$G(\tau, r_0, r', \varphi_0, \varphi') = \frac{2 e^{-\frac{r'^2 + r_0^2}{2\tau}}}{\varpi \tau} \sum_{n=1}^{\infty} I_{\nu_n}\left( \frac{r' r_0}{\tau} \right) \sin(\nu_n \varphi') \sin(\nu_n \varphi_0), \tag{10}$$
where $\nu_n = n\pi/\varpi$. Finally, we integrate the equation above to obtain the hitting probability of an up-tick (or down-tick) conditional on the initial condition of the queue:
$$P(t, T, r_0, \varphi_0) = -\frac{1}{2} \int_t^T \int_0^\infty \frac{1}{r'} G_{\varphi'}(t' - t, r', \varpi) \, dr' \, dt'. \tag{11}$$
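Modeling the bid and ask queues: infinite time limit
By writing out the explicit form for the Green's function we obtain:
$$P(0, r_0, \phi_0) = \sum_{n=1}^{\infty} (-1)^{n+1} \nu_n \sin(\nu_n \phi_0) \int_0^T \int_0^\infty \frac{e^{-\frac{r^2 + r_0^2}{2t}}}{\varpi t r} I_{\nu_n}\left( \frac{r r_0}{t} \right) dt \, dr. \tag{12}$$
Taking $T \to \infty$, we reverse the order of integration and evaluate the time integral using the following expression:
$$\int_0^\infty \frac{e^{-\frac{r^2 + r_0^2}{2t}}}{\varpi t r} I_{\nu_n}\left( \frac{r r_0}{t} \right) dt = \frac{1}{\varpi \nu_n r \left( \sqrt{s^2 - 1} + s \right)^{\nu_n}}, \tag{13}$$
where $s = (r^2 + r_0^2)/(2 r r_0)$. Since $\sqrt{s^2 - 1} + s = \max(r/r_0, r_0/r)$, we can then integrate along the radial component,
$$\int_0^\infty \frac{dr}{\varpi \nu_n r \left[ \max\left( \frac{r}{r_0}, \frac{r_0}{r} \right) \right]^{\nu_n}} = \frac{1}{\varpi \nu_n r_0^{\nu_n}} \int_0^{r_0} r^{\nu_n - 1} dr + \frac{r_0^{\nu_n}}{\varpi \nu_n} \int_{r_0}^\infty r^{-\nu_n - 1} dr = \frac{2}{\varpi \nu_n^2}. \tag{14}$$
Finally, we sum the series to obtain:
$$P(0, r_0, \phi_0) = \frac{2}{\pi} \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sin\left( \frac{\pi n \phi_0}{\varpi} \right) = \frac{\phi_0}{\varpi}. \tag{15}$$
As expected, the result depends only on the angular distance from the barrier.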
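The final summation can be sanity-checked numerically. The sketch below, with a hypothetical function name and an arbitrary truncation level, evaluates a partial sum of the sine series (2/π) Σ (−1)^(n+1) sin(π n φ0/ϖ)/n and compares it with φ0/ϖ.

```python
import math


def hitting_probability_series(phi0, varpi, n_terms=200_000):
    """Partial sum of (2/pi) * sum_{n>=1} (-1)^(n+1) sin(pi*n*phi0/varpi) / n.

    The truncation level n_terms is an arbitrary illustrative choice; the
    series converges slowly, at a rate of roughly 1/n_terms.
    """
    total = 0.0
    for n in range(1, n_terms + 1):
        sign = 1.0 if n % 2 == 1 else -1.0
        total += sign * math.sin(math.pi * n * phi0 / varpi) / n
    return 2.0 / math.pi * total


phi0, varpi = 0.5, 2.0 * math.pi / 3.0
print(hitting_probability_series(phi0, varpi), phi0 / varpi)
```

The two printed numbers should agree to several decimal places for any 0 < φ0 < ϖ; close to φ0 = ϖ the partial sums converge more slowly.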
Modeling the bid and ask queues: time independent formulation I
We now consider the time-independent problem from the outset:
$$\frac{1}{2} P_{xx} + \frac{1}{2} P_{yy} + \rho_{xy} P_{xy} = 0, \tag{16}$$
$$P(x, 0) = 1, \qquad P(0, y) = 0. \tag{17}$$
Again, we perform a change of coordinates to eliminate the correlation term,
$$\alpha(x, y) = x, \qquad \beta(x, y) = \frac{-\rho_{xy} x + y}{\sqrt{1 - \rho_{xy}^2}}, \tag{18}$$
yielding equation:
$$P_{\alpha\alpha} + P_{\beta\beta} = 0. \tag{19}$$
Modeling the bid and ask queues: time independent formulation II
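We then perform a second transformation to cast the modified problem in polar coordinates:
$$\begin{cases} \alpha = r \sin(\varphi) \\ \beta = r \cos(\varphi) \end{cases} \longleftrightarrow \begin{cases} r = \sqrt{\alpha^2 + \beta^2} \\ \varphi = \arctan\left( \dfrac{\alpha}{\beta} \right) \end{cases} \tag{20}$$
where $\cos \varpi = -\rho_{xy}$. Then the equation becomes
$$P_{\varphi\varphi}(\varphi) = 0, \tag{21}$$
with boundary conditions $P(0) = 0$ and $P(\varpi) = 1$. In this coordinate set the solution is straightforward, $P(\varphi) = \varphi/\varpi$, which in the original set of coordinates has the form:
$$P(x, y) = \frac{1}{2} \left[ 1 - \frac{\arctan\left( \sqrt{\frac{1 + \rho_{xy}}{1 - \rho_{xy}}} \, \frac{y - x}{y + x} \right)}{\arctan\left( \sqrt{\frac{1 + \rho_{xy}}{1 - \rho_{xy}}} \right)} \right]. \tag{22}$$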
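A short sketch of the closed-form solution follows. It evaluates P(x, y) = ½[1 − arctan(k (y − x)/(y + x))/arctan(k)] with k = √((1 + ρ)/(1 − ρ)), and compares it with the angular form ϕ/ϖ. The function names are hypothetical, and `math.atan2` replaces the arctangent of α/β so that the quadrant is handled automatically.

```python
import math


def hitting_probability(x, y, rho):
    """Closed-form P(x, y): equals 1 on y = 0 and 0 on x = 0."""
    k = math.sqrt((1.0 + rho) / (1.0 - rho))
    return 0.5 * (1.0 - math.atan(k * (y - x) / (y + x)) / math.atan(k))


def hitting_probability_angular(x, y, rho):
    """The same quantity written as phi / varpi in the decorrelated plane."""
    alpha = x
    beta = (y - rho * x) / math.sqrt(1.0 - rho * rho)
    phi = math.atan2(alpha, beta)
    varpi = math.acos(-rho)
    return phi / varpi


# Hypothetical queue sizes and correlation, for illustration only.
print(hitting_probability(3.0, 1.0, -0.5))
print(hitting_probability_angular(3.0, 1.0, -0.5))
```

Both expressions depend on the queue sizes only through the ratio of x to y, in line with the purely angular dependence of the solution.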
Adding trade arrival dynamics: the trade arrival process
In analogy with the two Brownian processes representing the bid and ask queues, we add a third (unobservable) process to model trade arrival on the near side of the book:
$$(dq^b, dq^a, d\phi) = (dw^b, dw^a, dw^\phi)$$
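Adding trade arrival dynamics: handling correlation
$$\frac{1}{2} P_{xx} + \frac{1}{2} P_{yy} + \frac{1}{2} P_{zz} + \rho_{xy} P_{xy} + \rho_{xz} P_{xz} + \rho_{yz} P_{yz} = 0. \tag{23}$$
As in two dimensions, it is possible to eliminate the correlation terms,
$$\begin{aligned} \alpha(x, y, z) &= x, \\ \beta(x, y, z) &= \frac{-\rho_{xy} x + y}{\sqrt{1 - \rho_{xy}^2}}, \\ \gamma(x, y, z) &= \frac{(\rho_{xy}\rho_{yz} - \rho_{xz}) x + (\rho_{xy}\rho_{xz} - \rho_{yz}) y + (1 - \rho_{xy}^2) z}{\sqrt{1 - \rho_{xy}^2} \, \sqrt{1 - \rho_{xy}^2 - \rho_{xz}^2 - \rho_{yz}^2 + 2 \rho_{xy}\rho_{xz}\rho_{yz}}}, \end{aligned} \tag{24}$$
to obtain:
$$P_{\alpha\alpha} + P_{\beta\beta} + P_{\gamma\gamma} = 0. \tag{25}$$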
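One way to check the three-dimensional change of variables is to verify that it turns the correlation matrix of (x, y, z) into the identity. The sketch below builds the linear map from the coefficients of α, β and γ and computes M Σ Mᵀ; the function name and the sample correlations are illustrative assumptions.

```python
import math


def transformed_covariance(rho_xy, rho_xz, rho_yz):
    """Return M * Sigma * M^T, where M maps (x, y, z) to (alpha, beta, gamma).

    Sigma is the correlation matrix of the three unit-variance processes.
    If the transformation removes all correlation terms, the result is the
    3x3 identity matrix.
    """
    a, b, c = rho_xy, rho_xz, rho_yz
    det = 1.0 - a * a - b * b - c * c + 2.0 * a * b * c  # det of Sigma
    s = math.sqrt(1.0 - a * a)
    norm = s * math.sqrt(det)
    m = [
        [1.0, 0.0, 0.0],
        [-a / s, 1.0 / s, 0.0],
        [(a * c - b) / norm, (a * b - c) / norm, (1.0 - a * a) / norm],
    ]
    sigma = [[1.0, a, b], [a, 1.0, c], [b, c, 1.0]]
    # Plain nested loops keep the sketch free of third-party dependencies.
    ms = [[sum(m[i][k] * sigma[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    return [[sum(ms[i][k] * m[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Hypothetical correlations; the printed matrix should be close to identity.
for row in transformed_covariance(-0.5, 0.3, -0.2):
    print([round(v, 12) for v in row])
```

The square root in the normalization of γ requires the correlation matrix to be positive definite, which is the case for any admissible set of correlations.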
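Adding trade arrival dynamics: changing the domain
Again, we can write the exit probability problem in a simpler form by changing the computational domain $\Omega$. In the spherical coordinates
$$\alpha = r \sin\theta \sin\varphi, \qquad \beta = r \sin\theta \cos\varphi, \qquad \gamma = r \cos\theta,$$
the problem reads:
$$\frac{1}{\sin^2\theta} P_{\phi\phi}(\phi, \theta) + \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \, P_\theta(\phi, \theta) \right) = 0, \tag{26}$$
$$P(0, \theta) = 0, \qquad P(\varpi, \theta) = 0, \qquad P(\phi, \Theta(\phi)) = 1. \tag{27}$$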
Adding trade arrival dynamics: semi analytical solutions I
We introduce a new variable $\zeta = \ln \tan(\theta/2)$, keeping $\varphi$ unchanged, and rewrite the exit problem again as [Lipton 2013]:
$$P_{\phi\phi}(\phi, \zeta) + P_{\zeta\zeta}(\phi, \zeta) = 0. \tag{28}$$
The computational domain is now a semi-infinite strip with curvilinear boundary
$$\zeta = Z(\phi) = \ln \tan\left( \frac{\Theta(\phi)}{2} \right). \tag{29}$$
We look for the solution of the Dirichlet problem for the Laplace equation in the form
$$P(\varphi, \zeta) = \sum_{n=1}^{\infty} c_n \sin(k_n \varphi) \, e^{k_n \zeta}, \qquad k_n = \frac{\pi n}{\varpi}, \tag{30}$$
where the values of the expansion coefficients $c_n$ can be determined by enforcing the boundary condition
$$P(\varphi, \Theta(\varphi)) = 1. \tag{31}$$
Adding trade arrival dynamics: semi analytical solutions II
In order to compute the coefficients, we introduce the integrals
$$J_{mn} = \int_0^{\varpi} \sin(k_m \varphi) \sin(k_n \varphi) \, e^{(k_n + k_m) Z(\varphi)} \, d\varphi, \tag{32}$$
$$I_m = \int_0^{\varpi} \sin(k_m \varphi) \, e^{k_m Z(\varphi)} \, d\varphi. \tag{33}$$
Then the boundary condition (31) becomes
$$\sum_n J_{mn} c_n = I_m, \tag{34}$$
and $c_n$ can be computed by matrix inversion as $c = J^{-1} I$.
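The linear system for the coefficients can be sketched as follows. The sketch assumes that each expansion term carries a factor e^{k_n ζ}, as the exponentials in J_mn and I_m suggest, and that the boundary Z(ϕ) is supplied by the caller. The function names, the midpoint quadrature, the truncation to a few modes and the small Gaussian-elimination solver are all illustrative choices, not the authors' implementation.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def expansion_coefficients(boundary_z, varpi, n_modes=4, n_quad=2000):
    """Build J_mn and I_m by midpoint quadrature and solve J c = I.

    boundary_z is a caller-supplied function phi -> Z(phi).
    """
    k = [math.pi * n / varpi for n in range(1, n_modes + 1)]
    h = varpi / n_quad
    j_matrix = [[0.0] * n_modes for _ in range(n_modes)]
    i_vector = [0.0] * n_modes
    for step in range(n_quad):
        phi = (step + 0.5) * h
        z = boundary_z(phi)
        basis = [math.sin(kn * phi) * math.exp(kn * z) for kn in k]
        for row in range(n_modes):
            i_vector[row] += basis[row] * h
            for col in range(n_modes):
                j_matrix[row][col] += basis[row] * basis[col] * h
    return solve_linear_system(j_matrix, i_vector)


# Flat boundary Z = 0 as a test case: the result is the sine series of 1.
print(expansion_coefficients(lambda phi: 0.0, 2.0 * math.pi / 3.0))
```

For a flat boundary the system is diagonal and the coefficients reduce to the Fourier sine coefficients of the constant 1. For a curved boundary the matrix can become ill-conditioned as more modes are added, so the truncation level needs care.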
Calibration: putting everything together
Book event probabilities as a function of the bid-ask imbalance:
• left region of the plot: price improvement is likely, get ready to reprice;
• central region of the plot: a trade on the near side is likely to anticipate an adverse price move, stay posted;
• right region of the plot: consider crossing the spread.
Calibration: the role of correlation
• Correlation is the main effect responsible for the symmetry breaking in the evolution of the price expectation as a function of imbalance.
• It can also explain a big part of the adverse selection effect which we observe when posting orders in a limit order book.
• The model can capture the main features of symmetry breaking in the trade arrival process.
Conclusions and current work: from trade arrival rates to empirical fill probabilities
Empirical fill probabilities, learning from our own execution data:
• real data tends to be noisy, but it displays consistent trends;
• parametric forms of the fill probability as a function of the limit order placement $x$ can be estimated, e.g. $P(x) = 1 - e^{-\beta x}$.
Conclusions and current work: from empirical fill probabilities to optimization schedules, a dynamic programming approach
Given an approximate functional form for the fill probability, we can solve the recursive optimization problem given by:
$$E[P_i] = \min_x \Big( (1 - p(x)) \, E[P_{i+1}] + p(x) \, x \Big), \tag{35}$$
where $p(x)$ is the fill probability of a limit order with a limit price of $x$.
• what is the optimal placement of a limit order? (blue line)
• what is the expected fill price? (red line)
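The recursion can be sketched as a backward induction over a price grid. The sketch assumes the exponential fill probability p(x) = 1 − e^{−βx}, a hypothetical terminal price paid if the order is never filled (for instance by crossing the spread at the end of the horizon), and a grid search in place of an exact minimization. All names and numerical values are illustrative.

```python
import math


def fill_probability(x, beta):
    """Assumed parametric fill probability p(x) = 1 - exp(-beta * x)."""
    return 1.0 - math.exp(-beta * x)


def backward_induction(n_steps, terminal_price, beta, price_grid):
    """Solve E[P_i] = min_x ((1 - p(x)) E[P_{i+1}] + p(x) x) backwards in time.

    terminal_price is the (hypothetical) price paid after the last step if
    the order is still unfilled. Returns the optimal placements and expected
    fill prices, ordered from the first step to the last.
    """
    expected_price = terminal_price
    placements, values = [], []
    for _ in range(n_steps):
        best_x, best_value = None, float("inf")
        for x in price_grid:
            p = fill_probability(x, beta)
            value = (1.0 - p) * expected_price + p * x
            if value < best_value:
                best_x, best_value = x, value
        placements.append(best_x)
        values.append(best_value)
        expected_price = best_value  # becomes E[P_{i+1}] for the earlier step
    placements.reverse()
    values.reverse()
    return placements, values


# Illustrative parameters: prices in ticks relative to an arbitrary reference.
grid = [0.1 * i for i in range(51)]
placements, values = backward_induction(5, terminal_price=5.0, beta=0.5,
                                        price_grid=grid)
print(placements)
print(values)
```

Because x = 0 has zero fill probability in this parametrization and is part of the grid, the expected fill price can only improve as more steps remain, which gives a simple consistency check on the output.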
Conclusions and current work: optimizing thresholds
Going one step further, parameter selection and price slippage estimation as a function of the slice size:
• as expected, larger slices produce a larger slippage;
• the optimal trade-off between waiting and crossing the spread depends on the size of the slice to be executed;
• an optimal ridge in the parameter space can be calculated under certain assumptions.
Appendix: order flow and impact
We can also attempt to predict price movements and arrival times by conditioning on local measurements of prevailing order flows rather than book imbalance.
Appendix: a time dependent slice of the problem
Average time evolution of the mid-price across a trade event:
• before trade arrival, prices tend to drift towards the near side of the book;
• at trade arrival, impact dominates and prices move towards the far side of the book.
Appendix: queue depletion and replenishment
Depletion of the bid and ask queues across bid up-tick and down-tick price movements:
• down-tick move: the initial queue size is thin while the next layer is fully formed;
• up-tick move: the previous layer is fully formed and the next queue distribution is thin;
• the ask queue is statistically unaffected.
References
[1] A. Lipton, U. Pesavento, and M. Sotiropoulos. Risk, April 2014.
[2] R. Almgren, C. Thum, H. L. Hauptmann, and H. Li. Equity market impact. Risk, 18:57, 2005.
[3] M. Avellaneda and S. Stoikov. High-frequency trading in a limit order book. Quantitative Finance, 8:217–224, 2008.
[4] J.-P. Bouchaud, J. D. Farmer, and F. Lillo. How markets slowly digest changes in supply and demand. In T. Hens and K. Schenk-Hoppe, editors, Handbook of Financial Markets: Dynamics and Evolution.
[5] J.-P. Bouchaud, M. Mézard, and M. Potters. Statistical properties of stock order books: empirical results and models. Quantitative Finance, 2:251–256, 2002.
[6] R. Cont and A. de Larrard. Order book dynamics in liquid markets: limit theorems and diffusion approximations. Working paper, 2012.
[7] R. F. Engle. The econometrics of ultra-high frequency data. Econometrica, 68:1–22, 2000.
[8] J. Hasbrouck. Measuring the information content of stock trades. Journal of Finance, 46:179–207, 1991.
[9] A. Lipton and I. Savescu. CDSs, CVA and DVA - a structural approach. Risk, 26(4), 2013.
[10] R. Cont, S. Stoikov, and R. Talreja. A stochastic model for order book dynamics. Operations Research, 58(3):549–563, 2010.
[11] E. Smith, J. D. Farmer, L. Gillemot, and S. Krishnamurthy. Statistical theory of the continuous double auction. Quantitative Finance, 3:481–514, 2003.