Global Trading Blog

Classic pairs trading strategies have suffered deteriorating returns over time. Can a research pipeline that facilitates the identification and selection of ETF pairs make pairs trading viable again? This post investigates such a pipeline.

The problem: pairs wander away

Source: Ernie Chan, Algorithmic Trading: Winning Strategies and Their Rationale, Wiley, May 28, 2013, chapter 4.

Pairs trading is a classic arbitrage strategy on securities in the same industry (for example, Coke and Pepsi) in which the trader buys one security and sells the other when the spread between them widens, then closes the positions when the spread narrows again.

In his book Algorithmic Trading, Ernie Chan notes that pairs trading of stocks has become more difficult over time. Two stocks may cointegrate in-sample, but they often wander apart out-of-sample as the fortunes of the respective companies diverge. However, Chan finds more fertile ground for pairs trading among ETFs.

I backtest a pairs trading strategy using an ETF pair from Chan's book, GLD and GDX (the Gold ETF and Gold Miners ETF), and find that this pair was profitable out-of-sample for 2 years after Chan's book was published but thereafter became unprofitable.

Based on the tendency of pairs to eventually stop cointegrating, I hypothesize that successful pairs trading requires a robust pipeline for continually identifying and selecting new pairs to trade. I attempt to construct such a pipeline using a 3-step process:

  1. For a universe of all liquid US ETFs, I test all possible pairs for cointegration using the Johansen test in an in-sample window.
  2. I run in-sample backtests on all cointegrating pairs and select the 5 best performing pairs.
  3. I run an out-of-sample backtest on a portfolio of the 5 best performing pairs.

Pairs trading strategy

I create a Moonshot pairs trading strategy that replicates the trading rules in Chan's book. A few code snippets are highlighted here. The strategy calculates daily hedge ratios using the Johansen test:

from statsmodels.tsa.vector_ar.vecm import coint_johansen

# The second and third parameters indicate constant term, with a lag of 1.
# See Chan, Algorithmic Trading, chapter 2.
result = coint_johansen(pair_prices, 0, 1)

# The eigenvectors are stored as columns, ordered by eigenvalue, so the
# first column contains the best weights
hedge_ratios = list(result.evec[:, 0])

The timing of entries and exits is based on Bollinger Bands set one standard deviation away from the spread's moving average:

# Compute spread and Bollinger Bands
spreads = (pair_prices * hedge_ratios).sum(axis=1)
means = spreads.ffill().rolling(20).mean()
stds = spreads.ffill().rolling(20).std()
upper_bands = means + stds
lower_bands = means - stds

# Long (short) the spread when it crosses below (above) the lower (upper)
# band, then exit when it crosses the mean
long_entries = spreads < lower_bands
long_exits = spreads >= means
short_entries = spreads > upper_bands
short_exits = spreads <= means

See the code repository link at the end of the article for the full source code.
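
The snippet above stops at the boolean entry and exit conditions. As a minimal sketch of how such conditions could be turned into a held position, the following pure-Python function applies the same rules bar by bar. The function name `bollinger_positions` and the sample spread values are hypothetical, and this is not the Moonshot implementation from the repository; it only assumes the 20-bar window and one-standard-deviation bands described above.

```python
from statistics import mean, stdev


def bollinger_positions(spreads, window=20, num_std=1.0):
    """Return one position per bar: 1 = long the spread, -1 = short, 0 = flat."""
    positions = []
    position = 0
    for i, spread in enumerate(spreads):
        if i + 1 < window:
            # Not enough history to compute the bands yet
            positions.append(0)
            continue
        recent = spreads[i + 1 - window : i + 1]
        mid = mean(recent)
        band = num_std * stdev(recent)  # sample standard deviation
        # Exit when the spread crosses back over the moving average
        if position == 1 and spread >= mid:
            position = 0
        elif position == -1 and spread <= mid:
            position = 0
        # Enter when the spread moves outside a band
        if position == 0:
            if spread < mid - band:
                position = 1
            elif spread > mid + band:
                position = -1
        positions.append(position)
    return positions


# Hypothetical spread series: 20 quiet bars, a sharp drop, then a recovery
spreads = [0.0, 1.0] * 10 + [-5.0, -5.0, 0.5]
positions = bollinger_positions(spreads)
```

In this example the position stays flat during the quiet bars, goes long when the spread drops below the lower band, and returns to flat once the spread recovers above its moving average.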

Backtest using GLD and GDX

I backtest the pairs trading strategy using GLD and GDX, the Gold and Gold Miners ETFs, which Chan discusses in his book. Chan's book was published in 2013, so this provides an out-of-sample evaluation of the trading strategy.

The strategy was profitable for the first two years following publication, but was unprofitable thereafter.

Pairs selection pipeline with US ETFs

Next, in a Jupyter notebook, I construct a pipeline for identifying and selecting new pairs.

Step 1: Filter by dollar volume

I begin by collecting 1-day historical bars for all US ETFs. I then filter the universe to include only liquid ETFs, defined here as those with average daily dollar volume above $10 million. This results in a universe of 110 ETFs.
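
A liquidity filter of this kind can be sketched as follows. This is an illustration only: the tickers and bars are made up, the helper names are hypothetical, and dollar volume is assumed to be the daily close multiplied by the daily share volume.

```python
from statistics import mean

MIN_AVG_DOLLAR_VOLUME = 10_000_000  # the $10 million threshold from the text


def average_dollar_volume(bars):
    """bars is a list of (close, volume) tuples, one per trading day."""
    return mean(close * volume for close, volume in bars)


def liquid_tickers(bars_by_ticker, threshold=MIN_AVG_DOLLAR_VOLUME):
    """Return the tickers whose average daily dollar volume exceeds the threshold."""
    return sorted(
        ticker
        for ticker, bars in bars_by_ticker.items()
        if average_dollar_volume(bars) > threshold
    )


# Hypothetical daily bars for two made-up tickers
bars_by_ticker = {
    "AAA": [(100.0, 200_000), (102.0, 150_000)],  # roughly $17.7M per day
    "BBB": [(20.0, 100_000), (21.0, 90_000)],     # roughly $1.9M per day
}
liquid = liquid_tickers(bars_by_ticker)
```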

Step 2: In-sample cointegration test

110 ETFs can be combined into nearly 6,000 possible pairs (110 × 109 / 2 = 5,995). I test all of these pairs for cointegration using the Johansen test over 2011. This results in 110 pairs that cointegrate with a confidence level of at least 90%.
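
For reference, the pair count can be reproduced with the standard library. The ticker names below are placeholders, not the actual ETF universe.

```python
from itertools import combinations
from math import comb

# Placeholder symbols standing in for the 110 liquid ETFs
tickers = [f"ETF{i:03d}" for i in range(110)]

# Every unordered pair of distinct tickers
pairs = list(combinations(tickers, 2))
n_pairs = len(pairs)  # 5995, the same as comb(110, 2)
```

Each element of `pairs` is a 2-tuple of tickers, which is a convenient unit to pass to a per-pair cointegration test.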

Step 3: In-sample backtests

Next, I run Moonshot backtests on all 110 cointegrating pairs for the period 2012-2015, and select the 5 best performing pairs, shown below:

| Ticker1 | Ticker2 | Sharpe Ratio 2012-2015 |
| --- | --- | --- |
| USO - UNITED STATES OIL FUND LP | DUG - PROSHARES ULTRASHORT OIL&GAS | 1.03 |
| LQD - ISHARES IBOXX INVESTMENT GRA | QID - PROSHARES ULTRASHORT QQQ | 0.96 |
| ICF - ISHARES COHEN & STEERS REIT | FAZ - DIREXION DAILY FIN BEAR 3X | 0.71 |
| VNQ - VANGUARD REAL ESTATE ETF | FAZ - DIREXION DAILY FIN BEAR 3X | 0.70 |
| XLI - INDUSTRIAL SELECT SECT SPDR | IWR - ISHARES RUSSELL MID-CAP ETF | 0.69 |

Some of the pairs are intuitive if unexpected (LQD and QID) while others don't seem to make intuitive sense (USO and DUG). (Note that the presence of several leveraged ETFs in the top 5 could be problematic as these can be expensive to borrow for shorting, so it might be preferable to exclude them.)
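
The ranking step in this section can be sketched as follows. The annualization factor of 252 trading days and the zero risk-free rate are assumptions, the function names are hypothetical, and the returns are made up; the Sharpe ratios in the article come from Moonshot backtests, whose exact calculation may differ.

```python
from math import sqrt
from statistics import mean, stdev

TRADING_DAYS_PER_YEAR = 252  # assumption used to annualize daily returns


def annualized_sharpe(daily_returns):
    """Annualized Sharpe ratio of daily returns, assuming a zero risk-free rate."""
    volatility = stdev(daily_returns)
    if volatility == 0:
        return 0.0
    return mean(daily_returns) / volatility * sqrt(TRADING_DAYS_PER_YEAR)


def top_pairs(returns_by_pair, n=5):
    """Return the n pairs with the highest in-sample Sharpe ratio, best first."""
    ranked = sorted(
        returns_by_pair,
        key=lambda pair: annualized_sharpe(returns_by_pair[pair]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical daily backtest returns for three made-up pairs
returns_by_pair = {
    ("AAA", "BBB"): [0.01, 0.02, 0.01, 0.02],
    ("CCC", "DDD"): [0.01, -0.01, 0.01, -0.01],
    ("EEE", "FFF"): [-0.01, -0.02, -0.01, -0.02],
}
best = top_pairs(returns_by_pair, n=2)
```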

Out-of-sample backtest of best performing pairs

Finally, I run an out-of-sample backtest on the 5 best performing pairs.

The aggregate portfolio of pairs performs well in the first two years out of sample, but then performance deteriorates, mirroring the out-of-sample equity curve of GLD/GDX. This may indicate that the out-of-sample lifespan of a well-performing pair is at most two years, implying a need to re-run cointegration tests and in-sample backtests every year or two in order to update the portfolio of best pairs.
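
One way to picture such a refresh cycle is a walk-forward schedule. The sketch below is an assumption-laden illustration, not part of the original research: the function name is hypothetical, and the default window lengths simply mirror the article (a one-year cointegration test, a four-year in-sample backtest, and a two-year out-of-sample period).

```python
def walk_forward_windows(first_year, last_year, coint_years=1,
                         backtest_years=4, live_years=2):
    """Return a list of research cycles, each a dict of inclusive year ranges."""
    if min(coint_years, backtest_years, live_years) < 1:
        raise ValueError("all window lengths must be at least 1 year")
    windows = []
    start = first_year
    while True:
        coint = (start, start + coint_years - 1)
        in_sample = (coint[1] + 1, coint[1] + backtest_years)
        out_of_sample = (in_sample[1] + 1, in_sample[1] + live_years)
        if out_of_sample[1] > last_year:
            break
        windows.append({
            "cointegration": coint,
            "in_sample": in_sample,
            "out_of_sample": out_of_sample,
        })
        # Re-run the whole pipeline once the out-of-sample period has elapsed
        start += live_years
    return windows


schedule = walk_forward_windows(2011, 2019)
```

With these defaults, the first cycle reproduces the windows used in this post (2011, 2012-2015, then 2016-2017), and the second cycle shifts every window forward by two years.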

Conclusion

Pairs that cointegrate and perform well in-sample cannot be expected to perform well out-of-sample indefinitely. However, it may be reasonable to expect a short period of out-of-sample cointegration, lasting a year or two, before deterioration sets in. This implies that successful pairs trading requires a robust research pipeline for continually identifying and selecting new pairs to replace old pairs that stop working.

Explore this research on your own

This research was created with QuantRocket. Clone the pairs-pipeline repository to get the code and perform your own analysis.

quantrocket codeload clone 'pairs-pipeline'
