End-of-Day Analysis with Alphalens

Alphalens is an open-source performance analysis library that pairs well with the Pipeline API. In this notebook, we use Alphalens to analyze whether our momentum factor predicts forward returns.

Using Alphalens makes sense when you believe your end-of-day Pipeline rules have alpha. If your Pipeline rules instead perform only a basic screen and the alpha comes entirely from your intraday trading rules, it might make more sense to omit this step.

Let's re-run our pipeline from the previous notebook:

In [ ]:
from zipline.pipeline import Pipeline
from zipline.pipeline.factors import AverageDollarVolume, Returns
from zipline.research import run_pipeline

pipeline = Pipeline(
    columns={
        "1y_returns": Returns(window_length=252),
    },
    # Keep only assets whose 30-day average dollar volume exceeds 10 million
    screen=AverageDollarVolume(window_length=30) > 10e6,
)

factors = run_pipeline(pipeline, start_date="2017-01-01", end_date="2019-01-01")

To see if our momentum factor is predictive of forward returns, we use the factor data to request forward returns for the corresponding assets and dates, then format the factor and returns data for use with Alphalens:

In [2]:
from zipline.research import get_forward_returns
import alphalens as al

# Get forward returns (this provides forward 1-day returns by default)
forward_returns = get_forward_returns(factors)

# Format the data for Alphalens
al_data = al.utils.get_clean_factor(
    factors["1y_returns"],
    forward_returns,
    quantiles=2,  # For a very small sample universe, you might only want 2 quantiles
)
Dropped 0.2% entries from factor data: 0.2% in forward returns computation and 0.0% in binning phase (set max_loss=0 to see potentially suppressed Exceptions).
max_loss is 35.0%, not exceeded: OK!
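
To make the `quantiles` argument concrete, here is a minimal sketch of per-date quantile bucketing in plain pandas. It is an illustration under stated assumptions, not Alphalens's actual implementation: the dates, asset names, factor values, and the helper `bucket_by_date` are all made up for the example.

```python
import pandas as pd

# Toy factor data indexed by (date, asset), similar in shape to pipeline output.
# All dates, asset names, and values here are hypothetical.
dates = pd.to_datetime(["2018-01-02", "2018-01-03"])
assets = ["A", "B", "C", "D"]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
factor = pd.Series(
    [0.10, -0.20, 0.35, 0.05, 0.40, 0.15, -0.05, 0.20],
    index=index,
    name="factor",
)


def bucket_by_date(factor, quantiles=2):
    """Assign each asset to a quantile (1 = lowest factor values) within its own date."""
    buckets = factor.groupby(level="date").transform(
        # qcut with labels=False returns 0-based bin numbers, so shift to 1-based
        lambda values: pd.qcut(values, quantiles, labels=False) + 1
    )
    return buckets.astype(int).rename("factor_quantile")


factor_quantile = bucket_by_date(factor, quantiles=2)
print(factor_quantile)
```

Because the buckets are formed separately on each date, an asset's quantile reflects its rank relative to the other assets on that day, not the absolute level of its factor value.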

Then we create a tear sheet to evaluate the factor. For a predictive factor, the higher quantiles should earn higher forward returns than the lower quantiles.

In [3]:
from alphalens.tears import create_full_tear_sheet
create_full_tear_sheet(al_data)
Quantiles Statistics
                       min       max      mean       std  count    count %
factor_quantile
1                -0.508681  0.478520  0.102397  0.136901   1864  55.311573
2                -0.140415  4.446449  0.715644  0.897937   1506  44.688427
Returns Analysis
                                                    1D
Ann. alpha                                       0.088
beta                                             0.309
Mean Period Wise Return Top Quantile (bps)       1.597
Mean Period Wise Return Bottom Quantile (bps)   -1.075
Mean Period Wise Spread (bps)                    2.672
Information Analysis
                      1D
IC Mean            0.034
IC Std.            0.468
Risk-Adjusted IC   0.072
t-stat(IC)         1.617
p-value(IC)        0.106
IC Skew           -0.074
IC Kurtosis       -0.857
Turnover Analysis
                              1D
Quantile 1 Mean Turnover   0.037
Quantile 2 Mean Turnover   0.045

                                     1D
Mean Factor Rank Autocorrelation   0.98
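
Some of the summary rows in the tear sheet are simple functions of the others. The mean period-wise spread is the top quantile return minus the bottom quantile return, 1.597 - (-1.075) = 2.672 bps, and the risk-adjusted IC is the IC mean divided by the IC standard deviation, roughly 0.034 / 0.468. The sketch below shows how statistics like these can be derived from a daily information coefficient series. It assumes the IC is the per-date Spearman rank correlation between the factor and forward returns and that the t-stat is a one-sample t-test against zero. The synthetic data and the helper `daily_rank_ic` are hypothetical, and the code is an approximation of the idea rather than Alphalens's own code.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only: 20 made-up assets over 250 business days,
# with forward returns that depend weakly on the factor plus noise.
rng = np.random.default_rng(0)
dates = pd.date_range("2018-01-01", periods=250, freq="B")
assets = [f"asset_{i}" for i in range(20)]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
factor = pd.Series(rng.normal(size=len(index)), index=index)
noise = pd.Series(rng.normal(size=len(index)), index=index)
forward_return = 0.1 * factor + noise


def daily_rank_ic(factor, forward_return):
    """Spearman rank correlation between factor and forward return, per date."""
    data = pd.DataFrame({"factor": factor, "forward_return": forward_return})
    # Spearman correlation is the Pearson correlation of the within-date ranks
    ranks = data.groupby(level="date").rank()
    return ranks.groupby(level="date").apply(
        lambda day: day["factor"].corr(day["forward_return"])
    )


ic = daily_rank_ic(factor, forward_return)

ic_mean = ic.mean()
ic_std = ic.std()  # sample standard deviation (ddof=1)
risk_adjusted_ic = ic_mean / ic_std
t_stat = ic_mean / (ic_std / np.sqrt(len(ic)))  # one-sample t-stat against zero

print(f"IC Mean {ic_mean:.3f}, IC Std. {ic_std:.3f}")
print(f"Risk-Adjusted IC {risk_adjusted_ic:.3f}, t-stat(IC) {t_stat:.3f}")
```

The t-stat grows with the square root of the number of dates, which is why a small but consistently positive daily IC can become statistically significant over a long enough sample. In the tear sheet above, the p-value of 0.106 means the IC is not significant at the conventional 5% level.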