Custom Factors

When we first looked at factors, we explored the set of built-in factors. Frequently, a desired computation isn't included as a built-in factor. One of the most powerful features of the Pipeline API is that it allows us to define our own custom factors for exactly these cases.

Conceptually, a custom factor is identical to a built-in factor. It accepts inputs, window_length, and mask as constructor arguments, and the constructor returns a Factor object whose values are computed each day.

Let's take an example of a computation that doesn't exist as a built-in: standard deviation. To create a factor that computes the standard deviation over a trailing window, we can subclass zipline.pipeline.CustomFactor and implement a compute method whose signature is:

def compute(self, today, asset_ids, out, *inputs):
    ...

  • *inputs are trailing data windows: M x N numpy arrays, where M is the window_length and N is the number of securities (usually around 8000 unless a mask is provided). There will be one M x N array for each BoundColumn provided in the factor's inputs list, and the data type of each array will be the dtype of the corresponding BoundColumn.
  • out is an empty array of length N. out will be the output of our custom factor each day. The job of the compute method is to write output values into out.
  • asset_ids will be an integer array of length N containing security ids corresponding to the columns in our *inputs arrays.
  • today will be a pandas Timestamp representing the day for which compute is being called.

Of these, *inputs and out are most commonly used.
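
To make the shapes concrete, here is a minimal numpy-only sketch of how a compute function might be driven once per day. It does not use zipline: the price data, the asset ids, the driver loop, and the use of an integer row index for today (rather than a pandas Timestamp) are all assumptions made for illustration, and the window is assumed to end on the row before today.

```python
import numpy as np

window_length = 3                     # M
asset_ids = np.array([101, 102])      # hypothetical security ids, so N = 2

# Hypothetical daily close prices: one row per day, one column per asset.
closes = np.array([
    [10.0, 20.0],
    [11.0, 19.0],
    [12.0, 21.0],
    [13.0, 22.0],
    [14.0, 20.0],
])

def compute(today, asset_ids, out, values):
    # values is an M x N trailing window; write one result per asset into out.
    out[:] = values.mean(axis=0)

results = {}
# Mimic one compute call per day, starting once M rows of history exist.
for today in range(window_length, len(closes) + 1):
    window = closes[today - window_length:today]   # shape (M, N)
    out = np.full(len(asset_ids), np.nan)          # length-N output buffer
    compute(today, asset_ids, out, window)
    results[today] = out

print(results)
```

Note that compute returns nothing. Its only job is to fill out, which is why the sketch reads the result from the buffer after each call.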

An instance of CustomFactor that has been added to a pipeline will have its compute method called every day. For example, let's define a custom factor that computes the standard deviation of the close price over the last 5 days. To start, let's add CustomFactor and numpy to our import statements.

In [1]:
from zipline.pipeline import Pipeline
from zipline.research import run_pipeline
from zipline.pipeline.data import EquityPricing
from zipline.pipeline.factors import SimpleMovingAverage, AverageDollarVolume
from zipline.pipeline import CustomFactor
import numpy

Next, let's define our custom factor to calculate the standard deviation over a trailing window using numpy.nanstd:

In [2]:
class StdDev(CustomFactor):
    def compute(self, today, asset_ids, out, values):
        # Calculates the column-wise standard deviation, ignoring NaNs
        out[:] = numpy.nanstd(values, axis=0)

Finally, let's instantiate our factor in make_pipeline():

In [3]:
def make_pipeline():
    std_dev = StdDev(inputs=[EquityPricing.close], window_length=5)

    return Pipeline(
        columns={
            'std_dev': std_dev
        }
    )

When this pipeline is run, StdDev.compute() will be called every day with data as follows:

  • values: An M x N numpy array, where M is 5 (window_length), and N is ~8000 (the number of securities in our database on the day in question).
  • out: An empty array of length N (~8000). In this example, the job of compute is to populate out with an array storing the 5-day close price standard deviations.

In [4]:
result = run_pipeline(make_pipeline(), start_date='2015-05-05', end_date='2015-05-05')
result
/opt/conda/envs/zipline/lib/python3.6/site-packages/numpy/lib/nanfunctions.py:1427: RuntimeWarning: Degrees of freedom <= 0 for slice.
  keepdims=keepdims)
Out[4]:
                                                           std_dev
2015-05-05 00:00:00+00:00  Equity(FIBBG000C2V3D6 [A])  0.268224
                           Equity(FIBBG00B3T3HD3 [AA])  NaN
                           Equity(QI000000004076 [AABA])  0.801359
                           Equity(FIBBG006T1NZ18 [AAC])  1.436195
                           Equity(FIBBG001B9VR83 [AAC])  NaN
                           Equity(FIBBG000V2S3P6 [AACG])  0.132151
                           Equity(FIBBG000BDYRW6 [AADR])  0.067500
                           Equity(FIBBG002MYG6B3 [AAIT])  0.344773
                           Equity(FIBBG005P7Q881 [AAL])  0.990123
                           Equity(FIBBG003PNL136 [AAMC])  5.633688
                           Equity(FIBBG000B9XB24 [AAME])  0.004714
                           Equity(FIBBG000D9V7T4 [AAN])  0.282942
                           Equity(FIBBG000D6VW15 [AAOI])  0.634306
                           Equity(FIBBG000C2LZP3 [AAON])  0.344302
                           Equity(FIBBG000F7RCJ1 [AAP])  0.876721
                           Equity(FIBBG008651TF3 [AAPC])  0.049528
                           Equity(FIBBG000B9XRY4 [AAPL])  1.770774
                           Equity(FIBBG00161BCR0 [AAT])  0.596436
                           Equity(FIBBG000DGFSY4 [AAU])  0.011889
                           Equity(FIBBG000C5QZ62 [AAV])  0.063056
                           Equity(FIBBG000Q57YP0 [AAWW])  4.325595
                           Equity(FIBBG000G6GXC5 [AAXJ])  0.583383
                           Equity(FIBBG000BHJWG1 [AAXN])  2.108379
                           Equity(FIBBG000B9WM03 [AB])  0.490918
                           Equity(FIBBG000CP4WX9 [ABAX])  4.385584
                           Equity(FIBBG000DK5Q25 [ABB])  0.275942
                           Equity(FIBBG0025Y4RY4 [ABBV])  0.796959
                           Equity(FIBBG000MDCQC2 [ABC])  0.937772
                           Equity(FIBBG000CDY3H5 [ABCB])  0.289095
                           Equity(FIBBG000Q05Q43 [ABCD])  0.045869
                           ...
                           Equity(FIBBG004HQMCJ4 [ZIONN])  0.066212
                           Equity(FIBBG0043FW0J8 [ZIONO])  0.070541
                           Equity(FIBBG000002FJ2 [ZIONP])  0.058694
                           Equity(FIBBG000RMX9Z7 [ZIONW])  0.161090
                           Equity(FIBBG000PQQH62 [ZIONZ])  0.040825
                           Equity(FIBBG000FWCC57 [ZIOP])  0.324136
                           Equity(FIBBG0019HMFX6 [ZIV])  0.398768
                           Equity(FIBBG000H04C72 [ZIXI])  0.091345
                           Equity(FIBBG006MJFPW3 [ZJPN])  0.866606
                           Equity(FIBBG007XHN059 [ZLRG])  0.081098
                           Equity(FIBBG001J2P4Y9 [ZLTQ])  0.835282
                           Equity(FIBBG005WX1JJ7 [ZMLP])  0.139857
                           Equity(FIBBG000RFZLM7 [ZN])  0.032727
                           Equity(FIBBG000VD6768 [ZNGA])  0.016000
                           Equity(FIBBG000BXQ7R1 [ZNH])  0.730961
                           Equity(FIBBG0064MY238 [ZOES])  1.113434
                           Equity(FIBBG006G0NHM1 [ZPIN])  0.033106
                           Equity(FIBBG000FTMSF7 [ZQK])  0.043818
                           Equity(FIBBG000PN8QP8 [ZROZ])  2.189195
                           Equity(FIBBG006TL19Y0 [ZSAN])  0.096830
                           Equity(FIBBG000F9CW36 [ZSL])  2.130850
                           Equity(FIBBG007XHMYS1 [ZSML])  0.093386
                           Equity(FIBBG003LFL2G1 [ZSPH])  1.194481
                           Equity(FIBBG000BXB8X8 [ZTR])  0.039771
                           Equity(FIBBG0039320N9 [ZTS])  0.490086
                           Equity(FIBBG001Z7M393 [ZU])  0.408566
                           Equity(FIBBG000PYX812 [ZUMZ])  0.540126
                           Equity(FIBBG000C3CQP1 [ZVO])  0.167021
                           Equity(FIBBG001NFC923 [ZX])  0.004899
                           Equity(FIBBG00VT0KNC3 [MTCH])  0.742859

8422 rows × 1 columns
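
The NaN rows and the RuntimeWarning in the output above are consistent with how numpy.nanstd treats a column that contains no valid values. The sketch below uses made-up prices (an assumption, not data from this pipeline) to show that behavior in isolation: a column with some NaNs still gets a result, while an all-NaN column yields NaN and triggers the "Degrees of freedom <= 0" warning. This is one plausible reading of the output, not a confirmed diagnosis for any particular security.

```python
import warnings
import numpy as np

# Hypothetical 5-day window for three securities (columns).
values = np.array([
    [10.0, np.nan, 5.0],
    [11.0, np.nan, np.nan],
    [12.0, np.nan, 5.5],
    [11.0, np.nan, 6.0],
    [10.0, np.nan, 6.5],
])

out = np.empty(values.shape[1])

# Record warnings so we can inspect them instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out[:] = np.nanstd(values, axis=0)

print(out)                                   # middle entry is NaN
print([str(w.message) for w in caught])      # includes the degrees-of-freedom warning
```

If the NaN rows are unwanted, one option is to pass a mask or a screen so that securities without price data are excluded before compute runs.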

Default Inputs

When writing a custom factor, we can set default inputs and window_length in our CustomFactor subclass. For example, let's define the TenDayMeanDifference custom factor to compute the mean difference between two data columns over a trailing window using numpy.nanmean. Let's set the default inputs to [EquityPricing.close, EquityPricing.open] and the default window_length to 10:

In [5]:
class TenDayMeanDifference(CustomFactor):
    # Default inputs.
    inputs = [EquityPricing.close, EquityPricing.open]
    window_length = 10
    def compute(self, today, asset_ids, out, close, open):
        # Calculates the column-wise mean difference, ignoring NaNs
        out[:] = numpy.nanmean(close - open, axis=0)

Remember in this case that `close` and `open` are each 10 x ~8000 2D numpy arrays.
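
As a quick check of what the compute body does, here is a numpy-only sketch with small made-up arrays standing in for the 10 x ~8000 windows. The prices and the 3 x 2 shape are assumptions for illustration, and the second input is named open_ here only to avoid shadowing Python's built-in open in a standalone script.

```python
import numpy as np

# Hypothetical 3-day windows for two securities (rows are days, columns are assets).
close = np.array([
    [10.0, 5.0],
    [11.0, np.nan],
    [12.0, 5.5],
])
open_ = np.array([
    [9.5, 5.2],
    [11.5, 5.1],
    [11.0, 5.0],
])

out = np.empty(close.shape[1])

# Element-wise difference keeps the (days, assets) shape; nanmean then
# averages down each column, skipping days where the difference is NaN.
out[:] = np.nanmean(close - open_, axis=0)

print(out)
```

The subtraction works element by element because both windows share the same shape, so each asset's daily close is paired with its own daily open.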

If we call TenDayMeanDifference without providing any arguments, it will use the defaults.

In [6]:
# Computes the 10-day mean difference between the daily open and close prices.
close_open_diff = TenDayMeanDifference()

The defaults can be manually overridden by specifying arguments in the constructor call.

In [7]:
# Computes the 10-day mean difference between the daily high and low prices.
high_low_diff = TenDayMeanDifference(inputs=[EquityPricing.high, EquityPricing.low])

Further Example

Let's take another example where we build a momentum custom factor and use it to create a filter. We will then use that filter as a screen for our pipeline.

Let's start by defining a Momentum factor as the ratio of the most recent close price in the trailing window to the earliest close price in that window, where the number of days in the window is the window_length.

In [8]:
class Momentum(CustomFactor):
    # Default inputs
    inputs = [EquityPricing.close]

    # Compute momentum
    def compute(self, today, assets, out, close):
        out[:] = close[-1] / close[0]
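
The indexing in compute can be checked with a small numpy-only sketch. The prices below are made up for illustration: close[-1] selects the last (most recent) row of the window and close[0] the first (oldest) row, so the division produces one ratio per security.

```python
import numpy as np

# Hypothetical 4-day window of close prices for two securities.
close = np.array([
    [100.0, 50.0],   # oldest day in the window
    [102.0, 49.0],
    [101.0, 48.0],
    [105.0, 47.5],   # most recent day in the window
])

out = np.empty(close.shape[1])

# One ratio per column: latest close divided by earliest close.
out[:] = close[-1] / close[0]

print(out)
```

A value above 1 means the price rose over the window, and a value below 1 means it fell.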

Now, let's instantiate our Momentum factor (twice) to create a 10-day momentum factor and a 20-day momentum factor. Because our momentum is a ratio, it is positive when its value is greater than 1. Let's also create a positive_momentum filter returning True for securities with both a positive 10-day momentum and a positive 20-day momentum.

In [9]:
ten_day_momentum = Momentum(window_length=10)
twenty_day_momentum = Momentum(window_length=20)

positive_momentum = ((ten_day_momentum > 1) & (twenty_day_momentum > 1))
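
The comparison and the & operator behave element-wise, much like they do on numpy boolean arrays. The sketch below uses made-up momentum values and hypothetical symbols to show which securities would pass such a filter. It illustrates the logic only and is not how the Pipeline API evaluates filters internally.

```python
import numpy as np

# Hypothetical symbols and momentum ratios, one entry per security.
symbols = np.array(["AAA", "BBB", "CCC", "DDD"])
ten_day = np.array([1.03, 0.98, 1.10, 1.01])
twenty_day = np.array([1.05, 1.02, 0.97, 1.20])

# Parentheses are required: & binds more tightly than > in Python.
positive = (ten_day > 1) & (twenty_day > 1)

# Boolean indexing keeps only the securities where both conditions hold.
passing = symbols[positive]

print(positive)
print(passing)
```

Only securities where both ratios exceed 1 survive, which matches what we expect when the filter is later used as a screen.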

Next, let's add our momentum factors and our positive_momentum filter to make_pipeline. Let's also pass positive_momentum as a screen to our pipeline.

In [10]:
def make_pipeline():

    ten_day_momentum = Momentum(window_length=10)
    twenty_day_momentum = Momentum(window_length=20)

    positive_momentum = ((ten_day_momentum > 1) & (twenty_day_momentum > 1))

    std_dev = StdDev(inputs=[EquityPricing.close], window_length=5)

    return Pipeline(
        columns={
            'std_dev': std_dev,
            'ten_day_momentum': ten_day_momentum,
            'twenty_day_momentum': twenty_day_momentum
        },
        screen=positive_momentum
    )

Running this pipeline outputs the standard deviation and each of our momentum computations for securities with positive 10-day and 20-day momentum.

In [11]:
result = run_pipeline(make_pipeline(), start_date='2015-05-05', end_date='2015-05-05')
result
/opt/conda/envs/zipline/lib/python3.6/site-packages/numpy/lib/nanfunctions.py:1427: RuntimeWarning: Degrees of freedom <= 0 for slice.
  keepdims=keepdims)
Out[11]:
                                                           std_dev  ten_day_momentum  twenty_day_momentum
2015-05-05 00:00:00+00:00  Equity(FIBBG006T1NZ18 [AAC])  1.436195  1.033223  1.058478
                           Equity(FIBBG000BDYRW6 [AADR])  0.067500  1.004599  1.027848
                           Equity(FIBBG000D9V7T4 [AAN])  0.282942  1.217940  1.237892
                           Equity(FIBBG000B9XRY4 [AAPL])  1.770774  1.014104  1.021348
                           Equity(FIBBG000C5QZ62 [AAV])  0.063056  1.004950  1.074074
                           Equity(FIBBG000Q57YP0 [AAWW])  4.325595  1.239900  1.285376
                           Equity(FIBBG000G6GXC5 [AAXJ])  0.583383  1.009791  1.053033
                           Equity(FIBBG000BHJWG1 [AAXN])  2.108379  1.142373  1.373828
                           Equity(FIBBG000B9WM03 [AB])  0.490918  1.036137  1.068249
                           Equity(FIBBG0025Y4RY4 [ABBV])  0.796959  1.016821  1.107155
                           Equity(FIBBG000MDCQC2 [ABC])  0.937772  1.011941  1.022808
                           Equity(FIBBG005YTXRH3 [ABCW])  0.666081  1.072405  1.069606
                           Equity(FIBBG000BN5VZ4 [ABEV])  0.054553  1.006421  1.006421
                           Equity(FIBBG000B9YYH7 [ABM])  0.097857  1.009369  1.015713
                           Equity(FIBBG000KMVDV1 [ABR])  0.056000  1.010101  1.005747
                           Equity(FIBBG00610P7D7 [ABRPC])  0.060618  1.003494  1.003104
                           Equity(FIBBG000BP23H4 [ACFC])  0.073103  1.001908  1.014493
                           Equity(FIBBG000CMRVH1 [ACH])  0.508189  1.095026  1.241504
                           Equity(FIBBG000PMBV39 [ACIW])  0.418311  1.027679  1.067718
                           Equity(FIBBG0017VSC04 [ACP])  0.077872  1.003294  1.002633
                           Equity(FIBBG004XQWCG8 [ACSF])  0.050833  1.003745  1.001495
                           Equity(FIBBG000TH6VB3 [ACWI])  0.328913  1.008235  1.024278
                           Equity(FIBBG0025X38X0 [ACWV])  0.420684  1.004089  1.016980
                           Equity(FIBBG000TH7DF8 [ACWX])  0.277157  1.008969  1.028515
                           Equity(FIBBG000BB5006 [ADBE])  0.293217  1.016484  1.002644
                           Equity(FIBBG000B9WLK3 [ADGE])  0.013351  1.045226  1.205797
                           Equity(FIBBG000BB6WG8 [ADM])  0.605191  1.049645  1.044850
                           Equity(FIBBG000JG0547 [ADP])  0.657529  1.016451  1.007513
                           Equity(FIBBG006JS1NR3 [ADPT])  1.107924  1.176967  1.341555
                           Equity(FIBBG000PK5975 [ADRA])  0.259083  1.008841  1.036665
                           ...
                           Equity(FIBBG000JYTS29 [XPP])  1.851940  1.027521  1.256679
                           Equity(FIBBG000BX57K1 [XRAY])  0.576215  1.013691  1.038317
                           Equity(FIBBG000QZ5846 [XRDC])  0.077820  1.043825  1.052209
                           Equity(FIBBG000R263Y5 [XSPA])  0.044866  1.034483  1.028571
                           Equity(FIBBG0017G4VC8 [XUE])  0.023152  1.005970  1.123333
                           Equity(FIBBG000C1J0X6 [XXIA])  0.182143  1.013799  1.017101
                           Equity(FIBBG001D8R5D0 [XYL])  0.278237  1.043552  1.023011
                           Equity(FIBBG000PNT1F1 [YAO])  0.237674  1.025449  1.128612
                           Equity(FIBBG000CBT081 [YDLE])  0.141223  1.032154  1.007059
                           Equity(FIBBG006WVJY34 [YGRO])  0.070880  1.024669  1.050586
                           Equity(FIBBG000PY4FX3 [YINN])  1.733152  1.036344  1.387613
                           Equity(FIBBG00440GH65 [YMLI])  0.068846  1.010239  1.028390
                           Equity(QI000000137843 [YMLP])  0.065544  1.014377  1.064543
                           Equity(FIBBG001924FR6 [YOKU])  0.697579  1.206855  1.442847
                           Equity(FIBBG000BRZKC1 [YORW])  0.362955  1.029623  1.013328
                           Equity(FIBBG000BHPFQ0 [YPF])  0.174287  1.068072  1.033434
                           Equity(FIBBG000BH3GZ2 [YUM])  2.532646  1.125665  1.162473
                           Equity(FIBBG007BTZPK2 [YUMAPA])  0.167141  1.021601  1.044156
                           Equity(FIBBG003H0XV18 [YY])  2.833412  1.145538  1.250090
                           Equity(FIBBG002ZM63J5 [ZAYO])  0.491504  1.027452  1.022587
                           Equity(FIBBG000BBFT75 [ZEUS])  1.958536  1.434743  1.257098
                           Equity(FIBBG000K14VN6 [ZINC])  0.094953  1.074422  1.181096
                           Equity(FIBBG000BX9WL1 [ZION])  0.275500  1.056403  1.056597
                           Equity(FIBBG0058QYKL7 [ZIONL])  0.038866  1.003584  1.010101
                           Equity(FIBBG0043FW0J8 [ZIONO])  0.070541  1.011278  1.024372
                           Equity(FIBBG000RMX9Z7 [ZIONW])  0.161090  1.118421  1.112565
                           Equity(FIBBG0019HMFX6 [ZIV])  0.398768  1.016408  1.054225
                           Equity(FIBBG000H04C72 [ZIXI])  0.091345  1.072639  1.135897
                           Equity(FIBBG005WX1JJ7 [ZMLP])  0.139857  1.012349  1.037026
                           Equity(FIBBG000RFZLM7 [ZN])  0.032727  1.053191  1.076087

2662 rows × 3 columns

Custom factors allow us to define custom computations in a pipeline. They are frequently the best way to perform computations on multiple data columns. The full documentation for CustomFactors is available in the API Reference.