Factors

A factor is a function from an asset and a moment in time to a number.

F(asset, timestamp) -> float

In Pipeline, Factors are the most commonly used term, representing the result of any computation that produces a numerical result. A Factor takes one or more columns of data and a window length as input.
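To make the definition concrete, here is a minimal sketch in plain Python that does not use Pipeline at all. The CLOSES table, the close_factor name, and the prices are hypothetical and exist only to illustrate the shape F(asset, timestamp) -> float.

```python
from datetime import date

# Hypothetical close prices keyed by (asset, date). Illustrative values only.
CLOSES = {
    ("AAPL", date(2015, 5, 4)): 128.0,
    ("AAPL", date(2015, 5, 5)): 130.0,
    ("A", date(2015, 5, 4)): 42.0,
    ("A", date(2015, 5, 5)): 41.0,
}


def close_factor(asset: str, timestamp: date) -> float:
    """Toy factor: map an (asset, timestamp) pair to a single number."""
    # Return NaN when there is no data for the pair.
    return CLOSES.get((asset, timestamp), float("nan"))


print(close_factor("AAPL", date(2015, 5, 5)))  # 130.0
```

A real Factor is evaluated for every asset on every date in the requested range, which is why the pipeline output later in this lesson has one row per (date, asset) pair.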

The simplest factors in Pipeline are built-in Factors. Built-in Factors are pre-built to perform common computations. As a first example, let's make a factor to compute the average close price over the last 10 days. We can use the SimpleMovingAverage built-in factor which computes the average value of the input data (close price) over the specified window length (10 days). To do this, we need to import our built-in SimpleMovingAverage factor and the EquityPricing dataset.

In [1]:
from zipline.pipeline import Pipeline
from zipline.research import run_pipeline

# New from the last lesson, import the EquityPricing dataset.
from zipline.pipeline.data import EquityPricing

# New from the last lesson, import the built-in SimpleMovingAverage factor.
from zipline.pipeline.factors import SimpleMovingAverage

Creating a Factor

Let's go back to our make_pipeline function from the previous lesson and instantiate a SimpleMovingAverage factor. To create a SimpleMovingAverage factor, we can call the SimpleMovingAverage constructor with two arguments: inputs, which must be a list of BoundColumn objects, and window_length, which must be an integer indicating how many days' worth of data our moving average calculation should receive. (We'll discuss BoundColumn in more depth later; for now we just need to know that a BoundColumn is an object indicating what kind of data should be passed to our Factor.)

The following line creates a Factor for computing the 10-day mean close price of securities.

In [2]:
mean_close_10 = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=10)

It's important to note that creating the factor does not actually perform a computation. Creating a factor is like defining the function. To perform a computation, we need to add the factor to our pipeline and run it.
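The distinction between defining a computation and running it can be sketched in plain Python. The ToyMovingAverage class below is a hypothetical stand-in, not how Pipeline is implemented: constructing it only records the window length, and no arithmetic happens until data is handed to compute. The prices are made up.

```python
from typing import Sequence


class ToyMovingAverage:
    """Hypothetical stand-in for a factor: stores a recipe, computes on demand."""

    def __init__(self, window_length: int):
        # Constructing the object only records the parameters.
        self.window_length = window_length

    def compute(self, closes: Sequence[float]) -> float:
        # The arithmetic happens here, once data is supplied.
        window = closes[-self.window_length:]
        if len(window) < self.window_length:
            return float("nan")  # not enough history for a full window
        return sum(window) / len(window)


mean_close_3 = ToyMovingAverage(window_length=3)  # nothing is computed yet
closes = [10.0, 11.0, 12.0, 13.0]                 # illustrative prices, oldest first
print(mean_close_3.compute(closes))               # 12.0
```

In Pipeline, the role of the compute call is played by adding the factor to a pipeline and running it.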

Adding a Factor to a Pipeline

Let's update our original empty pipeline to make it compute our new moving average factor. To start, let's move our factor instantiation into make_pipeline. Next, we can tell our pipeline to compute our factor by passing it a columns argument, which should be a dictionary mapping column names to factors, filters, or classifiers. Our updated make_pipeline function should look something like this:

In [3]:
def make_pipeline():
    
    mean_close_10 = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=10)
    
    return Pipeline(
        columns={
            '10_day_mean_close': mean_close_10
        }
    )

To see what this looks like, let's make our pipeline, run it, and display the result.

In [4]:
result = run_pipeline(make_pipeline(), start_date='2015-05-05', end_date='2015-05-05')
result
Out[4]:
                                                           10_day_mean_close
2015-05-05 00:00:00+00:00  Equity(FIBBG000C2V3D6 [A])      42.197000
                           Equity(FIBBG00B3T3HD3 [AA])     NaN
                           Equity(QI000000004076 [AABA])   43.577500
                           Equity(FIBBG006T1NZ18 [AAC])    32.953000
                           Equity(FIBBG001B9VR83 [AAC])    NaN
                           Equity(FIBBG000V2S3P6 [AACG])   4.548800
                           Equity(FIBBG000BDYRW6 [AADR])   40.508750
                           Equity(FIBBG002MYG6B3 [AAIT])   37.018889
                           Equity(FIBBG005P7Q881 [AAL])    50.548000
                           Equity(FIBBG003PNL136 [AAMC])   222.020000
                           Equity(FIBBG000B9XB24 [AAME])   3.962500
                           Equity(FIBBG000D9V7T4 [AAN])    32.612000
                           Equity(FIBBG000D6VW15 [AAOI])   14.596000
                           Equity(FIBBG000C2LZP3 [AAON])   24.484000
                           Equity(FIBBG000F7RCJ1 [AAP])    146.715000
                           Equity(FIBBG008651TF3 [AAPC])   10.141000
                           Equity(FIBBG000B9XRY4 [AAPL])   129.013000
                           Equity(FIBBG00161BCR0 [AAT])    41.011000
                           Equity(FIBBG000DGFSY4 [AAU])    0.812100
                           Equity(FIBBG000C5QZ62 [AAV])    6.054000
                           Equity(FIBBG000Q57YP0 [AAWW])   45.782000
                           Equity(FIBBG000G6GXC5 [AAXJ])   69.092000
                           Equity(FIBBG000BHJWG1 [AAXN])   29.994000
                           Equity(FIBBG000B9WM03 [AB])     31.562000
                           Equity(FIBBG000CP4WX9 [ABAX])   61.922000
                           Equity(FIBBG000DK5Q25 [ABB])    21.878000
                           Equity(FIBBG0025Y4RY4 [ABBV])   64.925000
                           Equity(FIBBG000MDCQC2 [ABC])    114.305000
                           Equity(FIBBG000CDY3H5 [ABCB])   25.179000
                           Equity(FIBBG000Q05Q43 [ABCD])   3.016000
                           ...                             ...
                           Equity(FIBBG004HQMCJ4 [ZIONN])  24.788000
                           Equity(FIBBG0043FW0J8 [ZIONO])  26.912000
                           Equity(FIBBG000002FJ2 [ZIONP])  22.960300
                           Equity(FIBBG000RMX9Z7 [ZIONW])  3.951667
                           Equity(FIBBG000PQQH62 [ZIONZ])  2.450000
                           Equity(FIBBG000FWCC57 [ZIOP])   9.998500
                           Equity(FIBBG0019HMFX6 [ZIV])    44.570500
                           Equity(FIBBG000H04C72 [ZIXI])   4.360000
                           Equity(FIBBG006MJFPW3 [ZJPN])   68.095778
                           Equity(FIBBG007XHN059 [ZLRG])   25.989750
                           Equity(FIBBG001J2P4Y9 [ZLTQ])   32.596000
                           Equity(FIBBG005WX1JJ7 [ZMLP])   33.604900
                           Equity(FIBBG000RFZLM7 [ZN])     1.971400
                           Equity(FIBBG000VD6768 [ZNGA])   2.482500
                           Equity(FIBBG000BXQ7R1 [ZNH])    50.199000
                           Equity(FIBBG0064MY238 [ZOES])   32.833000
                           Equity(FIBBG006G0NHM1 [ZPIN])   15.725000
                           Equity(FIBBG000FTMSF7 [ZQK])    1.701000
                           Equity(FIBBG000PN8QP8 [ZROZ])   120.207000
                           Equity(FIBBG006TL19Y0 [ZSAN])   9.111000
                           Equity(FIBBG000F9CW36 [ZSL])    105.482300
                           Equity(FIBBG007XHMYS1 [ZSML])   27.217000
                           Equity(FIBBG003LFL2G1 [ZSPH])   40.535000
                           Equity(FIBBG000BXB8X8 [ZTR])    13.672900
                           Equity(FIBBG0039320N9 [ZTS])    46.174000
                           Equity(FIBBG001Z7M393 [ZU])     12.982500
                           Equity(FIBBG000PYX812 [ZUMZ])   32.971000
                           Equity(FIBBG000C3CQP1 [ZVO])    9.088000
                           Equity(FIBBG001NFC923 [ZX])     1.343000
                           Equity(FIBBG00VT0KNC3 [MTCH])   71.742500

8422 rows × 1 columns

Now we have a column in our pipeline output with the 10-day average close price for all 8000+ securities (display truncated). Note that each row corresponds to the result of our computation for a given security on a given date. The DataFrame has a MultiIndex where the first level is a datetime representing the date of the computation and the second level is an Equity object corresponding to the security.

If we run our pipeline over more than one day, the output looks like this.

In [5]:
result = run_pipeline(make_pipeline(), start_date='2015-05-05', end_date='2015-05-07')
result
Out[5]:
                                                           10_day_mean_close
2015-05-05 00:00:00+00:00  Equity(FIBBG000C2V3D6 [A])      42.197000
                           Equity(FIBBG00B3T3HD3 [AA])     NaN
                           Equity(QI000000004076 [AABA])   43.577500
                           Equity(FIBBG006T1NZ18 [AAC])    32.953000
                           Equity(FIBBG001B9VR83 [AAC])    NaN
                           Equity(FIBBG000V2S3P6 [AACG])   4.548800
                           Equity(FIBBG000BDYRW6 [AADR])   40.508750
                           Equity(FIBBG002MYG6B3 [AAIT])   37.018889
                           Equity(FIBBG005P7Q881 [AAL])    50.548000
                           Equity(FIBBG003PNL136 [AAMC])   222.020000
                           Equity(FIBBG000B9XB24 [AAME])   3.962500
                           Equity(FIBBG000D9V7T4 [AAN])    32.612000
                           Equity(FIBBG000D6VW15 [AAOI])   14.596000
                           Equity(FIBBG000C2LZP3 [AAON])   24.484000
                           Equity(FIBBG000F7RCJ1 [AAP])    146.715000
                           Equity(FIBBG008651TF3 [AAPC])   10.141000
                           Equity(FIBBG000B9XRY4 [AAPL])   129.013000
                           Equity(FIBBG00161BCR0 [AAT])    41.011000
                           Equity(FIBBG000DGFSY4 [AAU])    0.812100
                           Equity(FIBBG000C5QZ62 [AAV])    6.054000
                           Equity(FIBBG000Q57YP0 [AAWW])   45.782000
                           Equity(FIBBG000G6GXC5 [AAXJ])   69.092000
                           Equity(FIBBG000BHJWG1 [AAXN])   29.994000
                           Equity(FIBBG000B9WM03 [AB])     31.562000
                           Equity(FIBBG000CP4WX9 [ABAX])   61.922000
                           Equity(FIBBG000DK5Q25 [ABB])    21.878000
                           Equity(FIBBG0025Y4RY4 [ABBV])   64.925000
                           Equity(FIBBG000MDCQC2 [ABC])    114.305000
                           Equity(FIBBG000CDY3H5 [ABCB])   25.179000
                           Equity(FIBBG000Q05Q43 [ABCD])   3.016000
...                        ...                             ...
2015-05-07 00:00:00+00:00  Equity(FIBBG004HQMCJ4 [ZIONN])  24.685000
                           Equity(FIBBG0043FW0J8 [ZIONO])  26.864000
                           Equity(FIBBG000002FJ2 [ZIONP])  22.922000
                           Equity(FIBBG000RMX9Z7 [ZIONW])  4.065714
                           Equity(FIBBG000PQQH62 [ZIONZ])  2.476000
                           Equity(FIBBG000FWCC57 [ZIOP])   9.713500
                           Equity(FIBBG0019HMFX6 [ZIV])    44.589000
                           Equity(FIBBG000H04C72 [ZIXI])   4.387000
                           Equity(FIBBG006MJFPW3 [ZJPN])   67.603222
                           Equity(FIBBG007XHN059 [ZLRG])   25.989750
                           Equity(FIBBG001J2P4Y9 [ZLTQ])   31.922000
                           Equity(FIBBG005WX1JJ7 [ZMLP])   33.604400
                           Equity(FIBBG000RFZLM7 [ZN])     1.987400
                           Equity(FIBBG000VD6768 [ZNGA])   2.493500
                           Equity(FIBBG000BXQ7R1 [ZNH])    48.724000
                           Equity(FIBBG0064MY238 [ZOES])   32.498000
                           Equity(FIBBG006G0NHM1 [ZPIN])   15.543000
                           Equity(FIBBG000FTMSF7 [ZQK])    1.676000
                           Equity(FIBBG000PN8QP8 [ZROZ])   118.084000
                           Equity(FIBBG006TL19Y0 [ZSAN])   9.132000
                           Equity(FIBBG000F9CW36 [ZSL])    103.740400
                           Equity(FIBBG007XHMYS1 [ZSML])   27.013143
                           Equity(FIBBG003LFL2G1 [ZSPH])   40.606000
                           Equity(FIBBG000BXB8X8 [ZTR])    13.639900
                           Equity(FIBBG0039320N9 [ZTS])    45.814000
                           Equity(FIBBG001Z7M393 [ZU])     12.546000
                           Equity(FIBBG000PYX812 [ZUMZ])   32.439000
                           Equity(FIBBG000C3CQP1 [ZVO])    9.064000
                           Equity(FIBBG001NFC923 [ZX])     1.343100
                           Equity(FIBBG00VT0KNC3 [MTCH])   71.941500

25274 rows × 1 columns
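Because the result is indexed by (date, security), standard pandas MultiIndex selection applies to it. The sketch below builds a small stand-in DataFrame by hand so it can run without Pipeline. It makes several assumptions: plain ticker strings replace the Equity objects, the index levels are left unnamed and addressed by position, and the second day's values are made up for illustration.

```python
import pandas as pd

# Toy stand-in for a pipeline result. Strings replace Equity objects, and
# the values for the second date are invented for illustration.
dates = pd.to_datetime(["2015-05-05", "2015-05-06"], utc=True)
index = pd.MultiIndex.from_product([dates, ["A", "AAPL"]])
toy_result = pd.DataFrame(
    {"10_day_mean_close": [42.197, 129.013, 42.0, 129.5]},
    index=index,
)

# All securities on one date: select on the first index level.
one_day = toy_result.loc[dates[0]]

# One security across all dates: take a cross-section on the second level.
one_security = toy_result.xs("AAPL", level=1)

print(one_day)
print(one_security)
```

With a real pipeline result, the second-level key would be an Equity object rather than a string.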

Note: factors can also be added to an existing Pipeline instance using the Pipeline.add method. Using add looks something like this:

my_pipe = Pipeline()
f1 = SomeFactor(...)
my_pipe.add(f1, 'f1')

Latest

The most commonly used built-in Factor is Latest. The Latest factor gets the most recent value of a given data column. This factor is common enough that it is instantiated differently from other factors. The best way to get the latest value of a data column is by getting its .latest attribute. As an example, let's update make_pipeline to create a latest close price factor and add it to our pipeline:

In [6]:
def make_pipeline():

    mean_close_10 = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=10)
    latest_close = EquityPricing.close.latest

    return Pipeline(
        columns={
            '10_day_mean_close': mean_close_10,
            'latest_close_price': latest_close
        }
    )

And now, when we make and run our pipeline again, there are two columns in our output dataframe. One column has the 10-day mean close price of each security, and the other has the latest close price.

In [7]:
result = run_pipeline(make_pipeline(), start_date='2015-05-05', end_date='2015-05-05')
result.head(5)
Out[7]:
                                                           10_day_mean_close  latest_close_price
2015-05-05 00:00:00+00:00  Equity(FIBBG000C2V3D6 [A])      42.1970            41.94
                           Equity(FIBBG00B3T3HD3 [AA])     NaN                NaN
                           Equity(QI000000004076 [AABA])   43.5775            42.04
                           Equity(FIBBG006T1NZ18 [AAC])    32.9530            34.21
                           Equity(FIBBG001B9VR83 [AAC])    NaN                NaN

.latest can sometimes return things other than Factors. We'll see examples of other possible return types in later lessons.
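The difference between the two columns in the output can be sketched in plain Python. This is an illustration of the arithmetic only, not Pipeline's implementation; the function names and prices are hypothetical. The latest value is simply the last element of the series, which is also what a moving average with a window of one would return.

```python
# Illustrative closing prices for one security, oldest first (made-up values).
closes = [40.0, 41.0, 42.0, 43.0, 44.0]


def simple_moving_average(values, window_length):
    """Mean of the last `window_length` values."""
    window = values[-window_length:]
    return sum(window) / len(window)


def latest(values):
    """Most recent value only."""
    return values[-1]


print(simple_moving_average(closes, 5))  # 42.0
print(latest(closes))                    # 44.0
```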

Default Inputs

Some factors have default inputs that should never be changed. For example, the VWAP built-in factor is always calculated from EquityPricing.close and EquityPricing.volume. When a factor is always calculated from the same BoundColumns, we can call the constructor without specifying inputs.

In [8]:
from zipline.pipeline.factors import VWAP
vwap = VWAP(window_length=10)
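The reason VWAP's inputs are fixed is that a volume-weighted average price is, by definition, a function of price and volume. As a rough sketch of the arithmetic (not the built-in factor's source code), the plain-Python function below weights each close by that day's volume. The toy_vwap name and the sample numbers are hypothetical.

```python
def toy_vwap(closes, volumes):
    """Volume-weighted average price over two equal-length sequences."""
    if len(closes) != len(volumes):
        raise ValueError("closes and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume == 0:
        return float("nan")  # no shares traded in the window
    # Weight each close by the volume traded that day.
    return sum(c * v for c, v in zip(closes, volumes)) / total_volume


closes = [10.0, 20.0, 30.0]  # illustrative prices
volumes = [100, 100, 200]    # illustrative share volumes
print(toy_vwap(closes, volumes))  # 22.5
```

The day with the highest volume pulls the result toward its price, so the answer here (22.5) sits above the unweighted mean of 20.0.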

Next Lesson: Combining Factors