Data Collection

This tutorial uses the US Stock data bundle. The Pipeline API runs on daily data, so you can ingest either the full minute bundle or only its daily portion. This tutorial takes the latter approach.

Create an empty bundle called 'usstock-1d-bundle':

In [1]:
from quantrocket.zipline import create_usstock_bundle
create_usstock_bundle("usstock-1d-bundle", data_frequency="daily")
Out[1]:
{'msg': 'successfully created usstock-1d-bundle bundle', 'status': 'success'}

Then ingest the data:

In [2]:
from quantrocket.zipline import ingest_bundle
ingest_bundle("usstock-1d-bundle")
Out[2]:
{'status': 'the data will be ingested asynchronously'}
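Because ingestion runs asynchronously, a script that depends on the bundle may need to wait until data is present. The following is a minimal sketch of a polling helper, not part of the original tutorial. The name wait_for_bundle_data and its parameters are hypothetical. It assumes you pass in a callable that returns a dict mapping bundle codes to a boolean indicating whether the bundle contains data; quantrocket.zipline.list_bundles is assumed to behave this way, but check this against your installed version.

```python
import time

def wait_for_bundle_data(code, list_bundles_fn, poll_seconds=60,
                         max_polls=120, sleep_fn=time.sleep):
    """Poll until list_bundles_fn() reports data for `code`.

    list_bundles_fn is assumed to return a dict such as
    {"usstock-1d-bundle": True}, where True means the bundle has data.
    Returns True once data is reported, or False after max_polls attempts.
    """
    for attempt in range(max_polls):
        bundles = list_bundles_fn()
        if bundles.get(code):
            return True
        # Do not sleep after the final attempt.
        if attempt < max_polls - 1:
            sleep_fn(poll_seconds)
    return False
```

Assuming list_bundles behaves as described, the helper could be called as wait_for_bundle_data("usstock-1d-bundle", list_bundles) after importing list_bundles from quantrocket.zipline. The sketch does not verify whether a True value means ingestion has fully finished, so the flightlog output remains the more reliable completion signal.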

Use flightlog to monitor the progress. Ingestion is complete when the "Completed ingesting data" line appears:

quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting daily bars for usstock-1d-bundle bundle
quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting adjustments for usstock-1d-bundle bundle
quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting assets for usstock-1d-bundle bundle
quantrocket.zipline: INFO [usstock-1d-bundle] Completed ingesting data for 18638 securities in usstock-1d-bundle bundle
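If you want to detect completion programmatically rather than by eye, one option is to scan the log lines for the completion message. The following is a minimal sketch, not part of the original tutorial. The helper name find_completed_ingest is hypothetical, and the pattern is based only on the log lines shown above, so it may need adjusting if the log format differs in your version. How the log lines are obtained (for example, from a saved log file) is left open.

```python
import re

# Pattern derived from the "Completed ingesting" line shown above.
# The exact log format is an assumption and may vary between versions.
COMPLETED = re.compile(
    r"\[(?P<bundle>[^\]]+)\] Completed ingesting data for (?P<count>\d+) securities"
)

def find_completed_ingest(log_lines, bundle_code):
    """Return the number of securities ingested for bundle_code,
    or None if no completion line for that bundle is found."""
    for line in log_lines:
        match = COMPLETED.search(line)
        if match and match.group("bundle") == bundle_code:
            return int(match.group("count"))
    return None

# Sample input copied from the flightlog output above.
sample_log = [
    "quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting daily bars for usstock-1d-bundle bundle",
    "quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting adjustments for usstock-1d-bundle bundle",
    "quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting assets for usstock-1d-bundle bundle",
    "quantrocket.zipline: INFO [usstock-1d-bundle] Completed ingesting data for 18638 securities in usstock-1d-bundle bundle",
]

print(find_completed_ingest(sample_log, "usstock-1d-bundle"))  # prints 18638
```

The function returns None while ingestion is still in progress, so it can be called repeatedly on a growing list of log lines.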

Next Lesson: Why Pipeline?