Collect US Stock Data

Before collecting the full dataset, we recommend collecting the free sample dataset first. With the sample data available, you can run the example data queries as written, without modifying them to point to the full dataset.

The US Stock data bundle contains both minute and daily data, but most of the lectures use only the daily data. For the few lectures that do use minute data, the free sample data provides adequate coverage. Collecting the full dataset with minute data takes approximately 12-15 hours, versus about a minute for the daily data alone, so in this notebook we will collect only the daily portion of the bundle. The full minute dataset can also be used for the lectures if you prefer.

Once you have collected the sample data, create a bundle for the full dataset called 'usstock-1d-bundle':

In [1]:
from quantrocket.zipline import create_usstock_bundle
create_usstock_bundle("usstock-1d-bundle", data_frequency="daily")
Out[1]:
{'msg': 'successfully created usstock-1d-bundle bundle', 'status': 'success'}

Then ingest the data:

In [2]:
from quantrocket.zipline import ingest_bundle
ingest_bundle("usstock-1d-bundle")
Out[2]:
{'status': 'the data will be ingested asynchronously'}

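Use flightlog to monitor the progress. Because ingestion runs asynchronously, the bundle is ready to use only once the "Completed ingesting data" message appears: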

quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting daily bars for usstock-1d-bundle bundle
quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting adjustments for usstock-1d-bundle bundle
quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting assets for usstock-1d-bundle bundle
quantrocket.zipline: INFO [usstock-1d-bundle] Completed ingesting data for 18638 securities in usstock-1d-bundle bundle
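If you save the flightlog output to a file or a list of lines, the completion message can also be checked programmatically. The following is a minimal sketch, not part of the QuantRocket API: the helper name `ingested_security_count` and the regular expression are assumptions based only on the log format shown above, and the pattern may need adjusting if the message wording differs in your version.

```python
import re

# Pattern for the completion message. This is an assumption based on the
# log format shown above, not a documented, stable format.
COMPLETED = re.compile(
    r"\[(?P<bundle>[^\]]+)\] Completed ingesting data for (?P<count>\d+) securities"
)


def ingested_security_count(log_lines, bundle):
    """Hypothetical helper: return the number of securities reported in the
    completion message for `bundle`, or None if no such message is found."""
    for line in log_lines:
        match = COMPLETED.search(line)
        if match and match.group("bundle") == bundle:
            return int(match.group("count"))
    return None


# Two of the log lines shown above, copied verbatim.
sample_log = [
    "quantrocket.zipline: INFO [usstock-1d-bundle] Ingesting daily bars for usstock-1d-bundle bundle",
    "quantrocket.zipline: INFO [usstock-1d-bundle] Completed ingesting data for 18638 securities in usstock-1d-bundle bundle",
]

count = ingested_security_count(sample_log, "usstock-1d-bundle")
if count is None:
    print("Ingestion has not completed yet")
else:
    print(f"Ingestion completed for {count} securities")
```

Returning `None` rather than raising keeps the helper usable while ingestion is still in progress, when the completion line has not yet been logged.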