Data Collection

Data collection has two steps: collecting 1-day bars for US stocks and defining a universe of US ETFs.

Collect historical data

First, we create a database for collecting 1-day bars.

In [1]:
from quantrocket.history import create_usstock_db
create_usstock_db("usstock-1d", bar_size="1 day")
Out[1]:
{'status': 'successfully created quantrocket.v2.history.usstock-1d.sqlite'}

Then collect the data:

In [2]:
from quantrocket.history import collect_history
collect_history("usstock-1d")
Out[2]:
{'status': 'the historical data will be collected asynchronously'}

Collection runs asynchronously, so monitor flightlog for completion:

quantrocket.history: INFO [usstock-1d] Collecting US history from 2007 to present
quantrocket.history: INFO [usstock-1d] Collecting updated US securities listings
quantrocket.history: INFO [usstock-1d] Collecting additional US history from 2020-04 to present
quantrocket.history: INFO [usstock-1d] Collected 160 monthly files in quantrocket.v2.history.usstock-1d.sqlite
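
Because the call returns before the data arrives, completion has to be read off the log. The following is a minimal sketch, assuming the log text has already been captured as a string and that the final message keeps the wording shown above. The helper name `find_completion` and the regular expression are illustrative and are not part of QuantRocket.

```python
import re

# Pattern modeled on the final flightlog line shown above. Assumption: the
# exact wording may differ between QuantRocket versions.
COMPLETION_RE = re.compile(
    r"\[(?P<db>[^\]]+)\] Collected (?P<files>\d+) monthly files in (?P<path>\S+)"
)

def find_completion(log_text, db_code):
    """Return the number of monthly files collected for db_code, or None."""
    for line in log_text.splitlines():
        match = COMPLETION_RE.search(line)
        if match and match.group("db") == db_code:
            return int(match.group("files"))
    return None

# Two of the log lines from above, used as sample input.
sample_log = """\
quantrocket.history: INFO [usstock-1d] Collecting US history from 2007 to present
quantrocket.history: INFO [usstock-1d] Collected 160 monthly files in quantrocket.v2.history.usstock-1d.sqlite
"""

files_collected = find_completion(sample_log, "usstock-1d")
print(files_collected)
```

A return value of `None` means the completion message for that database has not appeared yet, so the collection is either still running or has failed.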

Define universe of US ETFs

Next, we download a CSV listing the US ETFs:

In [3]:
from quantrocket.master import download_master_file
download_master_file("us_etfs.csv", vendors="usstock", sec_types="ETF")
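
Before uploading, it can be useful to count the securities in the downloaded file, since that number should match the `provided` count reported when the universe is created. The sketch below uses only the standard library. The helper name `count_securities` is hypothetical, and the in-memory sample (including its `Sid` and `Symbol` column names and values) is placeholder data, not the real contents of us_etfs.csv.

```python
import csv
import io

def count_securities(csv_file):
    """Count data rows (one per security) in an open CSV file, excluding the header."""
    reader = csv.DictReader(csv_file)
    return sum(1 for _ in reader)

# Placeholder data standing in for us_etfs.csv; column names and values are
# made up for illustration.
sample_csv = io.StringIO("Sid,Symbol\nSID_A,AAA\nSID_B,BBB\nSID_C,CCC\n")
n_rows = count_securities(sample_csv)
print(n_rows)

# With the real file, the same helper would be used as:
# with open("us_etfs.csv", newline="") as f:
#     print(count_securities(f))
```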

Then upload the CSV to create the "us-etf" universe:

In [4]:
from quantrocket.master import create_universe
create_universe("us-etf", infilepath_or_buffer="us_etfs.csv")
Out[4]:
{'code': 'us-etf',
 'provided': 3918,
 'inserted': 3918,
 'total_after_insert': 3918}
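
For a newly created universe, the three counts in the response should agree. A minimal sketch of that check follows, assuming `provided` is the number of securities in the file, `inserted` is the number newly added, and `total_after_insert` is the resulting universe size. That reading is inferred from the field names, and the helper name `universe_fully_inserted` is hypothetical.

```python
def universe_fully_inserted(response):
    """Return True if every provided security was inserted into a fresh universe."""
    provided = response["provided"]
    inserted = response["inserted"]
    total = response["total_after_insert"]
    return provided == inserted == total

# The response shown above, copied here as a plain dict.
response = {
    "code": "us-etf",
    "provided": 3918,
    "inserted": 3918,
    "total_after_insert": 3918,
}

ok = universe_fully_inserted(response)
print(ok)
```

If the universe already existed and was being extended, `inserted` could legitimately be smaller than `provided`, so this check only applies to a first-time creation.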