Interactive Strategy Development

We'll develop our strategy logic interactively in a notebook, peeking at the DataFrames as we go, then transfer the code to a .py file for backtesting with Moonshot.

As a reminder, here are the rules of the QVAL strategy as outlined in the Alpha Architect white paper:

  1. Universe selection
    1. Starting universe: all NYSE stocks
    2. Exclude financials, ADRs, REITs
    3. Liquidity screen: select top N percent of stocks by market cap (N=60)
  2. Apply value screen: select cheapest N percent of stocks by enterprise multiple (EV/EBIT) (N=10)
  3. Rank by quality: of the value stocks, select the N percent with the highest quality, as ranked by Piotroski F-Score (N=50)
  4. Apply equal weights
  5. Rebalance portfolio quarterly
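
Before writing any code, it helps to see how the screens compound. A back-of-the-envelope sketch, assuming a hypothetical starting universe of 1,600 stocks (the figures are illustrative only):

```python
# hypothetical universe size, for illustration only
universe_size = 1600

liquid = universe_size * 0.60     # top 60% by market cap -> 960 stocks
cheap = liquid * 0.10             # cheapest 10% by EV/EBIT -> 96 stocks
quality = cheap * 0.50            # top 50% by F-Score -> 48 stocks
```

So each screen prunes aggressively, and the final portfolio is a small fraction of the starting universe.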

Step 1.A-B: Query historical prices

Start by querying historical prices from your Sharadar history database. We specify our universe of NYSE stocks as well as the universes we wish to exclude.

For now we limit ourselves to a couple years of data to make it easier to work with. Later we'll run a backtest using a larger date range.

In [1]:
from quantrocket import get_prices

DB = "sharadar-us-stk-1d"
UNIVERSES = "nyse-stk"
EXCLUDE_UNIVERSES = ["nyse-financials", "nyse-reits", "nyse-adrs"]

prices = get_prices(DB, 
                    start_date="2014-01-01",
                    end_date="2016-01-01", 
                    universes=UNIVERSES,
                    exclude_universes=EXCLUDE_UNIVERSES, 
                    fields=["Close", "Volume"])
prices.tail()
Out[1]:
Sid                FIBBG0000018G2  FIBBG000001J87  ...  QA000000021599  QA000000021660
Field  Date
Volume 2015-12-24           344.0             0.0  ...         19740.0       2487841.0
       2015-12-28             0.0             0.0  ...         21078.0       4331514.0
       2015-12-29           970.0             0.0  ...         29483.0       5834458.0
       2015-12-30             6.0             0.0  ...         22549.0       4393923.0
       2015-12-31            15.0             0.0  ...         21686.0       6288810.0

5 rows × 1600 columns

Step 1.C: Filter by dollar volume

The QVAL white paper calls for limiting the universe to the top 60% of stocks by market cap. Although market cap is available in the Sharadar fundamental data, here we will use dollar volume as a proxy for market cap.

The code below will compute daily ranks by dollar volume and give us a boolean mask indicating which stocks have adequate dollar volume.

In [2]:
closes = prices.loc["Close"]
volumes = prices.loc["Volume"]

# calculate 90 day average dollar volume
avg_dollar_volumes = (closes * volumes).rolling(90).mean()

# rank biggest to smallest; pct=True gives percentile ranks between 0 and 1
dollar_volume_ranks = avg_dollar_volumes.rank(axis=1, ascending=False, pct=True)

have_adequate_dollar_volumes = dollar_volume_ranks <= 0.60
have_adequate_dollar_volumes.tail()
Out[2]:
Sid         FIBBG0000018G2  FIBBG000001J87  ...  QA000000021599  QA000000021660
Date
2015-12-24           False           False  ...           False            True
2015-12-28           False           False  ...           False            True
2015-12-29           False           False  ...           False            True
2015-12-30           False           False  ...           False            True
2015-12-31           False           False  ...           False            True

5 rows × 1600 columns

We'll use this filter in the next step.
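To see how the percentile ranks behave, here is a toy example with five hypothetical stocks on a single day. With `ascending=False`, the largest dollar volume receives the smallest percentile rank, so `rank <= 0.60` keeps the biggest 60% of names:

```python
import pandas as pd

# hypothetical average dollar volumes for five stocks on one day
avg_dollar_volumes = pd.DataFrame(
    {"A": [900.0], "B": [500.0], "C": [300.0], "D": [100.0], "E": [50.0]},
    index=pd.to_datetime(["2015-12-31"]))

# percentile ranks: A=0.2, B=0.4, C=0.6, D=0.8, E=1.0
ranks = avg_dollar_volumes.rank(axis=1, ascending=False, pct=True)

have_adequate = ranks <= 0.60   # keeps A, B, C; drops D, E
```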

Step 2: Apply value screen

Next, we use Sharadar fundamentals to identify the cheapest 10% of stocks as measured by the enterprise multiple (EV/EBIT).

The function get_sharadar_fundamentals_reindexed_like returns a DataFrame of fundamentals matching the dates and sids (security IDs) of the input DataFrame (in this case closes). The resulting DataFrame gives us the latest fundamental values as of any given date.

In [3]:
from quantrocket.fundamental import get_sharadar_fundamentals_reindexed_like

# Request EV/EBIT. The dimension "ART" (= "As reported - trailing twelve months") excludes restatements.
fundamentals = get_sharadar_fundamentals_reindexed_like(closes, fields=["EVEBIT"], dimension="ART")
enterprise_multiples = fundamentals.loc["EVEBIT"]

# Ignore negative enterprise multiples, which indicate negative earnings
enterprise_multiples = enterprise_multiples.where(enterprise_multiples > 0)

enterprise_multiples.tail()
Out[3]:
Sid         FIBBG0000018G2  FIBBG000001J87  ...  QA000000021599  QA000000021660
Date
2015-12-24             NaN             NaN  ...             8.0            16.0
2015-12-28             NaN             NaN  ...             8.0            16.0
2015-12-29             NaN             NaN  ...             8.0            16.0
2015-12-30             NaN             NaN  ...             8.0            16.0
2015-12-31             NaN             NaN  ...             8.0            16.0

5 rows × 1600 columns

We identify value stocks by ranking on enterprise multiple, but we only apply the rankings to stocks with adequate dollar volume:

In [4]:
value_ranks = enterprise_multiples.where(have_adequate_dollar_volumes).rank(axis=1, ascending=True, pct=True)
are_value_stocks = value_ranks <= (0.10)
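
The `.where()` / `.rank()` combination deserves a quick illustration: `.where(mask)` replaces excluded stocks with NaN, and `rank()` skips NaNs entirely, so illiquid names can never sneak into the value decile. A toy example with hypothetical multiples:

```python
import pandas as pd

multiples = pd.DataFrame(
    {"A": [5.0], "B": [8.0], "C": [3.0], "D": [20.0]},
    index=pd.to_datetime(["2015-12-31"]))
liquid = pd.DataFrame(
    {"A": [True], "B": [True], "C": [False], "D": [True]},
    index=multiples.index)

# C is the cheapest overall but is illiquid, so it gets NaN and no rank;
# among the rest: A=1/3, B=2/3, D=1.0
ranks = multiples.where(liquid).rank(axis=1, ascending=True, pct=True)
```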

Step 3: Filter by quality

The next step is to rank the value stocks by quality and select the top 50%. To do this we must calculate the Piotroski F-Score.

Piotroski F-Score

First, we query the relevant indicators to calculate the F-Score:

In [5]:
fundamentals = get_sharadar_fundamentals_reindexed_like(
    closes,
    dimension="ART", # As-reported trailing twelve months reports
    fields=[
        "ROA", # Return on assets
        "ASSETS", # Total Assets
        "NCFO", # Net Cash Flow from Operations
        "DE", # Debt to Equity Ratio
        "CURRENTRATIO", # Current ratio
        "SHARESWA", # Outstanding shares
        "GROSSMARGIN", # Gross margin
        "ASSETTURNOVER", # Asset turnover
])

return_on_assets = fundamentals.loc["ROA"]
total_assets = fundamentals.loc["ASSETS"]
operating_cash_flows = fundamentals.loc["NCFO"]
leverages = fundamentals.loc["DE"]
current_ratios = fundamentals.loc["CURRENTRATIO"]
shares_out = fundamentals.loc["SHARESWA"]
gross_margins = fundamentals.loc["GROSSMARGIN"]
asset_turnovers = fundamentals.loc["ASSETTURNOVER"]

Many Piotroski F-Score components compare current values to the previous period's values, so next we need DataFrames of the previous period's values. To do this in pandas, we identify which dates represent new fiscal periods, shift the previous period's value into the new period, and fill forward:

In [6]:
fundamentals = get_sharadar_fundamentals_reindexed_like(
    closes,
    dimension="ART", # As-reported trailing twelve month reports
    fields=["REPORTPERIOD"])

# get a boolean mask of the first day of each newly reported fiscal period
fiscal_periods = fundamentals.loc["REPORTPERIOD"]
are_new_fiscal_periods = fiscal_periods != fiscal_periods.shift()

# shift the ROAs forward one fiscal period by (1) shifting the ratios one day,
# (2) keeping only the ones that fall on the first day of the newly reported
# fiscal period, and (3) forward-filling
previous_return_on_assets = return_on_assets.shift().where(are_new_fiscal_periods).ffill()

# repeat for the other indicators
previous_leverages = leverages.shift().where(are_new_fiscal_periods).ffill()
previous_current_ratios = current_ratios.shift().where(are_new_fiscal_periods).ffill()
previous_shares_out = shares_out.shift().where(are_new_fiscal_periods).ffill()
previous_gross_margins = gross_margins.shift().where(are_new_fiscal_periods).ffill()
previous_asset_turnovers = asset_turnovers.shift().where(are_new_fiscal_periods).ffill()
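
The shift/where/ffill idiom is easier to see on a toy Series. Assume (hypothetically) a daily ROA series that stays constant within each fiscal period, two days per period:

```python
import pandas as pd

fiscal_periods = pd.Series(["Q1", "Q1", "Q2", "Q2", "Q3", "Q3"])
roas = pd.Series([0.05, 0.05, 0.07, 0.07, 0.04, 0.04])

# True on the first day of each newly reported period
are_new = fiscal_periods != fiscal_periods.shift()

# previous period's ROA, visible on every day of the current period
previous_roas = roas.shift().where(are_new).ffill()
# -> [NaN, NaN, 0.05, 0.05, 0.07, 0.07]
```

During the Q2 days the previous ROA is Q1's 0.05, and during Q3 it is Q2's 0.07; the first period has no predecessor, so it stays NaN.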

Now we can calculate the F-Score components; each resulting component is a DataFrame of booleans:

In [7]:
have_positive_return_on_assets = return_on_assets > 0
have_positive_operating_cash_flows = operating_cash_flows > 0
have_increasing_return_on_assets = return_on_assets > previous_return_on_assets
have_more_cash_flow_than_incomes = operating_cash_flows / total_assets > return_on_assets
have_decreasing_leverages = leverages < previous_leverages
have_increasing_current_ratios = current_ratios > previous_current_ratios
have_no_new_shares = shares_out <= previous_shares_out
have_increasing_gross_margins = gross_margins > previous_gross_margins
have_increasing_asset_turnovers = asset_turnovers > previous_asset_turnovers

Lastly, we convert the booleans to integers and sum to get the F-Score, which is a number between 0 and 9:

In [8]:
f_scores = (
    have_positive_return_on_assets.astype(int)
    + have_positive_operating_cash_flows.astype(int)
    + have_increasing_return_on_assets.astype(int)
    + have_more_cash_flow_than_incomes.astype(int)
    + have_decreasing_leverages.astype(int)
    + have_increasing_current_ratios.astype(int)
    + have_no_new_shares.astype(int)
    + have_increasing_gross_margins.astype(int)
    + have_increasing_asset_turnovers.astype(int)
)
f_scores.tail()
Out[8]:
Sid         FIBBG0000018G2  FIBBG000001J87  ...  QA000000021599  QA000000021660
Date
2015-12-24               0               0  ...               7               5
2015-12-28               0               0  ...               7               5
2015-12-29               0               0  ...               7               5
2015-12-30               0               0  ...               7               5
2015-12-31               0               0  ...               7               5

5 rows × 1600 columns
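
One subtlety to be aware of: in pandas, comparisons involving NaN evaluate to False, so a stock with missing fundamentals silently scores 0 on that component rather than propagating the NaN. A minimal sketch with hypothetical values:

```python
import numpy as np
import pandas as pd

roas = pd.Series([0.05, np.nan])
previous_roas = pd.Series([0.03, 0.04])

# NaN > 0.04 evaluates to False, not NaN
have_increasing_roas = roas > previous_roas

print(have_increasing_roas.astype(int).tolist())  # [1, 0]
```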

Rank by F-Score

Now that we have the F-scores we can rank the value stocks by quality and select the top 50%. This gives us our DataFrame of long signals, which by convention we cast from boolean to int.

In [9]:
quality_ranks = f_scores.where(are_value_stocks).rank(axis=1, ascending=False, pct=True)
long_signals = quality_ranks <= (0.50)
long_signals = long_signals.astype(int)

Step 4: Apply equal weights

The QVAL strategy trades an equal-weighted portfolio. By convention, for an unlevered strategy the daily weights should add up to 1 (=100% invested), so we divide each day's signals by the number of signals to get the individual position weights:

In [10]:
daily_signal_counts = long_signals.abs().sum(axis=1)
daily_signal_counts.tail()
Out[10]:
Date
2015-12-24    49
2015-12-28    49
2015-12-29    49
2015-12-30    49
2015-12-31    49
dtype: int64
In [11]:
weights = long_signals.div(daily_signal_counts, axis=0).fillna(0)
weights.where(weights!=0).stack().tail()
Out[11]:
Date        Sid           
2015-12-31  FIBBG0029SNR63    0.020408
            FIBBG002B917C3    0.020408
            FIBBG00BN961G4    0.020408
            FIBBG00GTKSSK4    0.020408
            FIBBG00R2NHQ65    0.020408
dtype: float64
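
As a sanity check, each day's weights should sum to 1 whenever there is at least one signal. A toy example with hypothetical signals for three stocks over two days:

```python
import pandas as pd

long_signals = pd.DataFrame(
    {"A": [1, 1], "B": [0, 1], "C": [1, 0]},
    index=pd.to_datetime(["2015-12-30", "2015-12-31"]))

counts = long_signals.abs().sum(axis=1)             # 2 signals per day
weights = long_signals.div(counts, axis=0).fillna(0)
# each row sums to 1.0, with 0.5 in each held name and 0.0 elsewhere
```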

Step 5: Rebalance quarterly

Currently we have a DataFrame of signals that change every day, but the QVAL strategy calls for quarterly rebalancing.

To accomplish this with pandas, we resample the DataFrame to quarterly, take the last signal of the quarter, then reindex back to daily and fill forward:

In [12]:
# Resample daily to quarterly, taking the last day's signal
# For pandas offset aliases, see https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases
weights = weights.resample("Q").last()

# Reindex back to daily and fill forward
weights = weights.reindex(closes.index, method="ffill")
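
A toy Series shows the effect: daily values collapse to the quarter-end value, which then carries forward until the next quarter-end. (The dates below are hypothetical; note that newer pandas versions prefer the "QE" offset alias over "Q".)

```python
import pandas as pd

dates = pd.date_range("2015-09-28", "2015-10-05")
daily = pd.Series(range(len(dates)), index=dates, dtype=float)

# keep only the last value of each quarter (2015-09-30 and 2015-12-31)
quarterly = daily.resample("Q").last()

# back to daily: days before the first quarter-end are NaN; days after
# 2015-09-30 hold its value until the next quarter-end
rebalanced = quarterly.reindex(dates, method="ffill")
```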

Step 6: Positions and returns

The DataFrame of weights represents what we want to own, as calculated at the end of the day. Assuming we enter positions the next day, we simply shift the weights forward to simulate our positions:

In [13]:
positions = weights.shift()

To calculate the gross return (before costs), we multiply each security's percent change over the period by the size of the position.

Since positions represents when we enter the position, we must shift positions forward one more period to mark the end of the holding period: the percent change from one close to the next accrues to the position entered at the earlier close, not to the position we are just entering.

In [14]:
position_ends = positions.shift()
gross_returns = closes.pct_change() * position_ends

TIP: Proper alignment is important to avoid lookahead bias. Inspect your DataFrames as you go to validate your logic. If your DataFrames are large, as in this strategy, creating some toy DataFrames can be a good way to see what's happening:

In [15]:
import pandas as pd
toy_closes = pd.Series([50,100,50])
toy_pct_changes = toy_closes.pct_change()
toy_positions = pd.Series([0,1,0])
toy_position_ends = toy_positions.shift()
toy_returns = toy_pct_changes * toy_position_ends
pd.concat({
    "close": toy_closes,
    "pct_change": toy_pct_changes,
    "position": toy_positions,
    "position_ends": toy_position_ends,
    "return": toy_returns}, 
    axis=1)
Out[15]:
   close  pct_change  position  position_ends  return
0     50         NaN         0            NaN     NaN
1    100         1.0         1            0.0     0.0
2     50        -0.5         0            1.0    -0.5

Next Up

Part 4: Moonshot Backtest