Hey Everyone,
Today I'm going to implement two concepts in the research notebook: (a) testing for stationarity and (b) normalizing data. Both are valuable techniques to understand, and using them can improve an Alpha’s ability to forecast. Most statistical time-series forecasting methods assume that the series being used is stationary or approximately stationary, whether on its own or after a transformation.
Briefly stated, a stationary series is one whose joint probability distribution remains the same over time. Put another way, it is a process whose mean, variance, autocorrelation, etc. are all constant over time.
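As a rough illustration of "constant over time", here is a minimal sketch that splits a series into windows and compares the mean and variance of each window. The helper name window_stats and the sample data are my own, made up for illustration. This is only an informal eyeball check, not a statistical test.

```python
from statistics import fmean, pvariance

def window_stats(series, n_windows):
    """Split series into n_windows equal chunks and return (mean, variance) per chunk."""
    size = len(series) // n_windows
    chunks = [series[i * size:(i + 1) * size] for i in range(n_windows)]
    return [(fmean(chunk), pvariance(chunk)) for chunk in chunks]

# Made-up series whose level shifts halfway through: the window means differ,
# which hints that the series is not stationary.
shifted = [1, 1, 1, 1, 5, 5, 5, 5]
print(window_stats(shifted, 2))
```

If the per-window numbers drift a lot from one window to the next, that is a sign the series probably needs a transformation before a stationarity-based method is applied.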
Stationary series are useful because we can assume that their current statistical properties will remain constant. Given stationarity, we know approximately what future values will be and what sort of error will be present in our forecast. This naturally leads us to think of taking advantage of the mean-reverting behavior of stationary series for trading purposes.
Unfortunately, equity prices are generally not stationary. However, we can approximate a stationary series by using the returns instead of the prices (a form of "differencing" the series). This preserves the sequential nature of the series and other important properties, and it is usually a sufficient transformation to render a time series stationary.
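To make the transformation concrete, here is a small plain-Python sketch, assuming a simulated price path rather than real market data. The helper pct_returns is a hypothetical stand-in for what pct_change() does in pandas, and the drift and volatility numbers are arbitrary.

```python
import random
from statistics import fmean

def pct_returns(prices):
    """Percentage change between consecutive prices (one element shorter than prices)."""
    return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]

# Simulated price path (assumption: small random percentage moves each step)
random.seed(42)
prices = [100.0]
for _ in range(1000):
    prices.append(prices[-1] * (1 + random.gauss(0.0005, 0.01)))

returns = pct_returns(prices)
half = len(returns) // 2

# Compare the first and second half of each series
print("price means: ", fmean(prices[:half]), fmean(prices[half:]))
print("return means:", fmean(returns[:half]), fmean(returns[half:]))
```

The price level is free to wander, so its two half-sample means have no reason to agree, while the returns are drawn from the same distribution at every step.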
# Import our custom functions
from StationarityAndZScores import *
# Import the Liquid ETF Universe helper methods
from QuantConnect.Data.UniverseSelection import *
# Initialize QuantBook and the Sector ETFs
qb = QuantBook()
symbols = [x for x in LiquidETFUniverse.SP500Sectors]
# Fetch history and returns
history = qb.History(symbols, 500, Resolution.Hour)
returns = history.unstack(level = 1).close.transpose().pct_change().dropna()
To ensure that we can act on our assumption of stationarity, we need to test the data. One of the most common tests for stationarity is the augmented Dickey-Fuller (ADF) test. An ADF test operates on the null hypothesis that a unit root is present in a time-series sample (i.e. the time series is not stationary). Therefore, when we apply the test to our transformed data, we look for a p-value of less than 0.05 so that we can reject the null hypothesis at the 5% level in favor of the alternative: the series is stationary.
(I won't go into the details of how an ADF test works, but you can find plenty of information about it online and in other resources if you want to dive further into the mathematics.)
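import pandas as pd
from statsmodels.tsa.stattools import adfuller

def TestStationartiy(returns):
    # Return pandas Series with True/False for each symbol (True when the ADF p-value is below 0.05)
    return pd.Series([adfuller(values)[1] < 0.05 for column, values in returns.items()], index = returns.columns)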
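For anyone curious about the idea behind the test, here is a rough sketch of the simple (non-augmented) Dickey-Fuller statistic in plain Python. It is not what adfuller does internally: the real test adds lagged difference terms and computes a proper p-value. The function name dickey_fuller_t and the synthetic series are my own, and the -2.86 threshold is the approximate large-sample 5% critical value for a regression with a constant and no trend.

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic on b in the regression dy_t = a + b * y_{t-1} + e_t."""
    x = series[:-1]  # lagged levels y_{t-1}
    dy = [series[i + 1] - series[i] for i in range(len(series) - 1)]
    n = len(x)
    x_mean = sum(x) / n
    dy_mean = sum(dy) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (di - dy_mean) for xi, di in zip(x, dy))
    b = sxy / sxx                      # OLS slope
    a = dy_mean - b * x_mean           # OLS intercept
    rss = sum((di - a - b * xi) ** 2 for xi, di in zip(x, dy))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return b / se_b

# Synthetic data: white noise (stationary) and a random walk (unit root)
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(500)]
walk, level = [], 0.0
for _ in range(500):
    level += random.gauss(0, 1)
    walk.append(level)

CRITICAL_5PCT = -2.86  # approximate; Dickey-Fuller tables, not Student's t
for name, s in (("white noise", noise), ("random walk", walk)):
    t = dickey_fuller_t(s)
    verdict = "reject unit root" if t < CRITICAL_5PCT else "cannot reject unit root"
    print(name, round(t, 2), verdict)
```

The key point is that a stationary series gets pulled back toward its mean, so the slope b is clearly negative, whereas for a random walk b sits near zero.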
We can also normalize the returns data and trade based on the z-score, which will give us an idea of how far of an outlier a given return is relative to its historical mean and variance.
def GetZScores(returns):
    # Return pandas DataFrame containing z-scores
    return returns.subtract(returns.mean()).div(returns.std())
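Here is the same calculation spelled out for a single list of numbers, as a minimal sketch in plain Python. The function name z_scores and the sample returns are made up for illustration. It uses the sample standard deviation (dividing by n - 1), which is also the pandas default for std().

```python
from statistics import mean, stdev

def z_scores(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample std (n - 1), matching the pandas default
    return [(v - mu) / sigma for v in values]

# Made-up hourly returns, for illustration only
hourly_returns = [0.01, -0.02, 0.005, 0.0, -0.03]
print(z_scores(hourly_returns)[-1])  # z-score of the most recent return
```

A z-score of -1 means the return was one standard deviation below the historical mean of that series.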
In the research notebook, we can use these functions we've written to manipulate the data.
# Test for stationarity
stationarity = TestStationartiy(returns)
# Get z-score
z_scores = GetZScores(returns)
Now we're able to examine the data, transform it, test for stationarity, and create z-scores to use in our actual algorithm. In this demonstration strategy, we'll arbitrarily choose to enter a position when the z-score is below -1, i.e. when the most recent return is more than one standard deviation below its historical mean. Similarly, we'll exit a position when the z-score is above 1.
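Stripped of the QuantConnect plumbing, the decision rule can be sketched as a small pure function. The name signal_for and the string return values are hypothetical and only stand in for the Insight directions; the strict comparisons mirror the ones in TransformTestTrade.

```python
def signal_for(z_score, invested, threshold=1.0):
    """Map the latest z-score to a signal: 'up' to enter, 'flat' to exit, None to do nothing."""
    if z_score < -threshold:
        return "up"      # return is unusually low: expect reversion upward
    if z_score > threshold and invested:
        return "flat"    # return is unusually high: close the existing position
    return None          # inside the band, or nothing to exit

print(signal_for(-1.4, invested=False))  # up
print(signal_for(1.3, invested=True))    # flat
print(signal_for(1.3, invested=False))   # None
```

Keeping the rule separate like this makes it easy to try other thresholds in the notebook before touching the algorithm.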
def TransformTestTrade(self):
    qb = self
    symbols = [x.Symbol for x in qb.ActiveSecurities.Values]
    # Copy and paste from research notebook
    # -----------------------------------------------------------------------------
    # Fetch history and returns
    history = qb.History(symbols, 500, Resolution.Hour)
    returns = history.unstack(level = 1).close.transpose().pct_change().dropna()
    # Test for stationarity
    stationarity = TestStationartiy(returns)
    # Get z-scores
    z_scores = GetZScores(returns)
    # -----------------------------------------------------------------------------
    insights = []
    # Iterate over symbols
    for symbol, value in stationarity.items():
        # Only emit Insights for those whose returns exhibit stationary behavior
        if value:
            # Get most recent z_score
            z_score = z_scores[symbol].tail(1).values[0]
            if z_score < -1:
                insights.append(Insight.Price(symbol, timedelta(1), InsightDirection.Up))
            elif z_score > 1:
                # Only exit if we actually hold a position
                if self.Portfolio[symbol].Invested:
                    insights.append(Insight.Price(symbol, timedelta(1), InsightDirection.Flat))
    self.EmitInsights(insights)
The function above implements our trading strategy with minimal additions to the code we already wrote in the research notebook. To ensure that we trade or emit Insights frequently enough, we schedule this function to run every day, 5 minutes after market open.
def Initialize(self):
    self.SetStartDate(2018, 11, 1)  # Set Start Date
    self.SetCash(1000000)  # Set Strategy Cash
    self.SetBrokerageModel(AlphaStreamsBrokerageModel())
    self.SetBenchmark('SPY')
    self.SetExecution(ImmediateExecutionModel())
    self.SetPortfolioConstruction(EqualWeightingPortfolioConstructionModel())
    self.UniverseSettings.Resolution = Resolution.Minute
    self.SetUniverseSelection(LiquidETFUniverse())
    self.AddEquity('XLE')
    self.Schedule.On(self.DateRules.EveryDay('XLE'), self.TimeRules.AfterMarketOpen('XLE', 5), self.TransformTestTrade)
Even though this isn't the best strategy ever written, I was able to try out some useful statistical techniques for analyzing data and then quickly put them into practice to test my ideas. Hopefully, this helps you do the same!
Apollos Hill
You sir have raised the standard of human intelligence
Ivan Baev
I think it should be used in a risk estimation module, as an additional check (out of many) that the market is healthy and we are not getting false-positive insights.
Bala Vignesh
Is there a way to select symbols in a similar way from the QC500UniverseSelectionModel?
Rahul Chowdhury
Hey Bala,
You can create your own custom UniverseSelectionModels using coarse and fine selection. Learn more here.
If you want to see how QC500UniverseSelectionModel works, you can check out the source on GitHub.
Jack Simonson