Abstract

In this tutorial, we test four new alternative data strategies that base their trading decisions on the US Regulatory Alerts dataset. The first strategy capitalizes on movement in the healthcare sector in response to announcements from the U.S. Food and Drug Administration (FDA). The second strategy captures momentum in the Bitcoin-USD trading pair in reaction to new crypto regulations. The third strategy seeks to exploit trading patterns in SPY that form after specific types of regulatory alerts. The fourth strategy is a country rotation strategy that uses a novel natural language processing (NLP) approach to detect the sentiment toward country exchange-traded funds (ETFs) without introducing look-ahead bias. The results show that three of the four strategies can outperform their respective benchmarks over the backtest period; the SPY alert-type strategy does not.

Background

NLP is a subfield of artificial intelligence that strives to process unstructured text and understand its meaning. In most NLP trading strategies, the developer provides a set of pre-selected phrases and their sentiment scores, which usually introduces look-ahead bias into the strategy. In this algorithm, we avoid this bias by assigning sentiment scores to words on the fly, based on the returns of the security that followed their past appearances.

Country rotation is a strategy where we move capital among a set of countries in an effort to outperform the overall market. The idea is that each country's ETF is an independent security and that we can forecast the future financial performance of each country from the sentiment of its regulatory alerts. If we adjust the exposure of our portfolio to align with the sentiment toward each country, we can position the portfolio to benefit from positive and negative regulatory changes while diversifying on an international scale.

Method

Let’s review how we can implement these four strategies with the LEAN algorithmic trading engine.

Strategy #1 - FDA Announcements

To implement this strategy, we check the titles of the regulatory alert articles every day. If an article title contains the “FDA” acronym, we buy the healthcare sector ETF, XLV. Otherwise, we short XLV.
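As a rough illustration of the rule above, the daily decision can be sketched as a pure function over the day's article titles. The function name `fda_signal`, the whole-word and case-sensitive matching, and the choice to treat a day with no articles as "otherwise" are assumptions made for this sketch, not details confirmed by the strategy's source code.

```python
import re
from typing import Iterable


def fda_signal(titles: Iterable[str]) -> int:
    """Return +1 (long XLV) if any title contains the FDA acronym, else -1 (short XLV)."""
    for title in titles:
        # Split the title into alphabetic tokens so that "FDA" only matches
        # as a standalone acronym (assumption: case-sensitive, whole word).
        tokens = re.findall(r"[A-Za-z]+", title)
        if "FDA" in tokens:
            return 1
    # No title mentioned the FDA (assumption: this includes days with no articles).
    return -1
```

In a LEAN algorithm, the returned value could be passed straight to a call such as `self.set_holdings(xlv_symbol, signal)`, where `xlv_symbol` is a hypothetical variable that holds the XLV symbol.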

Strategy #2 - Crypto Announcements

To implement this strategy, we read the title and summary of the regulatory alert articles every day. If an article contains the word “Crypto” and the BTCUSD trading pair is trending up, we buy BTCUSD. Otherwise, we short BTCUSD.
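The text does not define "trending up", so the sketch below assumes one simple rule: the latest close is above the average close of a lookback window. The dictionary keys `title` and `summary`, the 30-day default, and the case-insensitive substring match (which also catches words like "cryptocurrency") are likewise assumptions for illustration only.

```python
from typing import Dict, Iterable, Sequence


def is_trending_up(closes: Sequence[float], lookback: int = 30) -> bool:
    """Assumed trend rule: the latest close is above the mean of the last `lookback` closes."""
    if len(closes) < lookback:
        return False  # Not enough history to judge the trend.
    window = closes[-lookback:]
    return closes[-1] > sum(window) / lookback


def crypto_signal(articles: Iterable[Dict[str, str]],
                  closes: Sequence[float],
                  lookback: int = 30) -> int:
    """Return +1 (long BTCUSD) on a crypto mention during an uptrend, else -1 (short)."""
    mentions_crypto = any(
        "crypto" in f"{article['title']} {article['summary']}".lower()
        for article in articles
    )
    return 1 if mentions_crypto and is_trending_up(closes, lookback) else -1
```

Because the signal depends on `lookback`, a different window length can flip the trade on the same day, which is one way the sensitivity to the lookback window can arise.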

Strategy #3 - Trading Patterns from Individual Alert Types

To implement this strategy, we fit a model every Sunday that uses the AlertType of each regulatory alert to predict the expected returns of SPY over the following 24 hours. Each day, we make predictions for the future returns of SPY. If the aggregated prediction is positive, we buy SPY. Otherwise, we short SPY.
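The text does not say which model is fit, so the following sketch swaps in the simplest candidate: the mean SPY return that followed each alert type in the training sample. The function names and the alert type labels used in the usage note are hypothetical, and treating unseen alert types as contributing zero is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def fit_alert_type_model(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Map each alert type to the mean SPY return over the 24 hours that followed it.

    `samples` holds (alert_type, future_return) pairs from the training window.
    """
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for alert_type, future_return in samples:
        totals[alert_type] += future_return
        counts[alert_type] += 1
    return {alert_type: totals[alert_type] / counts[alert_type] for alert_type in totals}


def spy_signal(model: Dict[str, float], todays_alert_types: Iterable[str]) -> int:
    """Return +1 (long SPY) if the summed predictions are positive, else -1 (short SPY)."""
    # Alert types the model has never seen contribute nothing (assumption).
    aggregated = sum(model.get(alert_type, 0.0) for alert_type in todays_alert_types)
    return 1 if aggregated > 0 else -1
```

For example, a model fit on `[("Rule", 0.01), ("Rule", 0.03), ("Notice", -0.02)]` predicts roughly 0.02 for "Rule" and -0.02 for "Notice", so a day that only contains "Rule" alerts yields a long signal.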

Strategy #4 - NLP Country Rotation

To implement this strategy, we first subscribe to US Regulatory Alerts.

self.dataset_symbol = self.add_data(RegalyticsRegulatoryArticles, "REG").symbol

Next, we gather a set of country ETFs. To avoid selection bias, we include as many countries in the universe as possible. To get the ETF that represents each country, we opened the ETF Country Exposure Tool on the ETF DB website, selected each country that’s included in the US Regulatory Alerts dataset, and then chose the ETF with the largest weight in that country. The following table shows the ETFs in the universe:

Country                      Ticker  Description
Argentina                    ARGT    Global X MSCI Argentina ETF
Australia                    EWA     iShares MSCI-Australia ETF
Austria                      EWO     iShares MSCI Austria ETF
Belgium                      EWK     iShares MSCI Belgium ETF
Brazil                       EWZS    iShares MSCI Brazil Small-Cap ETF
Canada                       FLCA    Franklin FTSE Canada ETF
China                        CHIU    Global X MSCI China Utilities ETF
Colombia                     GXG     Global X MSCI Colombia ETF
Croatia                      FM      iShares MSCI Frontier and Select EM ETF
Cyprus                       SIL     Global X Silver Miners ETF
Czech Republic               NLR     VanEck Uranium+Nuclear Energy ETF
Denmark                      EDEN    iShares MSCI Denmark ETF
Egypt                        EGPT    VanEck Egypt Index ETF
Estonia                      FM      iShares MSCI Frontier and Select EM ETF
Finland                      EFNL    iShares MSCI Finland ETF
France                       FLFR    Franklin FTSE France ETF
Germany                      FLGR    Franklin FTSE Germany ETF
Greece                       GREK    Global X MSCI Greece ETF
Hong Kong                    EWH     iShares MSCI Hong Kong ETF
Hungary                      CRAK    VanEck Oil Refiners ETF
Indonesia                    EIDO    iShares MSCI Indonesia ETF
Ireland                      EIRL    iShares MSCI Ireland ETF
Israel                       EIS     iShares MSCI Israel ETF
Italy                        FLIY    Franklin FTSE Italy ETF
Japan                        DFJ     WisdomTree Japan SmallCap Dividend Fund
Libya                        BRF     VanEck Brazil Small-Cap ETF
Luxembourg                   SLX     VanEck Steel ETF
Malaysia                     EWM     iShares MSCI Malaysia ETF
Malta                        BETZ    Roundhill Sports Betting & iGaming ETF
Mexico                       FLMX    Franklin FTSE Mexico ETF
Netherlands                  EWN     iShares MSCI Netherlands ETF
Norway                       NORW    Global X MSCI Norway ETF
Pakistan                     PAK     Global X MSCI Pakistan ETF
Poland                       EPOL    iShares MSCI Poland ETF
Portugal                     PGAL    Global X MSCI Portugal ETF
Qatar                        QAT     iShares MSCI Qatar ETF
Republic of the Philippines  EPHE    iShares MSCI Philippines ETF
Romania                      FM      iShares MSCI Frontier and Select EM ETF
Russia                       FLRU    Franklin FTSE Russia ETF
Saudi Arabia                 FLSA    Franklin FTSE Saudi Arabia ETF
Singapore                    EWS     iShares MSCI Singapore ETF
South Korea                  FLKR    Franklin FTSE South Korea ETF
Spain                        EWP     iShares MSCI Spain ETF
Sri Lanka                    FM      iShares MSCI Frontier and Select EM ETF
Sweden                       EWD     iShares MSCI Sweden ETF
Switzerland                  EWL     iShares MSCI Switzerland ETF
Taiwan                       FLTW    Franklin FTSE Taiwan ETF
Thailand                     THD     iShares MSCI Thailand ETF
The Bahamas                  RNSC    First Trust Small Cap US Equity Select ETF
Turkey                       TUR     iShares MSCI Turkey ETF
Ukraine                      TLTE    FlexShares Morningstar Emerging Markets Factor Tilt Index
United Arab Emirates         UAE     iShares MSCI UAE ETF
United Kingdom               EWUS    iShares MSCI United Kingdom Small-Cap ETF
United States                SPY     SPDR S&P 500 ETF
Vietnam                      VNAM    Global X MSCI Vietnam ETF

Using the ETFs in the preceding table, we create a static universe of country ETFs.

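# Map each country name to a CountryETF object that holds its subscribed ETF symbol.
# `etf_by_country` (without `self.`) is the country-to-ticker mapping from the preceding table.
self.etf_by_country = {}
for country, etf_ticker in etf_by_country.items():
    symbol = self.add_equity(etf_ticker, Resolution.DAILY).symbol
    self.etf_by_country[country] = CountryETF(country, symbol)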

To ensure the model is fit on the most recent regulatory alerts, we train it at the beginning of the algorithm and schedule training sessions to re-fit it at the end of every month.

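# Re-fit the model at 23:00 at the end of every month.
self.train(self.date_rules.month_end(), self.time_rules.at(23, 0), self.train_model)
# Fit the initial model immediately so the algorithm can trade from the start.
self.train_model()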

The train_model method makes history requests to gather the daily returns of each country ETF and all the regulatory alerts over the trailing 90 days. Each regulatory alert can be tagged with the name of the country from which it’s released, so we classify the regulatory alerts into their respective countries and then train a unique model for each country. To train the model, we use the following procedure:

  1. Get the words of each article title.
  2. Tokenize the title and drop filler words like “the”, “a”, and “an”.
  3. Create a dictionary that maps each word to the expected future return of the country ETF over the following 24 hours.
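The three steps above can be sketched for a single country as follows. This is a minimal illustration, not the algorithm's actual `train_model` code: it assumes "expected future return" means the average return that followed the titles containing each word, and the function names, the regular-expression tokenizer, and the three-word filler list are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# The filler words named in the text; a real list would likely be longer.
FILLER_WORDS = {"the", "a", "an"}


def tokenize(title: str) -> List[str]:
    """Lowercase the title, split it into alphabetic words, and drop filler words."""
    words = re.findall(r"[a-z]+", title.lower())
    return [word for word in words if word not in FILLER_WORDS]


def train_word_scores(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Map each word to the mean ETF return over the 24 hours after titles containing it.

    `samples` holds (title, future_return) pairs for one country.
    """
    returns_by_word: Dict[str, List[float]] = defaultdict(list)
    for title, future_return in samples:
        # Count each word once per title so repeated words do not dominate.
        for word in set(tokenize(title)):
            returns_by_word[word].append(future_return)
    return {word: sum(rets) / len(rets) for word, rets in returns_by_word.items()}


def title_sentiment(word_scores: Dict[str, float], title: str) -> float:
    """Score a new title by summing the scores of its known words (unknown words add 0)."""
    return sum(word_scores.get(word, 0.0) for word in tokenize(title))
```

Because each word's score comes only from returns that were already observed in the training window, scoring a new title this way does not use any information from after the title was published.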

Each day, we then parse all the regulatory alerts to find the sentiment of each country. We drop countries whose ETF is too illiquid to trade, meaning our target position size is greater than 1% of the ETF's average dollar volume over the last 3 months. For each remaining country with positive sentiment, we enter a long position. For each remaining country with negative sentiment, we enter a short position. Finally, we allocate an equal portion of the portfolio to each country ETF and scale the position sizes to achieve 1x leverage.
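One way to turn the daily sentiment scores into portfolio weights is sketched below. It assumes the liquidity check is repeated after each removal, since dropping an ETF raises the equal-weight position size of the ones that remain, and it assumes countries with exactly zero sentiment are skipped. The function name, argument names, and the dollar volumes in the usage note are hypothetical.

```python
from typing import Dict


def target_weights(sentiment_by_etf: Dict[str, float],
                   avg_dollar_volume_by_etf: Dict[str, float],
                   portfolio_value: float,
                   max_volume_fraction: float = 0.01) -> Dict[str, float]:
    """Return equal-magnitude long/short weights whose absolute values sum to 1 (1x leverage)."""
    # Keep only countries with a non-neutral sentiment (assumption).
    tradable = {etf: s for etf, s in sentiment_by_etf.items() if s != 0}

    # Drop ETFs whose equal-weight position would exceed 1% of their average
    # dollar volume, then re-check because the position size grows.
    while tradable:
        position_value = portfolio_value / len(tradable)
        illiquid = [
            etf for etf in tradable
            if position_value > max_volume_fraction * avg_dollar_volume_by_etf[etf]
        ]
        if not illiquid:
            break
        for etf in illiquid:
            del tradable[etf]

    if not tradable:
        return {}

    # Positive sentiment goes long, negative sentiment goes short.
    weight = 1.0 / len(tradable)
    return {etf: (weight if s > 0 else -weight) for etf, s in tradable.items()}
```

With a $1,000,000 portfolio, sentiments of `{"SPY": 0.5, "EWA": -0.2, "PAK": 0.1}`, and illustrative average dollar volumes of `{"SPY": 1e10, "EWA": 1e8, "PAK": 1e5}`, the sketch drops PAK as too illiquid and returns `{"SPY": 0.5, "EWA": -0.5}`.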

Results

We backtested each strategy using all the data from the US Regulatory Alerts dataset, which currently spans from September 2021 to July 2023.

Strategy #1 - FDA Announcements

This strategy achieved a 0.76 Sharpe ratio. It outperforms buy-and-hold with the XLV healthcare sector ETF, which results in a 0.161 Sharpe ratio over the same time period. To reproduce our results, backtest this algorithm.

Strategy #2 - Crypto Announcements

This strategy achieved up to a 1.036 Sharpe ratio. It can outperform buy-and-hold on Bitcoin over the same time period, but the strategy is sensitive to changes in the lookback window. To reproduce our results, backtest this algorithm.

Strategy #3 - Trading Patterns from Individual Alert Types

This strategy achieved a -0.077 Sharpe ratio. It underperforms buy-and-hold with the SPY ETF, which results in a 0.111 Sharpe ratio over the same time period. To reproduce our results, backtest this algorithm.

Strategy #4 - NLP Country Rotation

This strategy achieved a 0.536 Sharpe ratio. It outperforms the following benchmarks:

  • Buy-and-hold with the SPY ETF, which results in a 0.111 Sharpe ratio over the same time period.
  • Buy-and-hold with the iShares MSCI World ETF, URTH, which results in a 0.014 Sharpe ratio.
  • An equal-weighted portfolio of all the country ETFs, which results in a -0.092 Sharpe ratio.

During the backtest period, the algorithm rebalanced the portfolio 493 times. The United States appeared on both sides of the portfolio: the algorithm shorted the US in 281 of the 493 rebalances (57%), when it judged US regulatory alerts to be negative, and went long the US in the other 212 rebalances (43%), when it judged them to be positive.

To reproduce our results, run the following algorithm.