Abstract
In this tutorial, we build upon the natural language processing (NLP) approach from the previous strategy. In this iteration, we monitor the Tiingo News Feed and try to determine the intraday news sentiment of the largest constituents in the Nasdaq-100 index while avoiding look-ahead bias. The results show that this version of the strategy experienced lower risk-adjusted returns than the QQQ exchange-traded fund (ETF) over the last two years.
Background
NLP is a subfield of artificial intelligence that strives to process unstructured text and understand its meaning. In most NLP trading strategies, the developer provides a set of pre-selected phrases and their sentiment scores, which usually introduces look-ahead bias into the strategy. In this algorithm, we circumvent this error by assigning sentiment scores to words on-the-fly based on how they impact the future returns of the security.
Method
Let’s review how we can implement this strategy as a framework algorithm with the LEAN trading engine.
Universe Selection
To get the largest constituents of the QQQ ETF, we add a custom ETF Constituents Universe Selection model and define the filter function to provide the 10 securities with the largest weight in the QQQ ETF.
def etf_constituents_filter(self, constituents: List[ETFConstituentData]) -> List[Symbol]:
    # Keep constituents that have a weight, then take the 10 largest by weight.
    selected = sorted([c for c in constituents if c.weight],
                      key=lambda c: c.weight, reverse=True)[:10]
    return [c.symbol for c in selected]
Requesting News Articles
Every time a security enters our ETF universe, we subscribe to its Tiingo News Feed.
self.dataset_symbol = algorithm.add_data(TiingoNews, symbol).symbol
Training the NLP Models
To ensure the algorithm is fit using the most recent news releases, we train a model for each security when it enters the universe, and we schedule training sessions to re-fit the models at the beginning of every month.
algorithm.train(algorithm.date_rules.month_start(), algorithm.time_rules.at(7, 0), self.train_models)
During the training sessions, we use the following procedure for each security in the universe:
- Make a history request to gather the news releases and trading prices of the security over the last 30 days.
- Tokenize the news article text, drop the punctuation, and drop filler words like “the”, “a”, and “an”.
- Create a dictionary that maps each word to the expected future return of the security over the following 30 minutes.
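The last two steps of this procedure can be sketched in plain Python. This is a minimal illustration, not the strategy's actual code: the helper names (`tokenize`, `fit_word_returns`, `predict`) are hypothetical, the stop-word list contains only the three filler words named above, and it assumes the "expected future return" of a word is the simple average of the 30-minute returns that followed the articles containing it.

```python
import string
from collections import defaultdict
from statistics import mean

# Assumption: only the filler words named in the text; a real list would be longer.
STOP_WORDS = {"the", "a", "an"}

def tokenize(text):
    """Lowercase the text, strip punctuation, and drop filler words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [word for word in cleaned.split() if word not in STOP_WORDS]

def fit_word_returns(articles):
    """Map each word to the mean future return of the articles containing it.

    `articles` is a list of (text, future_return) pairs, where future_return
    is the security's return over the 30 minutes after the article.
    """
    returns_by_word = defaultdict(list)
    for text, future_return in articles:
        for word in set(tokenize(text)):  # count each word once per article
            returns_by_word[word].append(future_return)
    return {word: mean(rets) for word, rets in returns_by_word.items()}

def predict(word_returns, text):
    """Average the scores of the known words in an article (0.0 if none are known)."""
    scores = [word_returns[w] for w in tokenize(text) if w in word_returns]
    return mean(scores) if scores else 0.0
```

Because every score comes from returns observed before the training date, the dictionary contains no hand-picked phrases, which is how this approach avoids the look-ahead bias described in the Background section.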
Detecting Significant News
The NLP models transform the text of news releases into a prediction on the future returns of the respective security. Instead of trading in response to every news release, we only trade when an NLP model provides a prediction that’s \(n\) standard deviations away from the mean of the last 30 predictions. A larger value of \(n\) translates to fewer trades, but the trades are in response to news that carries more significance.
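One way to express this filter is with a rolling window of predictions. The sketch below is illustrative only: the class name `SignificanceFilter` is hypothetical, and it assumes that no prediction counts as significant until the window is full and that "away from the mean" means more than \(n\) sample standard deviations in either direction.

```python
from collections import deque
from statistics import mean, stdev

class SignificanceFilter:
    """Flag predictions that sit far from the mean of recent predictions."""

    def __init__(self, n=2.0, window=30):
        self.n = n                                # threshold in standard deviations
        self.predictions = deque(maxlen=window)   # oldest prediction drops off automatically

    def is_significant(self, prediction):
        significant = False
        # Assumption: wait for a full window before judging any prediction.
        if len(self.predictions) == self.predictions.maxlen:
            mu = mean(self.predictions)
            sigma = stdev(self.predictions)
            significant = sigma > 0 and abs(prediction - mu) > self.n * sigma
        # Record the prediction after the check so it is compared to prior ones only.
        self.predictions.append(prediction)
        return significant
```

The new prediction is compared against the previous window before being appended, so an extreme value cannot inflate the standard deviation it is measured against.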
Emitting Insights
When an NLP model detects some significant news, the Alpha model emits an Insight with a duration of 30 minutes and a direction that matches the sentiment of the news release. That is, if the model determines the news release is positive, the insight has InsightDirection.UP. Otherwise, it has InsightDirection.DOWN.
direction = InsightDirection.UP if expected_return > 0 else InsightDirection.DOWN
insights.append(Insight.price(asset_symbol, self.PREDICTION_INTERVAL, direction))
Portfolio Construction
The Tiingo News Feed provides news every second an article is released. In this strategy, the goal is to immediately trade in response to news articles and hold the position for 30 minutes. The position size should only change during the 30 minutes if another significant news article is released for the same security and it has sentiment in the opposite direction of our trade. To achieve this, we create a custom Portfolio Construction model (PCM), called the PartitionedPortfolioConstructionModel.
This PCM works by slicing the portfolio into \(p\) independent partitions. When the PCM receives an Insight for the first security, it allocates \(\frac{1}{p}\) of the portfolio capital to the security. The security price fluctuates over time, so its weight in the portfolio won’t stay fixed at \(\frac{1}{p}\) if \(p > 1\). When the model receives an Insight for another security, it calculates the number of vacant partitions \(v\) and then allocates \(\frac{1}{v}\) of the portfolio cash to the new security. The benefit of this design is that the PCM maintains the size of every trade until the Insight expires or the Alpha model emits a new insight in the other direction. The drawback of this design is that the portfolio can only hold up to \(p\) securities at any one time.
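The allocation rule can be illustrated without LEAN. The following is a simplified sketch, not the PartitionedPortfolioConstructionModel itself: the class `PartitionedAllocator` and its methods are hypothetical, the caller supplies the remaining cash, and margin, fees, and reversing an open position are ignored.

```python
class PartitionedAllocator:
    """Size each new position as 1/v of the remaining cash, where v is the
    number of vacant partitions."""

    def __init__(self, partitions):
        self.partitions = partitions
        self.holdings = {}  # symbol -> signed dollar amount allocated at entry

    def open_position(self, symbol, direction, cash):
        """Return the signed dollar amount to allocate (direction is +1 or -1).

        Returns 0.0 when every partition is occupied or the symbol is already held.
        """
        vacant = self.partitions - len(self.holdings)
        if vacant <= 0 or symbol in self.holdings:
            return 0.0
        amount = direction * cash / vacant
        self.holdings[symbol] = amount
        return amount

    def close_position(self, symbol):
        """Free the symbol's partition, e.g. when its Insight expires."""
        return self.holdings.pop(symbol, 0.0)
```

With \(p = 4\) and 100,000 in cash, the first Insight receives 100,000 / 4 = 25,000. The second sees three vacant partitions and 75,000 of remaining cash, so it also receives 25,000. Existing positions are never resized to make room for new ones.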
Results
We backtested the strategy from January 1, 2021 to January 1, 2023, and the algorithm achieved a -0.659 Sharpe ratio. To put this performance in context, the following table shows the Sharpe ratios of two benchmarks:
Benchmark | Sharpe Ratio |
---|---|
Buy-and-hold with the QQQ | -0.14 |
An equal-weighted portfolio of the same universe as the strategy | -0.11 |
In conclusion, the strategy underperforms the two preceding benchmarks in terms of risk-adjusted returns and it needs further development before live trading.
Derek Melchin