Hello, below is an example of a machine learning algorithm built with the XGBoost library. I am running a regression model on future returns using equity data, with indicator data and market data as features: the RSI (Relative Strength Index) and ATR (Average True Range) indicators, plus the ROCP (Rate of Change Percent) of the market (SPY). The hypothesis is that these features give the regressor enough information to make reasonably accurate predictions of price movements, so the algorithm can then trade the stocks it predicts will move by a significant amount. A notebook is attached for a more fundamental look at the model. I used RandomizedSearchCV() to save on computation, but feel free to switch to GridSearchCV() for an exhaustive, deterministic search.
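To make the computation trade-off concrete, here is a minimal stdlib-only sketch of what the two searches enumerate. It is not the scikit-learn implementation: the `parameters` search space, `n_iter=10` and `cv=4` are assumed values for illustration, and `grid_candidates` / `random_candidates` are hypothetical helpers. In scikit-learn itself, GridSearchCV() takes `param_grid` and tries every combination, while RandomizedSearchCV() takes `param_distributions` plus `n_iter` and tries only that many.

```python
import itertools
import random

# Hypothetical XGBoost-style search space; the post does not list the real one.
parameters = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 200, 400],
    "subsample": [0.7, 1.0],
}


def grid_candidates(space):
    """Every combination of values, which is what an exhaustive grid search tries."""
    keys = sorted(space)
    return [
        dict(zip(keys, values))
        for values in itertools.product(*(space[k] for k in keys))
    ]


def random_candidates(space, n_iter, seed=0):
    """n_iter combinations drawn without replacement, which is how a randomized
    search behaves when every parameter is given as a list."""
    rng = random.Random(seed)
    grid = grid_candidates(space)
    return rng.sample(grid, min(n_iter, len(grid)))


cv = 4       # assumed number of cross-validation folds
n_iter = 10  # assumed number of candidates for the randomized search

grid = grid_candidates(parameters)
sampled = random_candidates(parameters, n_iter)

# Each candidate is fitted once per fold.
grid_fits = len(grid) * cv
random_fits = len(sampled) * cv

print(f"grid search: {len(grid)} candidates, {grid_fits} fits")
print(f"randomized search: {len(sampled)} candidates, {random_fits} fits")
```

With this assumed space, the grid search fits every one of the 54 combinations on each fold, while the randomized search stops at 10, so the saving grows quickly as more parameters or values are added.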

Conclusion:

This is a good first step into ML on the QC platform. I would love to see improved versions posted, as a fun challenge to anyone who is interested. I will link a Medium article about this algorithm as soon as it is published. Happy coding!

Disclaimer:

This is for entertainment purposes and is not financial advice. I also do not advise trading this algorithm in its current state, as it could still use improvements and further review. The algorithm makes many assumptions that could be cleaned up by refactoring it with more dynamic code, such as the prediction value threshold, which is currently hardcoded as 0.5 for longs and -0.5 for shorts. There is plenty of other room for improvement as well. For instance, a quick review of the overview tab shows that the algorithm had a win rate of only 40% and a loss rate of 60%. Even though the wins were larger on average, this is a potential red flag: at a 40% win rate, the average win has to be more than 1.5 times the average loss just to break even before costs. If you see something wrong or spot any problems, please comment below and let me know, and I will try to review and fix the issue.

Also, I can only get this backtest to produce results similar to the attached one with v2 of the QuantConnect Development Environment. For some reason, the v2 IDE backtest and the v3 IDE backtest give drastically different results. I reached out to QuantConnect, and they couldn't find a good explanation other than suggesting a switch to GridSearchCV(), which I tried with no luck. It has me stumped, so if any smart people figure out why, be sure to share😊 Here is the code for the GridSearchCV() that I tried, in case anyone is interested.

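# Note: GridSearchCV takes param_grid (not param_distributions) and has no n_iter argument, since it tries every combination.
self.models[symbol] = GridSearchCV(estimator=self.models[symbol], param_grid=parameters, scoring='neg_mean_squared_error', cv=4, verbose=1)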
                

So, if you want similar results, you will have to run this on the v2 IDE, which you can switch to in your account settings as of the date of this post.
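On the hardcoded prediction thresholds mentioned in the disclaimer, here is a minimal sketch of how they could be pulled out into parameters. The function names are hypothetical and not taken from the algorithm, the strict comparisons are an assumption, and the `thresholds_from_history` helper is just one possible way to make the cutoffs dynamic, not something the current algorithm does.

```python
def signal_from_prediction(prediction, long_threshold=0.5, short_threshold=-0.5):
    """Map a predicted return to a position: 1 = long, -1 = short, 0 = stay flat.

    The defaults mirror the hardcoded 0.5 / -0.5 values; the strict
    inequalities are an assumption.
    """
    if prediction > long_threshold:
        return 1
    if prediction < short_threshold:
        return -1
    return 0


def thresholds_from_history(predictions, tail=0.1):
    """Hypothetical dynamic alternative: place the cutoffs inside the top and
    bottom `tail` fraction of recent predictions instead of at fixed values.

    Returns (long_threshold, short_threshold).
    """
    ordered = sorted(predictions)
    k = max(1, int(len(ordered) * tail))
    return ordered[-k], ordered[k - 1]


# Made-up recent predictions, for illustration only.
recent = [-1.2, -0.6, -0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 0.7, 1.5]
long_t, short_t = thresholds_from_history(recent, tail=0.2)

for p in (1.5, 0.3, -1.2):
    print(p, signal_from_prediction(p, long_t, short_t))
```

Keeping the thresholds as arguments means they can be tuned or recomputed on each retrain without touching the trading logic.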