Coming from Python and being relatively new to C#, I thought it would be helpful to have an example strategy that uses the Accord.NET machine learning library. Since I couldn't find any good examples, I coded one myself.
This strategy trains a linear SVM (Support Vector Machine) on historical returns. Then, at the market open, it attempts to predict whether the market will close UP or DOWN. If the prediction is UP, we enter a LONG position, or do nothing if we are already LONG. If the prediction is DOWN, we exit the market. As mentioned, I am quite new to C#, so there might be bugs and there is certainly room for improvement. It would be nice to discuss possible improvements and ideas here.
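The entry/exit rule described above can be sketched as a small function. This is written in Python rather than C# for brevity; the function name and the action labels are made up for illustration and are not taken from the posted algorithm.

```python
def next_action(predicted_up: bool, is_long: bool) -> str:
    """Return the order to place at the open, given the SVM's prediction."""
    if predicted_up:
        # Predicted UP: enter long unless we already hold the position.
        return "HOLD" if is_long else "BUY"
    # Predicted DOWN: exit the market if we hold a position.
    return "SELL" if is_long else "HOLD"
```

Keeping the rule separate from the model makes it easy to check the trading logic without training anything.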
Jared Broad
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Michael Handschuh
Gene Wildhart
    # tslag gets the shifted price inputs
    for i in range(0, inputSize):
        tslag["Lag%s" % str(i+1)] = df["Close Price"].shift(i+1)
    # now tslag holds the returns (ie. pct change)
    tslag = tslag.pct_change()
    # tsout will hold the returns and be the target for the SVM
    tsout = df["Close Price"].pct_change()
Unfortunately, my C# skills are not up to snuff, so I'm not sure how to do the above in C# w/o lots of looping.
Dmitri Pavlenkov
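For reference, the lagged-return layout in Gene's pandas snippet can be written with plain lists and a single loop over rows, which is roughly the shape a C# port would take. This is only a sketch: the function names are hypothetical, and it assumes each target return should be paired with the `input_size` returns that precede it.

```python
def pct_change(prices):
    """Simple returns: r[t] = prices[t] / prices[t-1] - 1, for t >= 1."""
    return [prices[t] / prices[t - 1] - 1.0 for t in range(1, len(prices))]


def lagged_samples(prices, input_size):
    """Build (inputs, targets): each target return is paired with the
    input_size returns that precede it (most recent lag first)."""
    returns = pct_change(prices)
    inputs, targets = [], []
    for t in range(input_size, len(returns)):
        # A fresh list per row, holding returns at t-1, t-2, ..., t-input_size.
        inputs.append([returns[t - k] for k in range(1, input_size + 1)])
        targets.append(returns[t])
    return inputs, targets
```

Rows without a full set of lags are simply skipped, which plays the role of dropping the NaN rows that `shift` and `pct_change` would produce in pandas.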
Michael Handschuh
Gene Wildhart
Dmitri Pavlenkov
    double[] returns = new double[inputSize];
under
    for (int i=0;i
otherwise, all your inputs will be the same.
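The loop in the post above is cut off, so the following is a guess at the problem being described: if one array is allocated once outside the loop and the same reference is stored for every sample, all samples end up pointing at the same data. A Python sketch of that effect, with hypothetical function names:

```python
def build_inputs_shared(rows, input_size):
    """Buggy: one buffer is allocated once and stored repeatedly."""
    buffer = [0.0] * input_size          # allocated outside the loop
    inputs = []
    for row in rows:
        for j in range(input_size):
            buffer[j] = row[j]
        inputs.append(buffer)            # every entry is the same object
    return inputs


def build_inputs_fresh(rows, input_size):
    """Fixed: a new buffer is allocated on every iteration."""
    inputs = []
    for row in rows:
        buffer = [0.0] * input_size      # allocated inside the loop
        for j in range(input_size):
            buffer[j] = row[j]
        inputs.append(buffer)
    return inputs
```

With two different rows, the shared version returns the last row twice, while the fresh version returns both rows as given. C# arrays are reference types, so the same thing happens there.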
Dmitri Pavlenkov
Robert Graves
Michael Handschuh
    const string SvmData = @"1,2,1,0,2,-1";
And then reference it from your algorithm and use it to hydrate your SVM instance.
Jared Broad
    using (var wc = new WebClient())
    {
        // Point the web client to your own data store.
        _data = wc.DownloadString("https://www.google.com");
    }
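Whether the model text is embedded as a constant or downloaded, the algorithm ends up with a string that has to be turned back into a model. The sketch below shows that step for a linear classifier in Python. The layout of the string (weights followed by a bias) is purely an assumption made for illustration; it is not Accord's serialization format, and the function names are hypothetical.

```python
SVM_DATA = "1,2,1,0,2,-1"  # placeholder string from the thread; layout is assumed


def hydrate_linear_model(data: str):
    """Parse 'w1,...,wn,b' into (weights, bias). This layout is hypothetical."""
    values = [float(v) for v in data.split(",")]
    return values[:-1], values[-1]


def predict(weights, bias, x):
    """Sign of the linear decision function w.x + b: +1 (UP) or -1 (DOWN)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1
```

With a real Accord model you would use the library's own load routine on the downloaded or embedded data instead of hand-parsing it.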
Robert Graves
Dmitri Pavlenkov
    const string data = "A string with international characters: Norwegian: ÆØÅæøå, Chinese: ? ??";
    var bytes = System.Text.Encoding.UTF8.GetBytes(data);
    var stream = new System.IO.MemoryStream(bytes);
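The same string-to-stream step, sketched in Python for comparison: encode the text as UTF-8 and wrap the bytes in an in-memory stream, so that a loader expecting a stream can read from a string constant. The function name is made up for this example.

```python
import io


def string_to_stream(data: str) -> io.BytesIO:
    """Encode text as UTF-8 and wrap the bytes in an in-memory stream."""
    return io.BytesIO(data.encode("utf-8"))
```

Reading the stream back and decoding it as UTF-8 returns the original string, including non-ASCII characters.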
James Smith
I've found that more recent versions of Accord have a serializer class which is more flexible:
http://accord-framework.net/docs/html/T_Accord_IO_Serializer.htm
I've been toying with the Accord libraries, so thanks for sharing. One thing I'd like to do is incremental learning, which I'm not sure Accord supports. It appears my only option would be to add new results to the learning set and completely rebuild the SVM. Does anyone know of an alternative approach or library for incremental learning?
Petter Hansson
I usually just retrain everything with a certain lookback period (so when enough has filled up, the old data outside lookback gets truncated).
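A fixed-lookback retrain of this kind can be sketched with a bounded queue. This is a Python illustration, not Accord code: the class name is hypothetical, `fit` stands for whatever training routine you use, and waiting until the window is full before the first fit is an assumption about what "when enough has filled up" means.

```python
from collections import deque


class RollingTrainer:
    """Keeps the most recent `lookback` samples and refits on all of them."""

    def __init__(self, lookback, fit):
        self.samples = deque(maxlen=lookback)  # old samples fall off the left
        self.fit = fit                         # hypothetical: callable(list of samples) -> model
        self.model = None

    def add(self, sample):
        self.samples.append(sample)
        if len(self.samples) == self.samples.maxlen:
            # Window is full: retrain from scratch on the current window.
            self.model = self.fit(list(self.samples))
        return self.model
```

For example, with `lookback=3` and a `fit` that averages the window, adding 1, 2, 3 gives a model of 2.0, and adding 6 afterwards refits on 2, 3, 6 only.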
In some cases you can do incremental fitting in Accord by running just one (or a few) learning iterations on your model with a new piece of data; logistic regression, for example, can do this. However, the fit is then most likely dominated by the most recent data, and there's not a lot of control over this.
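To make the "single learning iteration on a new piece of data" idea concrete, here is one stochastic-gradient step of logistic regression in plain Python. It does not use Accord's API; the function name and the learning rate are assumptions for illustration.

```python
import math


def sgd_step(weights, bias, x, y, lr=0.1):
    """One stochastic-gradient update of logistic regression on a single
    sample (x, y) with y in {0, 1}. Returns the updated (weights, bias)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    p = 1.0 / (1.0 + math.exp(-z))       # predicted probability of class 1
    error = p - y                        # gradient of the log loss w.r.t. z
    new_weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return new_weights, bias - lr * error
```

The learning rate is the only knob controlling how far each new sample pulls the model, which is the lack of control mentioned above.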
Can't use Accord in QC cloud atm due to class load error but that's of course no problem if you're running locally. It's a shame because Accord is the only whitelisted library with SVM.
Petter Hansson
After googling a bit, incremental learning of SVMs is supposedly possible but difficult, and I haven't seen support for it in Accord.
James Smith
Yes, I've found some obscure options for incremental learning, but I think I would miss the rich features of Accord. Besides, I expect there may be diminishing returns in prediction accuracy as the lookback grows. There may be a threading solution to getting adequate backtest performance with frequent model retraining.
Petter Hansson
One thing I've considered is to simply have models retrain on a background thread on increasingly extensive data (e.g. increasing lookback) until a deadline, or until the main thread submits a new data set. That is similar to the iterative deepening concept in game AI. However, I'm typically wary of doing something in a backtest that will work differently when running live (e.g. a backtest with once-a-day retraining would quickly cut off training, whereas the live version would probably reach maximum lookback).
And yes, the lookback is a hyperparameter that's likely to have a large impact on the model's accuracy in practice on live data, and what's worse, in most cases the best lookback varies over time...
Petter Hansson
It would probably be possible to do it like this, however: retrain the model with a fixed lookback; the first time, wait on the main thread for training to finish; after that, let the main thread continue with the oldest finished model (just update the next training set). So in a backtest one would be using outdated models, with probably worse performance than the live version, which would have more recently trained models.
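A rough Python sketch of that scheme, under some assumptions: the first fit blocks the caller, later fits run one at a time on a worker thread, and the caller keeps using the last model that finished training. The class name and the `fit` callable are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class BackgroundRetrainer:
    """First fit is synchronous; later fits run on a single worker thread."""

    def __init__(self, fit):
        self.fit = fit                    # hypothetical: callable(window) -> model
        self.executor = ThreadPoolExecutor(max_workers=1)
        self.pending = None               # Future for the fit in progress, if any
        self.model = None

    def update(self, window):
        snapshot = list(window)           # copy so the worker sees a stable window
        if self.model is None:
            # First time: wait on the calling thread until a model exists.
            self.model = self.fit(snapshot)
            return self.model
        if self.pending is not None and self.pending.done():
            # A background fit finished: swap in the newer model.
            self.model = self.pending.result()
            self.pending = None
        if self.pending is None:
            # Worker is idle: start training on the latest window.
            self.pending = self.executor.submit(self.fit, snapshot)
        return self.model
```

How stale the returned model is depends on wall-clock training time, which is exactly why a backtest and a live run would behave differently here.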
James Smith
Have to agree that a background learning task is a minefield. In theory a less delayed lookback will lead to more accurate prediction, but it might equally be that the more recent signals are the noise of indecision that comes before a significant move. You're only on steady ground if you're confident the backtest behaviour will reproduce. The problem is that, regardless of the trade frequency, relearning a significant set is simply too slow to backtest. Of course caching is an option, but then one tweak here or there and you need to rebuild your model cache.
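One way to picture the caching trade-off is a cache keyed on a hash of the hyperparameters together with the training window, so that any tweak to either yields a new key and a retrain. This is an in-memory Python sketch with hypothetical names; it assumes the parameters and data are JSON-serializable.

```python
import hashlib
import json


def cache_key(params: dict, training_window: list) -> str:
    """Hash hyperparameters and training data together, so changing
    either produces a different key (and forces a retrain)."""
    payload = json.dumps({"params": params, "data": training_window}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ModelCache:
    def __init__(self, fit):
        self.fit = fit          # hypothetical: callable(params, window) -> model
        self.store = {}
        self.misses = 0         # number of times a model had to be trained

    def get(self, params, training_window):
        key = cache_key(params, training_window)
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.fit(params, training_window)
        return self.store[key]
```

Repeating a backtest with identical settings hits the cache every time, while changing a single hyperparameter misses on every window, which is the rebuild cost described above.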
Gene Wildhart