I get the following error when trying to run my algorithm on timeframes longer than 2 months across all tickers above $20M in trading volume (about 1,400 tickers). Reducing this number wouldn't help much, and I can't reduce it anyway because of my strategy.
Runtime Error: Exception of type 'System.OutOfMemoryException' was thrown. in SubscriptionSynchronizer.cs:line 130. This problem has nothing to do with RAM usage, as only 15% of my RAM gets utilized in the backtests. It apparently stems from the results being too large to upload to QuantConnect: they may have hit the 700MB data limit, so the upload times out.
Alex from support was able to help improve the efficiency of my algorithm, but only so much.
I am using this logic to add the Benzinga data for my universe:
def OnSecuritiesChanged(self, changes):
    for security in changes.AddedSecurities:
        self.benzingas[security.Symbol] = self.AddData(BenzingaNews, security.Symbol).Symbol
    for security in changes.RemovedSecurities:
        if security.Symbol in self.benzingas:
            self.RemoveSecurity(self.benzingas[security.Symbol])
            del self.benzingas[security.Symbol]
How can I avoid hitting this 700MB data limit in my backtests? What causes the most data usage in backtests?
Just thinking: is there an easy way to give my backtests a lookahead bias so they only select the tickers that will appear in the Benzinga dataset the next day?
Fred Painchaud
Hi Garrison,
There can be many reasons. Many, really. If you are saying it is not your physical memory, it can be the memory allocated to Python, the virtual memory limit set on your system, etc. This article refers to C# but is rather generic; it gives examples of potential causes: https://docs.microsoft.com/en-nz/archive/blogs/ericlippert/out-of-memory-does-not-refer-to-physical-memory.
Now, you are saying there is a 700MB limit on the size of the results upload. Is it 700MB total or per file? If it is total, you are stuck with downsizing your sample of assets, but it looks like you cannot do that; maybe change your strategy altogether. If it is 700MB per file, maybe there is some config you can play with on the BenzingaNews side. I have no idea.
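For what it's worth, here is a minimal sketch of that downsizing route, using a coarse universe filter capped at a fixed number of symbols. The $20M dollar-volume floor and the 200-symbol cap are illustrative assumptions, not values anyone in this thread settled on:

from AlgorithmImports import *

class CappedUniverseAlgorithm(QCAlgorithm):
    def Initialize(self):
        self.SetStartDate(2021, 1, 1)
        self.SetEndDate(2021, 3, 1)
        self.SetCash(100000)
        self.UniverseSettings.Resolution = Resolution.Daily
        # Every data subscription adds to the result payload, so keep the
        # universe bounded.
        self.AddUniverse(self.CoarseFilter)

    def CoarseFilter(self, coarse):
        # Keep liquid names above the dollar-volume floor, then cap the count.
        liquid = [c for c in coarse if c.HasFundamentalData and c.DollarVolume > 20e6]
        liquid.sort(key=lambda c: c.DollarVolume, reverse=True)
        return [c.Symbol for c in liquid[:200]]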
You might be able to pay to get a higher limit?
Finally, you may run your algo on a different machine where you do not have those limits.
What uses the most data in backtests highly depends on the backtest. Usually, it is data subscriptions. But I assure you, one can write an algo that subscribes to only one asset and then generates 5x or 10x more data from that initial data.
It's not easy to do look-ahead in QC on purpose. Plus, backtesting an algo with look-ahead, intentional or not, makes little sense since the performance of your algo in backtest will be much better than live. So then, don't backtest: just say your algo generates 50% profit per year compounded and go live. It would be equivalent… Don't waste your time backtesting an algo that looks ahead; just saying so you don't head in that direction.
Fred
Garrison Whipple
Fred, the lookahead bias I am referring to is only for the universe selection: only selecting the 20 tickers that would have had a related news article anyway, so that I don't have all of these data subscriptions from the Benzinga dataset. Those subscriptions seem to be the main thing making my algorithm so bulky. I believe it can be done from a Google Sheet, but I will have to look into that more.
I saw Jared's post on this thread, but I wonder if that improvement will even help my case.
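In case it helps anyone following along, here is a rough sketch of that Google Sheet idea, assuming the sheet is published as CSV with one row per day in the form YYYYMMDD,TICKER1;TICKER2;... The URL, the row format, and the 20-ticker cap are all hypothetical placeholders:

from AlgorithmImports import *

class PreselectedNewsUniverse(QCAlgorithm):
    def Initialize(self):
        self.SetStartDate(2021, 1, 1)
        self.SetCash(100000)
        self.UniverseSettings.Resolution = Resolution.Daily
        # Download the pre-computed ticker list once (hypothetical URL).
        content = self.Download("https://docs.google.com/spreadsheets/d/<sheet-id>/export?format=csv")
        self.tickers_by_date = {}
        for line in content.splitlines():
            if not line.strip():
                continue
            date, tickers = line.split(",", 1)
            self.tickers_by_date[date] = tickers.split(";")[:20]
        self.AddUniverse(self.SelectFromSheet)

    def SelectFromSheet(self, coarse):
        # Only subscribe to the tickers listed for the current date.
        key = f"{self.Time.year:04d}{self.Time.month:02d}{self.Time.day:02d}"
        wanted = set(self.tickers_by_date.get(key, []))
        return [c.Symbol for c in coarse if c.Symbol.Value in wanted]

As Fred notes, this still bakes look-ahead into the backtest, so the resulting performance numbers should not be trusted.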
Fred Painchaud
Hi Garrison,
Using any information from the future during a backtest will usually inflate the performance of that backtest, and that boost cannot be reproduced live, so measuring that performance in backtest is useless. Moreover, if your algo uses less data in backtest but more live (since you won't see into the future live and won't be able to scale down your universe selection), it may again use too much data live, exceed certain limits, and error out.
Fred
Garrison Whipple
Does anyone know innovative/creative ways to reduce the data that is uploaded to QuantConnect's servers from backtests?
Derek Melchin
Hi Garrison,
Ways to reduce the amount of backtest data include:
Best,
Derek Melchin
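One generic way to shrink what a backtest uploads is to throttle per-bar logging and charting, since Log/Debug and Plot output is part of the result payload. A minimal sketch, where the once-per-day cadence and the SPY subscription are illustrative assumptions:

from AlgorithmImports import *

class LowOutputAlgorithm(QCAlgorithm):
    def Initialize(self):
        self.SetStartDate(2021, 1, 1)
        self.SetCash(100000)
        self.spy = self.AddEquity("SPY", Resolution.Minute).Symbol
        self.last_logged_date = None

    def OnData(self, data):
        if not data.Bars.ContainsKey(self.spy):
            return
        bar = data.Bars[self.spy]
        # Log and chart at most once per trading day instead of once per bar.
        if self.Time.date() != self.last_logged_date:
            self.last_logged_date = self.Time.date()
            self.Log(f"{self.Time} SPY close {bar.Close}")
            self.Plot("Prices", "SPY", bar.Close)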
Garrison Whipple
Derek Melchin I have already done everything I possibly could to minimize the data. The problem is that QuantConnect's cloud infrastructure only allows around 700MB of data before a backtest upload times out, and the external datasets seem to eat up a lot of that allowance. When will QuantConnect update their infrastructure?
Varad Kabade
Hi Garrison,
Unfortunately, there is no ETA, but we will look into making improvements to allow uploading backtest results that generate more than that amount of data.
Best,
Varad Kabade