Howdy QC Community!
What is the most efficient way of creating a dataframe of historical price data in an algorithm?
Every day, I create a pandas dataframe of historical daily prices for 100+ securities and use that to calculate a portfolio.
My initial attempt was to do something like this:
history = self.History(symbols, 365, Resolution.Daily)
Grabbing this data has slowed down my algo significantly to about 3 real hours per 1 year of simulated time.
It reads in the docs:
> History call fetches the whole requested period and synchronizes the data.
So I tried this approach:
def Initialize(self):
    ...
    for symbol in symbols:
        # Register a daily consolidator for each symbol.
        self.Consolidate(
            symbol=symbol,
            period=Resolution.Daily,
            handler=self.process_daily_bar)
    # Daily closes keyed by (symbol, date).
    self.history = {}

def process_daily_bar(self, consolidated):
    symbol = str(consolidated.Symbol)
    date = consolidated.EndTime.date()
    self.history[symbol, date] = consolidated.Close
But this ended up taking the algo significantly longer. Almost 24 real hours per 1 year of simulated time.
Is there something I'm missing? Is there a better approach?
Thank you so much!
Mak K
Hi Jonathan,
I would recommend that you use Rolling Windows for this.
They will initially grab the historical data for you but then simply update the dataframe with the new values every day as the data comes into your algorithm.
This should save you a lot of time when backtesting.
Let me know if you have further questions, thanks!
Jonathan Ng
The Rolling Windows part is just a way of storing the data, correct? Does it really make a difference performance-wise whether I store the data in a dictionary or in a RollingWindow structure?
Jonathan Ng
It's only daily data, so it's not that much.
Mak K
Hi,
If you use Rolling Windows, you avoid requesting a lot of data every day, which is what slows your algorithm down so much.
Every History call you make requests that entire dataset again.
So it is not so much about storing the data as about requesting it.
Generally, QuantConnect advises using History calls sparingly.
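To put rough numbers on that point, here is a back-of-the-envelope sketch in plain Python (no QuantConnect API involved). It assumes 100 symbols, a 365-bar lookback, and 252 trading days per simulated year. The function names are hypothetical, and the sketch only counts daily bars requested, not actual run time.

```python
def bars_with_daily_history(symbols, lookback, trading_days):
    """Bars requested when the full lookback is re-fetched every trading day."""
    return symbols * lookback * trading_days


def bars_with_incremental_update(symbols, lookback, trading_days):
    """Bars requested with one warm-up fetch, then one new bar per symbol per day."""
    warm_up = symbols * lookback
    daily_updates = symbols * trading_days
    return warm_up + daily_updates


# Assumed figures: 100 symbols, 365-bar lookback, 252 trading days.
refetch = bars_with_daily_history(100, 365, 252)
incremental = bars_with_incremental_update(100, 365, 252)
print(refetch, incremental)  # 9198000 61700
```

Under these assumptions the daily re-fetch asks for roughly 150 times as many bars as the incremental approach, regardless of which container holds the result.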
Let me know if you have any further questions, thanks!
Jonathan Ng
To be clear, I'm not calling self.History in the code example below “So I tried this approach:”
I'm writing data within a consolidation handler.
And writing data to a RollingWindow or writing data to a dictionary shouldn't make a difference performance-wise, right?
Mak K
Hi,
I'm assuming that your consolidation is grabbing each price over and over again instead of saving them and simply adding the latest value and removing the oldest.
A Rolling Window will perform one history call at the start of your algorithm and then pop the oldest value and append the newest each day while keeping all of the other values untouched in the Rolling Window.
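The pop-oldest, append-newest behaviour can be sketched with the standard library's `collections.deque`, used here only as a stand-in for a fixed-size rolling window. The prices are made up, and the warm-up loop stands in for whatever one-time fetch fills the window at the start.

```python
from collections import deque

# A deque with maxlen drops its oldest item automatically once it is full.
window = deque(maxlen=3)

# Warm-up: fill the window once (hypothetical closing prices).
for close in [10.0, 11.0, 12.0]:
    window.append(close)

# A new daily bar arrives: 10.0 falls out, the other values are untouched.
window.append(13.0)

# Newest-first view of the same data.
latest_first = list(reversed(window))
print(list(window), latest_first)  # [11.0, 12.0, 13.0] [13.0, 12.0, 11.0]
```

Each daily update is a single append, so the per-bar cost does not grow with the length of the window.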
Please try out the Rolling Window and get back to me with the results or any issues that you face, thanks!
Jonathan Ng
This is what I'm doing:
Jonathan Ng
I emailed support, and they have confirmed that RollingWindow is not necessarily more efficient. It really depends on how often you plan on calling self.History.
Using RollingWindow in itself is not more efficient than any other data structure.
Fred Painchaud
Hi Jonathan,
Mak's point was that calling History every day for 100 assets is going to cost a lot performance-wise.
He meant that instead of calling History, you could store the day data as it goes into a RollingWindow.
But with only a fraction of Initialize and one function, it is difficult to help further in spotting the performance hog.
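For illustration, the "store the daily data as it arrives" idea might look roughly like the sketch below. It is plain Python with no QuantConnect calls. `DailyCloseStore`, `on_daily_bar`, and `to_table` are hypothetical names, the lookback of 3 is shortened from the 365 bars discussed in the thread, and the tickers and prices are made up.

```python
from collections import defaultdict, deque
from datetime import date

LOOKBACK = 3  # hypothetical; the thread uses 365 daily bars


class DailyCloseStore:
    """Keeps the last `lookback` (date, close) pairs for each symbol."""

    def __init__(self, lookback=LOOKBACK):
        self.windows = defaultdict(lambda: deque(maxlen=lookback))

    def on_daily_bar(self, symbol, end_date, close):
        # Intended to be called once per symbol per day,
        # for example from a consolidation handler.
        self.windows[symbol].append((end_date, close))

    def to_table(self):
        # Build a nested dict {date: {symbol: close}}.
        table = defaultdict(dict)
        for symbol, window in self.windows.items():
            for end_date, close in window:
                table[end_date][symbol] = close
        return dict(table)


store = DailyCloseStore()
for day in range(1, 5):
    store.on_daily_bar("SPY", date(2021, 1, day), 100.0 + day)
    store.on_daily_bar("QQQ", date(2021, 1, day), 200.0 + day)

table = store.to_table()
```

If pandas is available, the nested dict returned by `to_table()` has a shape that `pandas.DataFrame.from_dict(table, orient="index")` accepts, giving one row per date and one column per symbol without any further History call.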
Cheers!
Fred
Jonathan Ng