I'm using Lean-CLI and researching over daily historical data that I created manually and stored in my Lean's “data” directory under the appropriate subdirectories. In this example, it's `[lean directory]/data/equity/usa/daily/cron.zip`.
I preview the data using the following script:
qb = QuantBook()
cron = qb.AddEquity("CRON")
qb.History(cron.Symbol, 100, Resolution.Daily)
Here are the file data and results in 3 different situations:
A)
File:
20211019 00:00,73300,73900,69300,69600,1
Result:
time        close  high  low   open  volume
2021-10-20  6.96   7.39  6.93  7.33  1.0
2021-10-21  6.96   7.39  6.93  7.33  0.0
2021-10-22  6.96   7.39  6.93  7.33  0.0
B)
File:
20211021 00:00,73300,73900,69300,69600,1
Result:
time        close  high  low   open  volume
2021-10-22  6.96   7.39  6.93  7.33  1.0
C)
File:
20211019 00:00,73300,73900,69300,69600,1
20211021 00:00,73300,73900,69300,69600,2
Result:
time        close  high  low   open  volume
2021-10-20  6.96   7.39  6.93  7.33  1.0
2021-10-21  6.96   7.39  6.93  7.33  0.0
2021-10-22  6.96   7.39  6.93  7.33  2.0
There's some interpolating going on, as well as some time shifting, and I can't make heads or tails of why. I've tried reading the docs on how to properly format the CSV files, but it's not clear what timezone the timestamps should be in (what does “…in the timezone of the data format” mean?). Also, why were two additional dates created in example “A”?
I download the data using ib_insync which by default uses TWS's logged-in timezone so I know what timezone to convert from, just not what to convert to. Should I also be specifying an hour which would keep the resulting dates in line with the trading calendar?
Louis Szeto
Hi Jake
All three situations are normal. Because you ran the research on 22-10-2021, the 100-bar history request starts from the custom data's start date or from 100 bars prior, whichever is later. The additional dates are fill-forward points: every value repeats the previous date's, and the volume is 0. It is possible to request historical data only for rows with data points by:
where start_date and end_date can be set by e.g. start_date = DateTime(2020, 6, 10)
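To make the fill-forward behaviour concrete, here is a minimal pure-Python sketch. It is an illustration only, not LEAN's implementation: `fill_forward` is a hypothetical helper, it treats every Monday to Friday as a trading day (no holiday calendar), and it assumes the bars are already keyed by the end-of-period dates shown in the results above.

```python
from datetime import date, timedelta


def fill_forward(bars, last_day):
    """Return one row per weekday from the first bar up to last_day.

    bars maps a date to (open, high, low, close, volume) and must not be
    empty. Days without a bar repeat the previous prices with volume 0.0.
    Assumption: weekdays stand in for the real trading calendar.
    """
    rows = []
    previous = None
    day = min(bars)
    while day <= last_day:
        if day.weekday() < 5:  # Monday to Friday only; holidays are ignored
            if day in bars:
                previous = bars[day]
                rows.append((day, *previous))
            elif previous is not None:
                o, h, l, c, _ = previous
                rows.append((day, o, h, l, c, 0.0))  # synthetic fill-forward row
        day += timedelta(days=1)
    return rows


# Situation C from the question, keyed by the dates shown in the result.
bars = {
    date(2021, 10, 20): (7.33, 7.39, 6.93, 6.96, 1.0),
    date(2021, 10, 22): (7.33, 7.39, 6.93, 6.96, 2.0),
}

filled = fill_forward(bars, date(2021, 10, 22))

# Keep only the rows that correspond to an actual line in the source file.
only_real = [row for row in filled if row[0] in bars]

for row in filled:
    print(row)
```

Filtering on the source dates rather than on `volume == 0` avoids discarding a genuine bar that happens to have zero volume.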
Best
Louis
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Jake Mitchell
Ok, I understand how the fillDataForward attribute works now, but why would the resulting records have shifted dates with respect to the source dataset?
Source:
Notice the 1st record's start date of 2021-10-12
Now running:
Results in:
See how all of the dates change (shifted forward in time)? Ignore the other columns, they printed out differently due to it being stored in a pandas dataframe vs. being copied from the CSV file in the 1st table.
Varad Kabade
Hi Jake Mitchell,
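Note that the DataFrame returned from a history request is timestamped with the end of each bar's period, whereas the timestamps in the file mark the start of the period. That is why the daily bar stamped 20211019 00:00 in the file appears as 2021-10-20 in the history result.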
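The start-versus-end relationship can be sketched in plain Python. This is an illustration under stated assumptions, not LEAN code: `parse_daily_line` is a hypothetical helper, the period is taken to be exactly one calendar day, and the column order (open, high, low, close, volume) and the 10,000 price scale are inferred from the file line and results posted in the question.

```python
from datetime import datetime, timedelta

# Assumption: prices in the file are the real prices multiplied by 10,000,
# which matches 73300 -> 7.33 in the question.
PRICE_SCALE = 10_000


def parse_daily_line(line):
    """Parse one daily CSV line and compute the bar's start and end times."""
    stamp, o, h, l, c, v = line.strip().split(",")
    start = datetime.strptime(stamp, "%Y%m%d %H:%M")  # timestamp in the file
    return {
        "start": start,
        "end": start + timedelta(days=1),  # assumed one-day bar period
        "open": int(o) / PRICE_SCALE,
        "high": int(h) / PRICE_SCALE,
        "low": int(l) / PRICE_SCALE,
        "close": int(c) / PRICE_SCALE,
        "volume": float(v),
    }


bar = parse_daily_line("20211019 00:00,73300,73900,69300,69600,1")
print(bar["start"].date(), "->", bar["end"].date())
```

Under these assumptions the file row dated 2021-10-19 ends on 2021-10-20, which is the index value the history DataFrame shows for it.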
Best,
Varad Kabade