I'm using the Lean CLI to research daily historical data that I created manually and stored in Lean's “data” directory under the appropriate subdirectories. In this example, the file is data/equity/usa/daily/cron.zip.

I preview the data using the following script:

  qb = QuantBook()
  cron = qb.AddEquity("CRON")
  qb.History(cron.Symbol, 100, Resolution.Daily)

Here are the file data and results in 3 different situations:

A)

  File data:

  20211019 00:00,73300,73900,69300,69600,1

  Result:

              close  high   low  open  volume
  time
  2021-10-20   6.96  7.39  6.93  7.33     1.0
  2021-10-21   6.96  7.39  6.93  7.33     0.0
  2021-10-22   6.96  7.39  6.93  7.33     0.0

B)

  File data:

  20211021 00:00,73300,73900,69300,69600,1

  Result:

              close  high   low  open  volume
  time
  2021-10-22   6.96  7.39  6.93  7.33     1.0

C)

  File data:

  20211019 00:00,73300,73900,69300,69600,1
  20211021 00:00,73300,73900,69300,69600,2

  Result:

              close  high   low  open  volume
  time
  2021-10-20   6.96  7.39  6.93  7.33     1.0
  2021-10-21   6.96  7.39  6.93  7.33     0.0
  2021-10-22   6.96  7.39  6.93  7.33     2.0

There's some interpolation going on, as well as some time shifting, and I can't make heads or tails of why. I've tried reading the docs on how to properly format the CSV files, but it's not clear what timezone the timestamps should be in (what does “…in the timezone of the data format” mean?). Also, why were two additional dates created in example “A”?
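To make the time shift concrete, here is a minimal Python sketch of the mapping I seem to be observing between one line of the file and one row of the result. The function name `parse_daily_line` and the `observed_index` key are my own, not part of Lean. The divide-by-10,000 scaling and the one-day offset are only what the output above suggests, not something I have confirmed from the docs.

```python
from datetime import datetime, timedelta

def parse_daily_line(line, scale=10000.0):
    """Parse one 'YYYYMMDD HH:MM,open,high,low,close,volume' line.

    Hypothetical helper: it mirrors the behaviour seen in the results above
    (prices divided by 10,000, index one day after the file timestamp).
    """
    stamp, o, h, l, c, v = line.strip().split(",")
    file_time = datetime.strptime(stamp, "%Y%m%d %H:%M")
    return {
        "file_time": file_time,
        # Assumption: the history index appears one day after the file timestamp.
        "observed_index": file_time + timedelta(days=1),
        "open": int(o) / scale,
        "high": int(h) / scale,
        "low": int(l) / scale,
        "close": int(c) / scale,
        "volume": float(v),
    }

row = parse_daily_line("20211019 00:00,73300,73900,69300,69600,1")
print(row["observed_index"].date(), row["close"], row["high"], row["low"], row["open"], row["volume"])
```

This reproduces the first row of example “A”, but it does not explain the extra rows with zero volume.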

I download the data using ib_insync, which by default uses the timezone TWS is logged in with, so I know what timezone to convert from, just not what to convert to. Should I also specify an hour that would keep the resulting dates in line with the trading calendar?
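For reference, the conversion step I have in mind looks roughly like the sketch below. Both timezone names are placeholders: "Europe/London" stands in for whatever timezone TWS is logged in with, and "America/New_York" is only my guess at the target, which is exactly the part I'm unsure about. The helper `to_lean_timestamp` is hypothetical and uses only the standard library (`zoneinfo`, Python 3.9+).

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # may need the 'tzdata' package on Windows

def to_lean_timestamp(naive_local, source_tz, target_tz):
    """Convert a naive timestamp from source_tz to target_tz and
    format it as 'YYYYMMDD HH:MM', the layout used in my CSV files."""
    aware = naive_local.replace(tzinfo=ZoneInfo(source_tz))
    converted = aware.astimezone(ZoneInfo(target_tz))
    return converted.strftime("%Y%m%d %H:%M")

# Assumption: TWS timezone is Europe/London, target is America/New_York.
stamp = to_lean_timestamp(
    datetime(2021, 10, 19, 5, 0), "Europe/London", "America/New_York"
)
print(stamp)  # 20211019 00:00
```

If the target timezone turns out to be something else, only the `target_tz` argument would need to change.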

Author

Jake Mitchell

October 2021