Data Format

Key Concepts

Introduction

From the beginning, LEAN has strived to use an open, human-readable data format - independent of any specific database or file format. From this core philosophy, we built LEAN to read its financial data from flat files on disk. Data compression is done in zip format, and all individual files are CSV or JSON.

The prices are expressed in the asset quote currency. For example, the value 0.06920 for ETHBTC is the amount of BTC, the quote currency, you need to buy 1 ETH.

When there is no activity for a security, the price is omitted from the file. Only new ticks and price changes are recorded.

Folder Structure

Data files are separated and nested in a few predictable layers:

  • Tick, Second and Minute: /data/securityType/marketName/resolution/ticker/date_tradeType.zip
  • Hour, Daily: /data/securityType/marketName/resolution/ticker.zip

The marketName value is used to separate different tradable assets with the same ticker. E.g. BTCUSDT is traded on multiple brokerages all with slightly different prices.

Price Representation

The prices are expressed in the asset quote currency. For example, the value 0.06920 for ETHBTC is the amount of BTC, the quote currency, you need to buy 1 ETH.

When there is no activity for a security, the price is omitted from the file. Only new ticks and price changes are recorded.

You can also see our Videos. You can also get in touch with us via Discord.

Did you find this page helpful?

Contribute to the documentation: