Importing Data

Key Concepts

Introduction

Custom data is your own external data that's not from the Dataset Market. You can use custom data to inform trading decisions and to simulate trades on unsupported securities. To get custom data into your algorithms, you either download the entire file at once or read it line by line with a custom data reader. If you use a custom data reader, LEAN sends the data to the OnData (C#) or on_data (Python) method in your algorithm.

File Providers

The most common file providers to use are the Object Store, Dropbox, GitHub, and Google Sheets.

Object Store

The Object Store is the fastest file provider. If you import files from remote providers, you will be restricted by their rate limits and your download speed.

Dropbox

If you store your custom data in Dropbox, create a link to the file and add ?dl=1 to the end of the file URL. If the link already contains dl=0, change it to dl=1. To create file links, see How to share files or folders in the Dropbox documentation. The ?dl=1 parameter makes the link download the file itself instead of opening the HTML page of the file.
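The link adjustment described above can be sketched as a small helper. This is a minimal illustration using only the Python standard library. The function name to_direct_dropbox_link and the sample share link are hypothetical, and the helper assumes that any other query parameters on the link should be preserved.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_direct_dropbox_link(share_url: str) -> str:
    """Return share_url with its dl query parameter forced to 1."""
    parts = urlsplit(share_url)
    # Keep every existing parameter except dl, then append dl=1.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "dl"]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical share link; substitute your own.
print(to_direct_dropbox_link("https://www.dropbox.com/s/abc123/data.csv?dl=0"))
```

You can then pass the returned URL to your custom data type or to the download method.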

GitHub

If you store your custom data in GitHub, you can use public or private repositories. When you download the data, use the raw file (for example, https://raw.githubusercontent.com/<organization>/<repo>/<path>). For instructions on accessing the raw file, see Viewing or copying the raw file content in the GitHub documentation.
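As a rough sketch, you can derive the raw URL from the address of the file page in your browser. The helper below assumes the file page URL has the form https://github.com/<organization>/<repo>/blob/<branch>/<path>, so the <path> part of the raw URL begins with the branch name. The function name to_raw_github_url and the sample repository are hypothetical.

```python
from urllib.parse import urlsplit


def to_raw_github_url(blob_url: str) -> str:
    """Convert a GitHub file page URL into the matching raw file URL."""
    parts = urlsplit(blob_url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: <organization>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"Not a GitHub file page URL: {blob_url}")
    organization, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([organization, repo, *rest])


# Hypothetical repository and file; substitute your own.
print(to_raw_github_url("https://github.com/my-org/my-repo/blob/main/data/prices.csv"))
```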

Google Sheets

If you store your custom data in Google Sheets, you need to create a link to the file. To create file links, see Make Google Docs, Sheets, Slides & Forms public in the Google Docs documentation. Choose the CSV format so that the link ends with the /pub?gid={SHEET_ID}&single=true&output=csv parameters. Google Sheets doesn't support exporting data in JSON format.

Stream Custom Data

To receive your custom data in the OnData (C#) or on_data (Python) method, create a custom type and then create a data subscription. The custom data type tells LEAN where to get your data and how to read it.

All custom data types must extend the BaseData (C#) or PythonData (Python) class and override the GetSource (get_source in Python) and Reader (reader in Python) methods.

public class MyCustomDataType : BaseData
{
    public override DateTime EndTime { get; set; }
    public decimal Property1 { get; set; } = 0;

    public override SubscriptionDataSource GetSource(
        SubscriptionDataConfig config,
        DateTime date,
        bool isLive)
    {
        return new SubscriptionDataSource("<sourceURL>", SubscriptionTransportMedium.RemoteFile);
    }

    public override BaseData Reader(
        SubscriptionDataConfig config,
        string line,
        DateTime date,
        bool isLive)
    {
        // Skip blank lines and lines that don't start with a digit, such as headers.
        if (string.IsNullOrWhiteSpace(line.Trim()) || !char.IsDigit(line[0]))
        {
            return null;
        }

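        var data = line.Split(',');
        // Parse the timestamp first. Inside the initializer, Time would refer to this instance, not the new object.
        var time = DateTime.ParseExact(data[0], "yyyyMMdd", CultureInfo.InvariantCulture);
        return new MyCustomDataType()
        {
            Time = time,
            EndTime = time.AddDays(1),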
            Symbol = config.Symbol,
            Value = data[1].IfNotNullOrEmpty(
                s => decimal.Parse(s, NumberStyles.Any, CultureInfo.InvariantCulture)),
            Property1 = data[2].IfNotNullOrEmpty(
                s => decimal.Parse(s, NumberStyles.Any, CultureInfo.InvariantCulture))
        };
    }
}

class MyCustomDataType(PythonData):
    def get_source(self,
            config: SubscriptionDataConfig,
            date: datetime,
            is_live: bool) -> SubscriptionDataSource:
        return SubscriptionDataSource("<sourceURL>", SubscriptionTransportMedium.REMOTE_FILE)

    def reader(self,
            config: SubscriptionDataConfig,
            line: str,
            date: datetime,
            is_live: bool) -> BaseData:
        # Skip blank lines and lines that don't start with a digit, such as headers.
        if not (line.strip() and line[0].isdigit()):
            return None

        data = line.split(',')

        custom = MyCustomDataType()
        custom.symbol = config.symbol
        custom.time = datetime.strptime(data[0], '%Y%m%d')
        custom.end_time = custom.time + timedelta(1)
        custom.value = float(data[1])
        custom["Property1"] = float(data[2])
        return custom

For more information about custom data types, see Streaming Data.
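To check the parsing logic of a reader before you run it in LEAN, it can help to isolate it in a plain function. The following is a standalone sketch with no LEAN dependencies. It assumes the same line layout as the example above (a yyyyMMdd date, a value, and one extra property, separated by commas). The function name parse_line and the sample lines are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_line(line: str) -> Optional[dict]:
    """Parse one 'yyyyMMdd,value,property1' line; return None for lines to skip."""
    line = line.strip()
    # Skip blank lines and header rows (anything that doesn't start with a digit).
    if not line or not line[0].isdigit():
        return None
    fields = line.split(",")
    time = datetime.strptime(fields[0], "%Y%m%d")
    return {
        "time": time,
        "end_time": time + timedelta(days=1),  # Daily bar: ends one day after it starts.
        "value": float(fields[1]),
        "property1": float(fields[2]),
    }


# Hypothetical file content: a header row, one data row, and a blank line.
sample = ["date,value,property1", "20240102,101.5,3", ""]
rows = [row for row in map(parse_line, sample) if row is not None]
print(rows)
```

Only the data row survives. The header and the blank line return None, which mirrors how the reader method signals that a line holds no data.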

Download Bulk Data

The Download (C#) or download (Python) method downloads the content served from a local file or URL and then returns it as a string.

var content = Download("<filePathOrURL>");
content = self.download("<filePathOrURL>")

For more information about bulk downloads, see Bulk Downloads.
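Because the method returns the whole file as one string, you parse it yourself. The sketch below shows one way to do that for CSV content with the Python standard library. The content string stands in for the value returned by the download call, and the column names and the parse_csv_content helper are hypothetical.

```python
import csv
import io
from typing import Dict, List


def parse_csv_content(content: str) -> List[Dict[str, str]]:
    """Turn downloaded CSV text into a list of row dictionaries keyed by header."""
    reader = csv.DictReader(io.StringIO(content))
    return list(reader)


# Stand-in for the string returned by the download call.
content = "date,value\n20240102,101.5\n20240103,102.25\n"
rows = parse_csv_content(content)

# Index the values by date for quick lookups later in the algorithm.
values_by_date = {row["date"]: float(row["value"]) for row in rows}
print(values_by_date)
```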

File Quotas

There are no limits to the number of files you can load from the Object Store during a single backtest or Research Environment session in QuantConnect Cloud.

The following table shows the number of remote files you can download during a single backtest or Research Environment session in QuantConnect Cloud:

Organization Tier    File Quota
Free                 25
Quant Researcher     100
Team                 250
Trading Firm         1,000
Institution          Unlimited

Remote files can be up to 200 MB in size and can have names up to 200 characters long.

Rate Limits

We do not impose a rate limit on file downloads, but external providers often do. For example, Dropbox caps download speeds to 10 kb/s after 3-4 download requests. To keep your algorithms fast, use only a few small custom data files or use the Object Store.
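One way to keep the number of remote requests low is to fetch each URL only once per run and reuse the result. The following is a minimal sketch of that idea, not a LEAN feature. The CachedDownloader class is hypothetical, and the fake download function only stands in for a real download call so the example runs on its own.

```python
from typing import Callable, Dict, List


class CachedDownloader:
    """Wrap a download function so each URL is fetched at most once per run."""

    def __init__(self, download_fn: Callable[[str], str]) -> None:
        self._download_fn = download_fn
        self._cache: Dict[str, str] = {}

    def get(self, url: str) -> str:
        # Only call the underlying download function on a cache miss.
        if url not in self._cache:
            self._cache[url] = self._download_fn(url)
        return self._cache[url]


# Demonstration with a fake download function that records its calls.
calls: List[str] = []


def fake_download(url: str) -> str:
    calls.append(url)
    return f"content of {url}"


downloader = CachedDownloader(fake_download)
downloader.get("https://example.com/a.csv")
downloader.get("https://example.com/a.csv")  # Served from the cache.
print(len(calls))
```

In an algorithm, you could pass self.download as the download function, assuming you create the wrapper once and reuse it.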

Timeouts

In cloud algorithms, the download methods have a 10-second timeout period. If the methods don't download the data within 10 seconds, LEAN throws an error.
