G'day QuantConnect team!
I am trying to import a custom dataset of ~100,000 rows - from Dropbox - as a SubscriptionDataSource. However, I noticed that QuantConnect's back tester doesn't like this. It simply gives up and loads no data.
What are the size limits for custom datasets, and is there a way I can avoid these limitations? For example, could I use a different website or somehow import the data locally?
Cheers!
(Please note a dataset of ~60,000 rows works fine)
Mia Alissi
Hello!
QuantConnect allows you to import custom data for your algorithms. However, there are certain limitations you need to be aware of. If you store your custom data in Google Sheets, you need to create a link to the file in CSV format. Google Sheets doesn't support exporting data in JSON format.
The number of files you can download during a single backtest or Research Environment session in QuantConnect Cloud depends on your organization tier. For example, a free tier allows 25 files, a Quant Researcher tier allows 100 files, and so on. Each file can be up to 200 MB in size and have a file name up to 200 characters long.
If you need to import more files than your quota allows, you can save your custom data files in the Object Store and load them from there. There is no rate limit imposed on file downloads by QuantConnect, but often external providers do. For instance, Dropbox caps download speeds to 10 kb/s after 3-4 download requests. To ensure your algorithms run fast, it's recommended to use a small number of small custom data files.
In cloud algorithms, the download methods have a 10-second timeout period. If the methods don't download the data within 10 seconds, LEAN throws an error.
If you're facing issues with larger datasets, you might want to consider splitting your data into smaller files or using a different data storage solution that doesn't have such limitations.
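As a rough illustration of the splitting idea, here is a minimal Python sketch you could run on your own machine before uploading the parts. The function name split_csv, the part-naming scheme, and reading "200 MB" as 200 * 1024 * 1024 bytes are assumptions made for this example. It also assumes one record per line (no quoted multi-line fields).

```python
import os


def split_csv(src_path, out_dir, max_bytes=200 * 1024 * 1024):
    """Split a CSV into parts that aim to stay under max_bytes each.

    The header row is repeated at the top of every part. A single row that is
    larger than max_bytes still gets written, so treat the limit as a target.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.splitext(os.path.basename(src_path))[0]
    parts = []
    out = None
    written = 0
    try:
        with open(src_path, "rb") as src:
            header = src.readline()
            for line in src:
                # Open a new part when this row would push the current one over the limit.
                if out is None or written + len(line) > max_bytes:
                    if out is not None:
                        out.close()
                    part_path = os.path.join(out_dir, f"{base}_part{len(parts) + 1}.csv")
                    parts.append(part_path)
                    out = open(part_path, "wb")
                    out.write(header)
                    written = len(header)
                out.write(line)
                written += len(line)
    finally:
        if out is not None:
            out.close()
    return parts
```

Calling split_csv("my_data.csv", "parts") would write my_data_part1.csv, my_data_part2.csv, and so on into the parts folder and return their paths (the file names here are placeholders).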
For more details, you can refer to the QuantConnect Documentation.
Here's an example of how you can import custom data from a local file:
I hope this helps! Let me know if you have any other questions.
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Tom Frew
Thanks a lot Mia! Do the data limitations apply to local files? And how could I get a local file into my data folder?
Mia Alissi
The data limitations mentioned earlier primarily apply to data downloaded from external sources during a backtest or Research Environment session in QuantConnect Cloud. When running the LEAN engine locally, you can use your own local data, including custom data.
To use local data, you need to copy the data file you want to use to the data directory in your organization workspace. You can configure the data directory to use in the data-folder property in your Lean configuration file. The path this property is set to is used as the data directory by all commands that run the LEAN engine locally. By default, this property points to the data directory inside your organization workspace. If this property is set to a relative path, it is resolved relative to the Lean configuration file's parent directory.
Here's an example of how you can load data from a local file in your data directory:
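The original snippet is not preserved in this thread, so what follows is only a minimal sketch of what such a custom data class might look like. The class name CustomWeatherData and the two-column row layout (date, temperature) are assumptions made for illustration. The small stand-in classes in the except branch are not part of LEAN; they are there only so the sketch can be run outside the engine.

```python
import os
from datetime import datetime, timedelta
from types import SimpleNamespace

try:
    # Inside LEAN, this one import supplies PythonData, Globals,
    # SubscriptionDataSource and SubscriptionTransportMedium.
    from AlgorithmImports import *
except ImportError:
    # Stand-ins (NOT part of LEAN) so the sketch can also run outside the engine.
    class PythonData(dict):
        pass

    class SubscriptionDataSource:
        def __init__(self, source, transport_medium):
            self.Source = source
            self.TransportMedium = transport_medium

    SubscriptionTransportMedium = SimpleNamespace(LocalFile="LocalFile")
    Globals = SimpleNamespace(DataFolder="data")


class CustomWeatherData(PythonData):
    """Hypothetical custom data type: one CSV row per day, 'YYYY-MM-DD,temperature'."""

    def GetSource(self, config, date, isLiveMode):
        # Build the path from the configured data directory.
        path = os.path.join(Globals.DataFolder, "custom_weather_data.csv")
        return SubscriptionDataSource(path, SubscriptionTransportMedium.LocalFile)

    def Reader(self, config, line, date, isLiveMode):
        # Skip blank lines and the header row.
        if not (line.strip() and line[0].isdigit()):
            return None
        fields = line.split(",")
        row = CustomWeatherData()
        row.Symbol = config.Symbol
        row.Time = datetime.strptime(fields[0], "%Y-%m-%d")
        row.EndTime = row.Time + timedelta(days=1)
        row.Value = float(fields[1])
        return row
```

In an algorithm, a class like this would typically be subscribed to in Initialize with something along the lines of self.AddData(CustomWeatherData, "WEATHER", Resolution.Daily), where the ticker "WEATHER" is a placeholder.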
In this example, Globals.DataFolder is used to get the path to the data directory. You need to replace "custom_weather_data.csv" with the path to your data file in the data directory.
For more information, please refer to the QuantConnect Documentation.
Tom Frew
I really appreciate that! Is there any way I can do this outside of LEAN - i.e. on the web version, in the same directory as my main.py file?
Mia Alissi
Unfortunately, the web version of QuantConnect does not support direct access to local files in the same directory as your main.py file. The platform is designed to access data from the QuantConnect data library or from external sources via the web.
However, you can upload your custom data to the QuantConnect platform and access it from there. To do this, you can use the Download method to access data from a URL, or you can use the QuantConnect.Python class to access data from a local file.
If you want to use data from Dropbox, you can use the Download method with the Dropbox link as the argument. However, please note that Dropbox caps download speeds to 10 kb/s after 3-4 download requests, so it's recommended to use a small number of small custom data files.
For more information, please refer to the QuantConnect Documentation.
Tom Frew
Alright then. How about splitting a 300 MB dataset into two parts and loading them separately? Would it be possible to do this for a SubscriptionDataSource?
Mia Alissi
Yes, you can split your large dataset into smaller parts and load them separately as SubscriptionDataSource instances. You would need to create a separate SubscriptionDataSource instance for each part of your split dataset.
Here's an example of how you can do this:
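The original snippet is missing from this thread, so the following is only a sketch of one way it could be done: a shared base class plus one small subclass per file, so that each part gets its own SubscriptionDataSource. The class names SplitCsvData, Part1Data and Part2Data and the two-column row layout (date, value) are assumptions for illustration. The stand-ins in the except branch are not part of LEAN; they only let the sketch run outside the engine.

```python
import os
from datetime import datetime, timedelta
from types import SimpleNamespace

try:
    # Inside LEAN, this one import supplies PythonData, Globals,
    # SubscriptionDataSource and SubscriptionTransportMedium.
    from AlgorithmImports import *
except ImportError:
    # Stand-ins (NOT part of LEAN) so the sketch can also run outside the engine.
    class PythonData(dict):
        pass

    class SubscriptionDataSource:
        def __init__(self, source, transport_medium):
            self.Source = source
            self.TransportMedium = transport_medium

    SubscriptionTransportMedium = SimpleNamespace(LocalFile="LocalFile")
    Globals = SimpleNamespace(DataFolder="data")


class SplitCsvData(PythonData):
    """Hypothetical base type; subclasses only say which part file they read."""

    FileName = None  # set by each subclass

    def GetSource(self, config, date, isLiveMode):
        path = os.path.join(Globals.DataFolder, self.FileName)
        return SubscriptionDataSource(path, SubscriptionTransportMedium.LocalFile)

    def Reader(self, config, line, date, isLiveMode):
        # Skip blank lines and the header row.
        if not (line.strip() and line[0].isdigit()):
            return None
        fields = line.split(",")
        row = type(self)()  # same concrete type as the subscription
        row.Symbol = config.Symbol
        row.Time = datetime.strptime(fields[0], "%Y-%m-%d")
        row.EndTime = row.Time + timedelta(days=1)
        row.Value = float(fields[1])
        return row


class Part1Data(SplitCsvData):
    FileName = "part1.csv"


class Part2Data(SplitCsvData):
    FileName = "part2.csv"
```

Each part would then be added as its own subscription in Initialize, for example self.AddData(Part1Data, "PART1", Resolution.Daily) and self.AddData(Part2Data, "PART2", Resolution.Daily), where the tickers are placeholders.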
In this example, Globals.DataFolder is used to get the path to the data directory. You need to replace "part1.csv" and "part2.csv" with the paths to your data files in the data directory.
Please note that each file can be up to 200 MB in size and have a file name up to 200 characters long. For more information, please refer to the QuantConnect Documentation.
Tom Frew
You really know your stuff, Mia! Thanks so much, you've been a great help :)
Mia Alissi
You're welcome, Tom! I'm glad I could help. Don't hesitate to reach out if you have any more questions in the future. Happy coding!
Jared Broad
Hey Tom Frew! We can now import files from the Object Store for backtesting! I've asked the engineer responsible to post a demonstration here shortly. After ten downloads, Dropbox rate limits download requests down to 1-2kb/s, which means it's not a good solution for large files. The Object Store is local to the backtesting nodes and is accessible at 10GBs.
Jhonathan Abreu
Hi Tom Frew! I have updated this issue with some demo algorithms you can refer to about importing custom data from the object store. They showcase how to handle custom data imported from a single file, multiple files and zip files.
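For the single-file case, a minimal sketch might look like the following. It is an illustration under assumptions, not one of the demo algorithms mentioned above: the class name ObjectStoreCsvData, the key "custom_data.csv" and the two-column row layout (date, value) are all hypothetical. The stand-ins in the except branch are not part of LEAN; they only let the sketch run outside the engine.

```python
from datetime import datetime, timedelta
from types import SimpleNamespace

try:
    # Inside LEAN, this one import supplies PythonData,
    # SubscriptionDataSource and SubscriptionTransportMedium.
    from AlgorithmImports import *
except ImportError:
    # Stand-ins (NOT part of LEAN) so the sketch can also run outside the engine.
    class PythonData(dict):
        pass

    class SubscriptionDataSource:
        def __init__(self, source, transport_medium):
            self.Source = source
            self.TransportMedium = transport_medium

    SubscriptionTransportMedium = SimpleNamespace(ObjectStore="ObjectStore")


class ObjectStoreCsvData(PythonData):
    """Hypothetical custom data type read from an Object Store key."""

    Key = "custom_data.csv"  # placeholder key; use the key your file is stored under

    def GetSource(self, config, date, isLiveMode):
        # The source is the Object Store key rather than a URL or file path.
        return SubscriptionDataSource(self.Key, SubscriptionTransportMedium.ObjectStore)

    def Reader(self, config, line, date, isLiveMode):
        # Skip blank lines and the header row.
        if not (line.strip() and line[0].isdigit()):
            return None
        fields = line.split(",")
        row = ObjectStoreCsvData()
        row.Symbol = config.Symbol
        row.Time = datetime.strptime(fields[0], "%Y-%m-%d")
        row.EndTime = row.Time + timedelta(days=1)
        row.Value = float(fields[1])
        return row
```

The file has to exist in the Object Store under that key before the backtest reads it. One way to get it there, assuming you have the CSV content as a string, is to call self.ObjectStore.Save(key, csv_text) from an algorithm or a research notebook.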
Please let me know any questions or concerns you may have about it and I'll look into it ASAP.
Enjoy!