After finding this discussion, I got Lean working locally, but according to this post on the mailing list, you can't backtest locally against the online data source. Since at this point I'm just exploring Lean and potential algorithms, backtesting seems like the right way to learn and develop.
So what is the suggested workflow? Developing locally seems much more efficient, but then it looks like I need to get the code up to the Algorithm Lab for backtesting, and I haven't discovered any mechanism for easily moving code between my local system and the Algorithm Lab. Or maybe my thinking is way off track. What kinds of workflows would people suggest?
-- Wink
Alexandre Catarino
I would suggest using free daily data for the earliest stage of development, where you are coding up the algorithm logic and choosing the tools you need. At this stage, you will probably want debugging tools. To get daily data in Lean format, you can use the YahooDownloader to fetch Yahoo data.
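If I remember correctly, the downloader ships as part of the Lean ToolBox and is invoked from the command line, roughly like this (the exact flags and date format may differ between Lean versions, so treat this as a sketch):

```
QuantConnect.ToolBox.exe --app=YahooDownloader --tickers=SPY,AAPL --resolution=Daily --from-date=20150101-00:00:00 --to-date=20161231-00:00:00
```

The output is written into the configured Lean data folder in the usual layout, so local backtests pick it up automatically.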
After that, you can either copy and paste your code into the Algorithm Lab or use the API.
Wink Saville
The API looks fairly straightforward; I could envision creating a command-line interface to it. But is there a way to run Lean "interactively"? Right now Lean seems batch oriented, but maybe there are other APIs to make it interactive?
Regarding the YahooDownloader, what about using something like AddData<Yahoo>("AAPL")? Does that work "locally", or is that something different? I saw it in the "How Do I Import Data from Yahoo" example in the University here.
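The example I saw defines a custom BaseData type along these lines (reconstructed from memory, so the URL and column indices may be off, and Yahoo has been known to change its endpoints):

```csharp
using System;
using QuantConnect;
using QuantConnect.Data;

public class Yahoo : BaseData
{
    public override SubscriptionDataSource GetSource(SubscriptionDataConfig config, DateTime date, bool isLiveMode)
    {
        // Point Lean at a remote CSV file; Lean downloads it and replays it row by row.
        var url = "https://ichart.finance.yahoo.com/table.csv?s=" + config.Symbol.Value;
        return new SubscriptionDataSource(url, SubscriptionTransportMedium.RemoteFile);
    }

    public override BaseData Reader(SubscriptionDataConfig config, string line, DateTime date, bool isLiveMode)
    {
        // Skip the CSV header row and any blank lines.
        if (string.IsNullOrWhiteSpace(line) || !char.IsDigit(line[0])) return null;
        var csv = line.Split(',');
        return new Yahoo
        {
            Symbol = config.Symbol,
            Time = DateTime.Parse(csv[0]),
            Value = decimal.Parse(csv[6]) // adjusted close column
        };
    }
}

// Then, in the algorithm's Initialize():
//     AddData<Yahoo>("AAPL", Resolution.Daily);
```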
Petter Hansson
I compile locally in MSVS (and do source code versioning locally as well), then copy and paste the code into the QC terminal to run it. Some notes if you do this:
- Don't make source files too big (QC won't accept them; aim for at most ~800 lines); split the class across files with partial classes instead (see the sketch at the end of this post).
- QC remembers the exact code you used for a given backtest (you can recover it by cloning that backtest), so if you forget which version your code on QC corresponds to, there is a way to get it back.
Not entirely convenient, but it usually works for my purposes. The reason I run exclusively in the cloud like this is that I want to minimize the data-management hassle that comes with intraday algos.
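For reference, the partial-class trick mentioned above is just one algorithm class split across two files, which the compiler merges back into a single type, so each file stays under the size limit:

```csharp
using QuantConnect;
using QuantConnect.Algorithm;
using QuantConnect.Data;

// File 1: MyAlgorithm.cs -- setup half of the class.
public partial class MyAlgorithm : QCAlgorithm
{
    private Symbol _spy;

    public override void Initialize()
    {
        SetStartDate(2016, 1, 1);
        _spy = AddEquity("SPY", Resolution.Daily).Symbol;
    }
}

// File 2: MyAlgorithm.Trading.cs -- trading half; same namespace, same class name.
public partial class MyAlgorithm
{
    public override void OnData(Slice data)
    {
        if (!Portfolio.Invested) SetHoldings(_spy, 1.0);
    }
}
```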
Petter Hansson
I think the biggest limitation with running solely on QC is that you can't do arbitrarily CPU-intensive stuff during QC backtests (like ML training), nor is there persistence (moderately hackish workarounds aside).
Jared Broad
Interesting discussion, guys.
I'd love a way to automatically upload code to the cloud from Visual Studio. With our new API that is fairly easy to do. It could have a backtest button which opens a link in your default browser. If someone is interested in making this project I'd be happy to sponsor a $500 bounty or lifetime free subscription for a great open source submission.
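Roughly, the core of such a tool could look like the sketch below. Treat the method names as my recollection of Lean's Api class and the terminal URL as a placeholder; verify both against the current source.

```csharp
using System.IO;
using System.Linq;
using QuantConnect;
using QuantConnect.Api;

class PushAndBacktest
{
    static void Main()
    {
        // Connect to the QuantConnect API with your account credentials.
        var api = new Api();
        api.Initialize(userId: 12345, token: "your-api-access-token", dataFolder: "./Data");

        // Create a project, push the local source file, compile, and start a backtest.
        var project = api.CreateProject("vs-upload-demo", Language.CSharp).Projects.First();
        api.AddProjectFile(project.ProjectId, "Main.cs", File.ReadAllText("Main.cs"));
        var compile = api.CreateCompile(project.ProjectId);
        var backtest = api.CreateBacktest(project.ProjectId, compile.CompileId, "vs-upload-demo");

        // The "backtest button": open the results page in the default browser.
        System.Diagnostics.Process.Start("https://www.quantconnect.com/terminal/#open/" + backtest.BacktestId);
    }
}
```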
@Petter, I'd love to understand those two limitations better:
- "you can't do arbitarily CPU intensive stuff during QC backtests (like ML training)" - do you get timeouts?
- "nor is there persistence (moderately hackish workarounds aside)." - e.g. for storing training data?
Those are both solvable; I didn't even think about persistence. It's a good idea.
Thanks
Keith K
My workflow is nearly identical to Petter Hansson's.
Yes persistence would be a great feature.
Two massive projects on my to-do list are 30-day 'wash sale' avoidance and 'tax loss harvesting'. Both are going to require persistent data that tracks the 30-day wait period on a per-equity basis. This code will not be fun to write... and it will be especially tedious to integrate into an algorithm strategy that prefers to trade often.
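To sketch the kind of state such a tracker needs (hypothetical names; nothing like this ships with Lean), a per-symbol record of the most recent sale-at-a-loss date is enough to flag the 30-day window:

```csharp
using System;
using System.Collections.Generic;
using QuantConnect;

// Hypothetical per-equity wash-sale tracking state; this would have to be
// persisted across restarts, which is exactly the persistence gap discussed above.
public class WashSaleTracker
{
    // Last date each symbol was sold at a loss.
    private readonly Dictionary<Symbol, DateTime> _lastLossSale = new Dictionary<Symbol, DateTime>();

    public void RecordSale(Symbol symbol, DateTime date, decimal realizedProfit)
    {
        if (realizedProfit < 0) _lastLossSale[symbol] = date;
    }

    // Repurchasing within 30 days of a loss sale triggers the wash-sale rule.
    public bool IsWashSaleRisk(Symbol symbol, DateTime buyDate)
    {
        DateTime lossDate;
        return _lastLossSale.TryGetValue(symbol, out lossDate)
            && (buyDate - lossDate).TotalDays <= 30;
    }
}
```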
Which broker makes it easiest to complete Schedule D at tax time using TurboTax or H&R Block?
An alternative is 'mark-to-market' accounting. Has anyone here actually gone this route? I really wish there were a turn-key product that let me create my own personal 'mutual fund': daily rebalancing of a portfolio of ~40 stocks, then turn-key 'mark-to-market' tax accounting at the end of the year. However, scouring the web didn't yield promising results.
Petter Hansson
Jared, by "arbitrarily CPU intensive stuff" I primarily mean operations like ML training batch jobs, which could potentially take a lot of time; I suppose a timeout would be the manifestation of that. I don't dare spend time implementing some algorithms on QC at present because I know I can't scale CPU time in the cloud here if I run into that bound.
Even if I don't expect CPU to be a problem, persistence probably is. For my own requirements: the simplest thing that would help would be a virtual filesystem with read/write access through regular IO streams, scoped per user (visible to all projects) and/or per project (and possibly per backtest, for extended logging), or a key-value store with similar semantics. At first, only API access would be required, but it would of course be useful to see it in the GUI as well. My main use case at this point is persisting large trained ML state; logging isn't suited to this at all, and writing to external sockets is needlessly complicated, with lots of things that can go wrong.
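To make that concrete, the semantics I have in mind could be as small as this (a purely hypothetical interface; no such API exists in Lean today):

```csharp
using System.IO;

// Hypothetical persistence API, scoped per user and/or per project.
public interface IAlgorithmStore
{
    // Key-value semantics: persist a blob (e.g. serialized ML model state) under a key.
    void Write(string key, byte[] value);
    byte[] Read(string key);
    bool ContainsKey(string key);

    // Stream access for large objects, mirroring a virtual filesystem.
    Stream OpenWrite(string path);
    Stream OpenRead(string path);
}
```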
Petter Hansson
(An additional complication of course is that I would need some way to deploy a copy of the same data on the live VPS connecting to the brokerage.)
Andrew Hart
I'd like to point out a recent addition to Lean that makes backtesting locally more convenient. First, get all the data you need from the QuantConnect Data Library; you don't have to download it manually, just make sure it's in your data library. In config.json, change "data-file-provider" from "DefaultDataFileProvider" to "ApiDataFileProvider", and be sure to include your "job-user-id" and "api-access-token". If all the data your algorithm requires is in your QC data library, the ApiDataFileProvider will reach into the cloud and download the data to disk as needed, if it doesn't already find it on your disk.
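The relevant config.json entries would look roughly like this (key names as described above; the placeholder values are yours to fill in):

```json
{
    "data-file-provider": "ApiDataFileProvider",
    "job-user-id": "12345",
    "api-access-token": "your-api-access-token"
}
```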
Hemant Verma
Any idea how to import data into my data folder? I am getting an error that I don't have any data in my data folder when using the ApiDataFileProvider.
Jared Broad
You can request access to FX and CFD data from the QuantConnect Data Library. We're not allowed to share Equities data yet, but we hope it will be available soon!
AlphAngel
Regarding persistence: this would be very useful and a real time saver, especially if you could share persisted data/models between the research environment and the Algorithm Lab. You could imagine doing your research, figuring out models or parameters, and then reusing them in your backtests / live algos.
A simple use case for low-capacity storage persistence for me: I have an algo with several signal types. If I stop and restart my live algo, I would like to know which signal was used to open a position so that I can use the same signal to close it.
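In the meantime, one workaround that already works in Lean is tagging the entry order with the signal name and reading the tag back from the transaction history. A sketch, assuming a string identifier per signal and that the order history survives the restart (within a session it does; across live redeploys this may depend on the brokerage implementation):

```csharp
// Inside a QCAlgorithm (needs using System.Linq; and using QuantConnect.Orders;).
// Record which signal opened the position by tagging the entry order.
MarketOrder(symbol, quantity, asynchronous: false, tag: "mean-reversion-v2");

// Later (e.g. after a restart), recover the tag from the most recent filled entry.
var entry = Transactions.GetOrders(o => o.Symbol == symbol && o.Status == OrderStatus.Filled)
                        .OrderByDescending(o => o.Time)
                        .FirstOrDefault();
var openingSignal = entry == null ? null : entry.Tag; // e.g. "mean-reversion-v2"
```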