Hello All,
When I test using dates in 2014, as provided in some examples, my CoarseSelectionFunction returns records as expected.
However, when I change the date to a later date, such as in 2018 or 2019, I get 0 records.
I think that the default Dataset is “US Coarse Universe Dataset by QuantConnect”.
For example with the following code I get back 7120 records in my coarse universe:
public override void Initialize()
{
    UniverseSettings.Resolution = Resolution.Minute;
    SetStartDate(2014, 04, 02);
    SetEndDate(2014, 04, 02);
    SetCash(50000);
    AddUniverse(CoarseSelectionFunction);
}
Then, for the following I get back 0 records in my coarse universe:
public override void Initialize()
{
    UniverseSettings.Resolution = Resolution.Minute;
    SetStartDate(2016, 04, 06);
    SetEndDate(2016, 04, 06);
    SetCash(50000);
    AddUniverse(CoarseSelectionFunction);
}
I have tried dates in later years, 2018, 2019, with the same result, no data.
What am I doing wrong? Thoughts and ideas?
Thanks,
Eric
Fred Painchaud
Hi Eric,
What is the code for your CoarseSelectionFunction() method?
Have you changed your code so you have more than 1 day of backtest? Maybe…
Fred
Eric Schmidt
Thanks for looking.
I just tried this:
public override void Initialize()
{
    UniverseSettings.Resolution = Resolution.Minute;
    SetStartDate(2018, 04, 10);
    SetEndDate(2018, 04, 12);
    SetCash(50000);
    AddUniverse(CoarseSelectionFunction);
}

// sort the data by daily dollar volume and take the top 'NumberOfSymbolsCoarse'
public IEnumerable<Symbol> CoarseSelectionFunction(IEnumerable<CoarseFundamental> coarse)
{
    int howManyTotal = coarse.Count();

    // select only symbols with fundamental data and sort descending by daily dollar volume
    var sortedByDollarVolume = coarse
        .Where(x => x.HasFundamentalData)
        .OrderByDescending(x => x.DollarVolume);
    howManyTotal = sortedByDollarVolume.Count();

    // take the top entries from our sorted collection
    var topCoarse = sortedByDollarVolume.Take(NumberOfSymbolsCoarse);

    // we need to return only the symbol objects
    return topCoarse.Select(x => x.Symbol);
}
It feels more like I just do not understand the Dataset. Is it possible that this Dataset is only available for backtesting dates all the way back in 2014?
With any later year, the coarse parameter passed to CoarseSelectionFunction contains no values.
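To pin down which stage comes back empty, it can help to log the count at each step of the selection. Below is a stand-alone Python sketch of the same filter, sort, and take steps. It does not use LEAN: CoarseRow is a hypothetical stand-in for CoarseFundamental with only the three fields used here, and the sample rows are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CoarseRow:
    """Hypothetical stand-in for LEAN's CoarseFundamental (assumption, not the real type)."""
    symbol: str
    has_fundamental_data: bool
    dollar_volume: float


def coarse_selection(coarse: Iterable[CoarseRow], limit: int) -> List[str]:
    # Materialize once so that counting does not consume a lazy input.
    rows = list(coarse)
    print(f"coarse rows received: {len(rows)}")

    # Keep only rows flagged as having fundamental data.
    with_fundamentals = [r for r in rows if r.has_fundamental_data]
    print(f"rows with fundamental data: {len(with_fundamentals)}")

    # Sort descending by dollar volume and take the top `limit` entries.
    ranked = sorted(with_fundamentals, key=lambda r: r.dollar_volume, reverse=True)
    top = ranked[:limit]
    print(f"rows selected: {len(top)}")

    # Return only the symbols.
    return [r.symbol for r in top]


# Made-up sample data, for illustration only.
sample = [
    CoarseRow("AAA", True, 5_000_000.0),
    CoarseRow("BBB", False, 9_000_000.0),
    CoarseRow("CCC", True, 7_500_000.0),
]
selected = coarse_selection(sample, limit=2)
empty_selected = coarse_selection([], limit=2)
```

If the first printed count is already 0, the selection logic is not the problem: the input itself arrived empty.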
Fred Painchaud
Misunderstanding the dataset is possible, but honestly I don't see what there is to misunderstand here. Your code looks OK, at least to me. Of course, the value of NumberOfSymbolsCoarse needs to be set somewhere, but I'm guessing it is, since it works in 2014.
Are you saying that with dates later than 2014, the first line in the coarse selection, int howManyTotal = coarse.Count();, gives howManyTotal == 0? Or is it the second howManyTotal that becomes 0? Or is it topCoarse that is empty?
Sorry I cannot point you directly to something wrong; it is as much a surprise to me as it is to you. But I really doubt there is something wrong with the dataset after 2014. I've seen multiple working examples of Universe Selection where the dates used were in 2019, for instance. The only difference I can spot right now is your start and end dates. I totally agree that it should not make a difference, but just for the sake of it, would you try something like 2018, 1, 1 to 2019, 1, 1, only to see if you then get some output? If so, I believe you could file a bug report along the lines of "it works for a year but not for 2 days". You could even fiddle a bit with the dates to see when it starts to work: try a week, a month, 3 months, 6 months, etc. You never know…
The type of things I do when I really don't know what is going on 😊…
F
Eric Schmidt
Correct. At the top in the class I have:
private const int NumberOfSymbolsCoarse = 6000;
But it never gets to the point where it might limit to that.
When the parameter coarse first enters the method:
public IEnumerable<Symbol> CoarseSelectionFunction(IEnumerable<CoarseFundamental> coarse)
{ …
I can inspect it and see its value. When I am using the 2014 dates, I get:
Results View = Expanding the Results View will enumerate the IEnumerable
and I can see the collection:
When I change to a later year, I get:
Empty = "Enumeration yielded no results"
Fred Painchaud
Weird. ¯\_(ツ)_/¯
You can wait for someone else to answer here or file a bug report. I guess that bug reports should be filed here:
https://github.com/QuantConnect/Lean/issues
Do some screening first to make sure your bug is not already known/reported…
If it is not a bug and simply something we don't understand, you'll know pretty quickly. If so, I would recommend that you report back here so the solution/answer is included in the thread.
F
Eric Schmidt
So it looks like I will need to use the web UI in order to have access to data.
I will need to rethink how I will go about this. I did reach out to Support. I have learned that free cloud access to the US Coarse Universe Dataset by QuantConnect applies only when I run the algorithm here in the web UI.
I jumped right to using Visual Studio because I like the IDE and I liked the idea of local control of my code, logging, notifications, debugging etc.
I have been able to test in the LeanConnect solution with both C# and Python algorithms. This worked because the example algorithms were coded with the dates for available data. LeanConnect running locally only has access to two days of coarse fundamental data.
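A quick way to see which dates a local checkout can serve is to list the coarse files on disk. The Python sketch below assumes that coarse data is stored as one YYYYMMDD.csv file per day under equity/usa/fundamental/coarse inside the data folder. That layout is an assumption; adjust COARSE_SUBDIR if your copy differs. The function names are illustrative, not part of LEAN.

```python
from datetime import date, datetime
from pathlib import Path
from typing import List

# Assumed layout of a local LEAN data folder; adjust to match your checkout.
COARSE_SUBDIR = Path("equity") / "usa" / "fundamental" / "coarse"


def available_coarse_dates(data_root: Path) -> List[date]:
    """Return the sorted dates for which a YYYYMMDD.csv coarse file exists."""
    folder = data_root / COARSE_SUBDIR
    if not folder.is_dir():
        return []
    found = []
    for path in folder.glob("*.csv"):
        try:
            found.append(datetime.strptime(path.stem, "%Y%m%d").date())
        except ValueError:
            continue  # skip files that are not named by date
    return sorted(found)


def has_coarse_data(data_root: Path, start: date, end: date) -> bool:
    """True if at least one coarse file falls inside [start, end]."""
    return any(start <= d <= end for d in available_coarse_dates(data_root))
```

Calling has_coarse_data with the backtest start and end dates before running locally shows right away whether the coarse universe can be anything other than empty.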
Fred,
In a different thread you said:
If you add your LEAN CLI folder(s) INSIDE the LEAN engine VS solution, as sub-project(s), you can then use your IDE to develop your algorithms prepped by LEAN CLI. Moreover, if you go to the extent of setting your IDE to interact with LEAN CLI, you can then start LEAN CLI commands from your IDE to push your algorithms to the Docker instance and/or your cloud account (of course, on top of having the possibility to simply run the project - after proper .json setup - to run your algo in the solution itself). You are then in front of something I find really powerful.
It sounds like you are doing it correctly. If I understand, you are coding locally, but the LEAN CLI takes your algorithm and runs it in the cloud. This should give you the same access to available data as you would have when coding in the web UI. Is this correct?
I think that I jumped right to doing it as though I was a hedge fund and needed a robust on-prem platform. To do this would require me to get a “Security Master subscription for $600/year” and also to buy data for each security that my universe selection subscribes to.
I need to figure out what my starting point is.
Can I use only free data? What can I do as a one-person operation? Will I be able to subscribe to a universe of stocks and analyze a rolling window of TradeBars to identify patterns of interest? What if I want to start with a coarse universe of thousands of securities and look for patterns? Is this available to me?
Any guidance is appreciated.
Eric
Fred Painchaud
😊
That's quite a few questions.
1- Thanks for coming back here to report what you have been told. I did not know about that, and I guess it explains why it did not work. I am surprised it worked for 2014, however. Just a side note: does it mean you were lucky enough to pick the only two days for which you had free data available locally?
2- I have written little code so far. I kinda hate doing something I don't understand at least to some extent before I start, so I've been wandering around in the LEAN source code much more than writing and running anything. And the little code I ran did not involve Universe Selection; I was always picking “SPY” manually. So I did not run into your error.
3- You can use LEAN CLI to generate random local data (in the Docker instance). It is generated from Brownian motion and is said to be realistic. I actually like the idea of backtesting on both real assets' past data and randomly generated data. I'm still unsure whether it helps, but time will tell. Anyway, my point here is that MAYBE you can generate random data that can be used for Universe Selection. I really don't know. Maybe, maybe not. From the testing I've done, it takes a while to generate… you'll see if you try it.
4- You are correct with LEAN CLI. You can develop locally and push your algorithms to your account for backtesting there. You will need to have an account that includes LEAN CLI. Even the cheaper option includes it.
5- Think twice, in my humble opinion, about a “robust on-prem platform”. From the poking around I've been doing, QC runs the live algorithms (live trading) on servers inside Equinix. I am not saying your hedge fund should run on a third party 😊 but from a robustness perspective, for managing, say, your personal portfolio, it would be difficult to beat any serious high-availability service such as Equinix.
6- I certainly did not go through the entire Internet 😊 but the “best” source of free data I've found is Yahoo Finance. You certainly don't get everything there, which is why I put “best” in quotes. Moreover, you need to scrape it. There are libraries for that in multiple languages, but be prepared for a cat-and-mouse game: you won't scrape gigabytes per hour like a walk in the park. Speaking of robustness… So I believe you will need to pay “for data”: not necessarily just for data, but for some subscription that gives you data AND other things, like the minimal subscription here, which includes a lot of data and an engine that looks good.
7- From other chats, I know people who, alone, run dozens of bots trading dozens of different cryptos 24/7. That's one example of what you can do as a single-person operation. It's not my cup of tea, but it's one example.
8- If you keep thousands of assets in your Universe and also want to look at, say, second/minute data, you will certainly hit a performance problem in your backtests. They will take hours and hours. I would highly recommend that you first think about something you want to try, usually called “some edge”, meaning a strategy that you think could do well, then implement it and test it on a Universe that makes sense for that strategy, with, say, a few dozen assets in it. I do believe that going the other way around, finding patterns in thousands of assets and trading those patterns, is a recipe for disappointment. But it is only my opinion.
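To put rough numbers on point 8, here is a back-of-the-envelope Python sketch of how many minute bars a backtest may have to process. It assumes a regular US equity session of 390 minutes and about 252 trading days per year, and it treats every symbol as trading every minute, so the result is an upper bound. The 6000 figure is the NumberOfSymbolsCoarse value from earlier in the thread; 36 is only an illustrative "few dozen".

```python
MINUTES_PER_SESSION = 390  # regular US equity session, 09:30 to 16:00 Eastern


def minute_bar_count(symbols: int, trading_days: int) -> int:
    """Upper bound on minute bars: assumes every symbol trades every minute."""
    return symbols * trading_days * MINUTES_PER_SESSION


# One year (about 252 trading days, an assumption) for two universe sizes.
large = minute_bar_count(symbols=6000, trading_days=252)
small = minute_bar_count(symbols=36, trading_days=252)

print(f"6000 symbols: {large:,} bars")  # 589,680,000
print(f"  36 symbols: {small:,} bars")  # 3,538,080
```

Under these assumptions, the smaller universe has roughly 167 times fewer bars to process for the same date range.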
My 2 cents worth…
Cheers,
F
Eric Schmidt
In the sample algorithms that come as a part of Lean-master.zip, they used these dates as the examples. For example in ..\Algorithm.Python\CoarseFineFundamentalComboAlgorithm.py you will see:
def Initialize(self):
    self.SetStartDate(2014,1,1)  # Set Start Date
    self.SetEndDate(2015,1,1)    # Set End Date
Based on Support telling me that just a couple of days are available, I assume that they provided the data for those dates.
Eric Schmidt
Thanks a lot for all of the detail. I think that I will plan around local code focused on a small list of securities and running through CLI. Seems like the best starting point.
Jared Broad
Check out this video, Eric Schmidt. It's pretty cheap to download (IMHO) compared to an hour of your time. If you only download 1 year of coarse data, it would be $12.60 for the coarse data and $600 for a 1-year subscription to the Security Master.
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.