Newbie question, I'm sure. Can't seem to find an explanation in the documentation.
I'm comparing the prices and volume for IBM stock between NASDAQ, Yahoo Finance, QuantConnect documentation, and QuantConnect execution on raw data. Here are my observations for IBM on Apr 18, 2019:
- NASDAQ says the close was 134.6455 and the volume was 12,513,600.
- Yahoo says the close was 132.99 and the volume was 13,101,882.
- QuantConnect documentation says the close was 139.08 and the volume was 6,605,559.
- QuantConnect execution on raw data (see attached backtest code) says the close was 139.11 and the volume was 12,183,741.
I understand that QuantConnect adjusts data by default. But when I switch the data normalization mode to raw, shouldn't I get prices and volumes that match NASDAQ and Yahoo Finance? Why am I getting four different sets of values, and which one is correct?
[Screenshots not preserved: NASDAQ, Yahoo, QuantConnect documentation, QuantConnect execution on raw data]
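To put the gaps in perspective, here is a minimal Python sketch that expresses each source's close and volume as a percentage difference from the QuantConnect raw backtest. The numbers are the ones quoted above; the `relative_gaps` helper and the choice of baseline are only illustrative assumptions, not anything QuantConnect provides.

```python
# Figures quoted in the post for IBM on Apr 18, 2019: (close, volume).
quotes = {
    "NASDAQ": (134.6455, 12_513_600),
    "Yahoo": (132.99, 13_101_882),
    "QuantConnect docs": (139.08, 6_605_559),
    "QuantConnect raw": (139.11, 12_183_741),
}


def relative_gaps(quotes, baseline):
    """Return {source: (close_gap_pct, volume_gap_pct)} relative to `baseline`."""
    base_close, base_volume = quotes[baseline]
    gaps = {}
    for source, (close, volume) in quotes.items():
        close_gap = 100.0 * (close - base_close) / base_close
        volume_gap = 100.0 * (volume - base_volume) / base_volume
        gaps[source] = (round(close_gap, 2), round(volume_gap, 2))
    return gaps


gaps = relative_gaps(quotes, baseline="QuantConnect raw")
for source, (close_gap, volume_gap) in gaps.items():
    print(f"{source:18s} close {close_gap:+6.2f}%  volume {volume_gap:+7.2f}%")
```

Against that baseline, the NASDAQ and Yahoo closes are a few percent lower, while the documentation's volume is roughly half the backtest's volume.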
Cole S
I would suggest setting DataNormalizationMode to TotalReturn and seeing what you get.
Jason Annable
Thank you Cole. I had read that article but I don't think it explains what I'm seeing. (More likely, I'm not understanding something properly.) I used TotalReturn and still get different numbers. Observation:
Mak K
Hi Jason,
I would suggest reading this:
https://www.quantconnect.com/docs/v2/our-platform/user-guides/datasets/misconceptions
Thanks!
Jason Annable
I don't think timestamp differences are the reason. And I've tried all the data normalization modes, but none of them match.
It's probably the cross-platform discrepancies, but I'm not 100% sure. It looks like Yahoo uses BATS but I'd be surprised if the NASDAQ source uses BATS too. And either way, it doesn't explain why there'd be a difference between all 3 sources. Plus, we're talking historical values, not real-time. Why would historical values - especially volume - be so different?
Ultimately, it's not stopping me from continuing since the graphs seem to still match what I'm seeing in TradingView. The values are slightly off all throughout though. (I'm aware that TradingView - at least the version that I'm paying for - is using BATS, and I'm able to match it against Yahoo data, which is what I'd expect.)
Fred Painchaud
Hi Jason,
There are differences between all 3 sources because each of them draws on a different underlying data feed. Across feeds, the metric that varies the most is volume, as different feeds DO NOT see ALL volume for ALL assets. Some feeds see as little as a fraction of a percent of the entire volume for some types of assets.
Here it is for instance for BATS/CBOE:
https://www.cboe.com/us/equities/overview/
I'm sure you understand what it means when you use that kind of data to trade, like in TV with the free/included data (instead of the paid data add-ons)…
Fred
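A toy Python sketch of the point about feeds seeing different slices of the market: the same day of trades is summarized once from all venues and once from a single venue. The trades, the `Trade` type, and the `daily_summary` helper are all hypothetical, and taking the last visible trade as the "close" is a simplification (real closing prices involve auctions and vendor-specific rules).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    venue: str
    price: float
    size: int


# Hypothetical trades for one symbol on one day, in time order.
trades = [
    Trade("NYSE", 100.00, 500),
    Trade("NASDAQ", 100.02, 300),
    Trade("BATS", 100.01, 100),
    Trade("NYSE", 100.05, 400),
    Trade("BATS", 100.03, 200),
    Trade("NASDAQ", 100.04, 500),
]


def daily_summary(trades, visible_venues):
    """Summarize only the trades a feed can see; None if it sees nothing."""
    seen = [t for t in trades if t.venue in visible_venues]
    if not seen:
        return None
    # Simplification: treat the last visible trade as the close.
    return {"close": seen[-1].price, "volume": sum(t.size for t in seen)}


consolidated = daily_summary(trades, {"NYSE", "NASDAQ", "BATS"})
bats_only = daily_summary(trades, {"BATS"})

print("consolidated:", consolidated)
print("BATS only:   ", bats_only)
```

Both summaries are "correct" for the trades each feed sees, yet neither the close nor the volume agree.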
Jason Annable
Thank you Fred. That is helpful.
I tried reading up on the data provider for QuantConnect to assure myself that the data is reliable. However, the page that talks about it has a bunch of dead links. And when I search for QuantQuote, it seems to be dead. Not sure what replaced it. Does anyone know who the data provider is and where I can read up for more information? I'm also wondering if different pricing tiers result in different data. (I seem to recall TradingView's pricing model worked like this.)
(Side rant: I find I'm often frustrated with QuantConnect's documentation style. It often seems a little too thin, outdated, inconsistent, or unsearchable.)
Mak K
Hi Jason,
I can't guarantee this 100%, but I suspect that it is AlgoSeek:
https://www.quantconnect.com/datasets/algoseek-us-equities
And as far as I know, the free and other tiers receive the same data.
Also, thank you for your patience with the documentation. It is currently being reworked, and hopefully that will be complete by early 2022!
Thanks!
Fred Painchaud
Hi Jason,
You can check out https://www.quantconnect.com/datasets for an overview of datasets (both free and paid) used by QC.
Re the documentation, it is being worked on right now to publish a second, updated version. I'm pretty sure it will still be imperfect. But it should be better.
Fred
Jared Broad
Hi Jason Annable, it's impossible to know what causes the differences from the other public sources you posted (NASDAQ and Yahoo), as they don't publish precisely how they calculate their volume figures. However, we're working on publishing our method to give some clarity for users who are curious.
TL;DR: it's not a simple process, and it requires filtering some trades which are reported late or traded off the exchanges. We prioritize a backtest that looks like live-trading above matching third-party sources.
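As a rough illustration of why filtering changes both price and volume, here is a Python sketch that builds a daily bar from ticks with and without dropping late-reported and off-exchange trades. This is not QuantConnect's actual method (which had not been published at the time of this post); the `Tick` fields, the `build_bar` helper, and the data are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Tick:
    price: float
    size: int
    reported_late: bool = False   # hypothetical flag
    off_exchange: bool = False    # hypothetical flag


@dataclass(frozen=True)
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: int


def build_bar(ticks: Iterable[Tick],
              drop_late: bool = True,
              drop_off_exchange: bool = True) -> Optional[Bar]:
    """Aggregate time-ordered ticks into one bar, optionally filtering."""
    kept = [
        t for t in ticks
        if not (drop_late and t.reported_late)
        and not (drop_off_exchange and t.off_exchange)
    ]
    if not kept:
        return None
    prices = [t.price for t in kept]
    return Bar(prices[0], max(prices), min(prices), prices[-1],
               sum(t.size for t in kept))


# Hypothetical ticks in the order they arrive.
ticks = [
    Tick(50.00, 1000),
    Tick(50.40, 2000, off_exchange=True),
    Tick(49.90, 1500),
    Tick(50.10, 500),
    Tick(49.50, 3000, reported_late=True),
]

filtered = build_bar(ticks)
unfiltered = build_bar(ticks, drop_late=False, drop_off_exchange=False)
print("filtered:  ", filtered)
print("unfiltered:", unfiltered)
```

In this made-up example, the late print moves the close and the low, and the two excluded trades account for most of the volume, so two vendors with different filters would publish different bars from the same tape.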
See these posts and comments for more detailed charts and breakdowns on bar creation:
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Jason Annable
Thank you Jared. That is very helpful. I also just happened to read something in a book that addresses my concern quite well. The link is to content from Anna Coulling's book.
Fred Painchaud
Hi Jason,
Thanks for sharing those two pages. I enjoyed reading them.
The “poor Anna Coulling” (in the sense that I have empathy for her) will have fully beaten her demon once she also realizes that:
1- Volume feeds, and in fact all data feeds, are not only inaccurate but also inconsistent.
2- Even though they are inconsistent, the “art of trading and VPA”, as she calls it (it made me smile since her very text proves it is much more a science to her than an art), is not even impacted by inconsistencies - contrary to her fear - as long as those inconsistencies don't become statistically significant. This “art” is a statistical art, and thus it is not about perfect accuracy (wrt the reality - the “truth”) - she got that one - but also not even about perfect consistency. Statistics are designed to deal with inaccuracies and inconsistencies.
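A small Python sketch of the second point, under stated assumptions: two hypothetical feeds report very different absolute volumes, and the partial feed sees a fraction of the total that wobbles from day to day, yet a simple relative-volume rule flags the same days on both. The data, the 1.5x threshold, and the `high_volume_days` helper are made up for illustration.

```python
from statistics import mean

# Hypothetical daily volumes for the same symbol from two feeds.
# The partial feed sees roughly 10-14% of the full volume, not a fixed share.
full_feed = [100_000, 110_000, 95_000, 240_000, 105_000, 90_000, 260_000, 100_000]
partial_feed = [12_000, 11_000, 13_300, 26_400, 13_650, 10_800, 26_000, 14_000]


def high_volume_days(volumes, threshold=1.5):
    """Indices of days whose volume exceeds `threshold` times the average."""
    average = mean(volumes)
    return [i for i, v in enumerate(volumes) if v / average > threshold]


print("full feed flags:   ", high_volume_days(full_feed))
print("partial feed flags:", high_volume_days(partial_feed))
```

The rule only looks at volume relative to each feed's own average, so the inconsistency between feeds does not change the signal here. If the visible fraction swung widely enough, the flags would start to diverge, which is the "statistically significant" case.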
Anecdote: I've seen many totally mathematically inconsistent trading strategies that are still profitable in the long term simply because they happen to still be statistically reasonable enough. Oftentimes, the authors are not even aware that their strategies make no sense mathematically speaking, and many even go “well, it has been and is still profitable so I am happy” when made aware of it.
Fred