Hi,
I'm running into the same issue again: the same algorithm produces drastically different results. I noticed there's a thread here saying the data were amended.
Has any other data been fixed, such that the same algorithm now produces two different results?
Also, is this something we need to keep an eye on? As an independent algo trader, I rely heavily on the quality of the data to develop strategies efficiently. This kind of data issue could sabotage our work, because we might trade a strategy that is supposed to be profitable when, in the end, it is not.
Thank you!
Michael Hsia
Attaching the backtest result here for reference:
Result run on Sep 29th
Result run on Oct 6th
Nguyen Huu Quan
+1. I also started seeing different results for my old backtests a few days ago.
Nguyen Huu Quan
This is one example. Before:
Nguyen Huu Quan
After
Alexandre Catarino
Hi Michael and Nguyen,
Thank you for the report, and sorry about the wait.
Since the shared backtests do not include the orders, could you please attach them (e.g., via a Dropbox link)? We would like to compare them to understand the problem. In principle, it's not caused by a change in Lean. However, if you could also provide the Lean version, it could help. The Lean version is found in the Logs Tab:
We have made a Sampling and Statistics Update that might explain @nguyen-huu-quan's case since the statistics are similar.
Additionally, we have already checked all the nodes, and all of them have the corporate events data (splits, dividends, and symbol changes); missing corporate events data could otherwise have been the source of the problem.
Best regards,
Alex
Nguyen Huu Quan
Thanks Alex. My “Before” backtest was run with: Launching analysis for bc759a793fabe581faf0b21cbf31adf8 with LEAN Engine v2.5.0.0.12841. My “After” was run with: Launching analysis for 32806ba5d4ae277479629658b3950bae with LEAN Engine v2.5.0.0.13032
Here are the before trades:
And After trades:
Nguyen Huu Quan
What I noticed is that the differences are caused by different prices being used to fill the trades for certain symbols. It looks like that price is calculated from trade bars, so maybe something has changed in the trade data or in the way the fill price is calculated.
Alexandre Catarino
Hi Nguyen Huu Quan ,
I performed a quick file comparison, and I couldn't spot any major differences. These findings are consistent with the backtest results, which are only slightly different.
The differences are explained by dividend payments (MOT, PHM, AHC, etc.) that occurred between the two executions, which are about one month apart (Lean v2.5.0.0.12847 was released on Sept 7th, and the "Before" backtest ran on v2.5.0.0.12841). Those dividends adjusted the historical prices. Given that the minimum lot size is 1, the order volume is slightly lower, e.g.:
PHM had a $0.14 per share dividend payment on Oct 5th.
Nguyen Huu Quan
Thanks, Alexandre Catarino. Shouldn't the backtest use the unadjusted price, or the price adjusted only up to the point of the fill? At the time of that fill, we should not have any knowledge of the dividend payment on Oct 5th.
The main challenge I face with this difference is that prices will change once in a while due to dividend adjustments, so my backtest results will change too. That makes it a bit hard to compare backtests and see whether a change improves or worsens the result. Compounded over a long period, the difference will also become more visible.
Michael Hsia
Hi Alex,
Thanks for helping out. Here are my order lists of these two backtests.
LEAN Engine v2.5.0.0.12987
LEAN Engine v2.5.0.0.13032
Michael Hsia
Hi Alexandre Catarino
Can any further insight be extracted from the documents that I attached? It would be helpful if you could point something out.
Appreciated.
Alexandre Catarino
Hi Michael,
Sorry about the wait.
Unfortunately, your case is not as simple as Nguyen's. If we compare the orders using any diff tool (I used the Notepad++ Compare Plugin), on the first day, Jan 7th, 2000, only two trades in 40 are different:
However, the old version trades about 20 securities from Jan 8th, 2000 to Jan 10th, 2000 that the new version does not. It could be due to the influence of the different trades on the first day. These sorts of changes are hard to explain based on the orders alone because, if there is some sort of filtering/ranking, the orders will not show us the data differences that conditioned the selection.
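The comparison described above can also be scripted. Here is a minimal sketch, assuming each order has been reduced to a (date, symbol, quantity) tuple; the sample orders and the function names are hypothetical and are not taken from the attached files. It reports the orders that appear in only one backtest, grouped by day, and the first day on which the two runs diverge:

```python
from collections import defaultdict

# Hypothetical orders as (date, symbol, quantity); not from the attached files.
old_orders = [
    ("2000-01-07", "AAA", 100),
    ("2000-01-07", "BBB", 50),
    ("2000-01-10", "CCC", 30),
]
new_orders = [
    ("2000-01-07", "AAA", 100),
    ("2000-01-07", "BBB", 48),
]

def diff_orders_by_day(old, new):
    """Return, per date, the orders found in only one of the two backtests."""
    old_set, new_set = set(old), set(new)
    report = defaultdict(lambda: {"only_old": [], "only_new": []})
    for order in sorted(old_set - new_set):
        report[order[0]]["only_old"].append(order)
    for order in sorted(new_set - old_set):
        report[order[0]]["only_new"].append(order)
    return dict(report)

def first_divergence(old, new):
    """Earliest date with any difference, or None if the order lists match."""
    report = diff_orders_by_day(old, new)
    return min(report) if report else None

for day, changes in sorted(diff_orders_by_day(old_orders, new_orders).items()):
    print(day, changes)
```

Finding the first divergent day matters because every later difference may simply be a consequence of it.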
In the original post, you have asked:
“Is this a thing that we need to keep an eye on as well?”
QuantConnect does provide quality data since we work with a high-quality data vendor, AlgoSeek. On top of AlgoSeek's work, QuantConnect has a Support Team that works closely with the Data Team that will address data issues reported by a community of thousands of members (it is a lot of people looking at the data!). Unfortunately, as we all know, data curation is not a trivial task. So, I would say “yes, we need to keep an eye on the data as well” and watch the signs. For example, the old version generated 7895 invalid orders out of 8153, and it could have been a sign of data issues.
Hi Nguyen Huu Quan ,
You can use unadjusted prices (DataNormalizationMode.Raw) if that's the best option for your algorithm, for example, if it doesn't use indicators. On the other hand, if the best option is the default behavior (DataNormalizationMode.Adjusted), then the whole price series will be adjusted.
In your case, the difference in the results should not trouble you. The algorithm places the same trades, thus it makes the same predictions. It only trades different volumes. If we change the starting cash from $100,000 to $200,000 or $50,000, we will see the same trades, but different volumes, which doesn't mean that the results are “different”, at least, not fundamentally different. In any case, it's not a “data issue”.
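To illustrate the point about starting cash, here is a minimal sketch with hypothetical symbols, weights, and prices. The sizing rule (a target weight converted to whole shares) is an assumption for illustration, not the logic of the algorithm discussed in this thread:

```python
import math

# Hypothetical signals: (symbol, target portfolio weight, fill price).
signals = [("AAA", 0.10, 37.25), ("BBB", 0.05, 112.40)]

def plan_orders(cash, signals):
    """Convert target weights into whole-share quantities (minimum lot size 1)."""
    return {symbol: math.floor(cash * weight / price)
            for symbol, weight, price in signals}

for cash in (50_000, 100_000, 200_000):
    print(cash, plan_orders(cash, signals))
```

The set of symbols traded is identical at every cash level; only the quantities change. A small shift in the fill price has the same kind of effect on the share count.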
Michael Hsia
Hi Alexandre Catarino
I'm wondering whether there is anything else I can provide to help you find the root cause. I do understand that the adjusted prices or the dividends paid could cause a slight change in the numbers in the backtest result. But such big differences between these two backtest results confuse me, and I'm starting to doubt the work that I have completed so far. It would be a relief if you could help me solve this so that the question won't haunt me at night.
Again, thank you so much for spending time with me on this. :)
Nguyen Huu Quan
Thanks, Alexandre Catarino. My main algo uses absolute price in some of its logic and is heavily affected. The attached one is just for illustration. I found a different way to avoid this issue, so it's OK now.
Michael Hsia, it may be worth checking whether your algo's logic uses absolute price instead of return. For example, an algo that selects the top 10 stocks with the highest absolute price would likely be affected.
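Here is a minimal sketch of why that matters, using hypothetical prices and a hypothetical adjustment factor of 0.99 applied to one symbol. Scaling a whole price history by one factor can change a ranking by absolute price, while a ranking by return stays the same because the factor cancels out:

```python
# Hypothetical two-day price histories per symbol.
raw = {"AAA": [50.0, 51.0], "BBB": [50.2, 50.7], "CCC": [30.0, 33.0]}

def adjust(series, factor):
    """Scale an entire price history by one adjustment factor."""
    return [price * factor for price in series]

def top_by_price(prices, n):
    """Rank symbols by their latest absolute price, highest first."""
    return sorted(prices, key=lambda s: prices[s][-1], reverse=True)[:n]

def top_by_return(prices, n):
    """Rank symbols by return over the window, highest first."""
    return sorted(prices, key=lambda s: prices[s][-1] / prices[s][0],
                  reverse=True)[:n]

# Assume a later dividend re-adjusts the history of AAA only.
adjusted = dict(raw, AAA=adjust(raw["AAA"], 0.99))

print(top_by_price(raw, 1), top_by_price(adjusted, 1))
print(top_by_return(raw, 2), top_by_return(adjusted, 2))
```

In this example the top stock by price changes from AAA to BBB after the adjustment, while the top two by return are CCC and AAA in both cases.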
Michael Hsia
Nguyen Huu Quan
Thanks, in my script I don't use absolute price while selecting stocks.
My question is this: since nothing has changed code-wise, a few price-adjustment events shouldn't impact the performance of the entire portfolio between the backtest StartTime & EndTime.
Jared Broad
“a few price-adjusted events shouldn't impact the performance of the entire portfolio between the backtest StartTime & EndTime”
This is not correct; adjusted historical prices change when there's a new split or dividend. They are always the current price of the asset “now” and will have different prices on historical moments. These historical prices change as new dividends arrive in the “now”. If the model is sensitive to small price changes please use raw pricing.
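As a rough illustration of this mechanism, here is a minimal sketch with hypothetical prices and dates. The factor formula below is one common back-adjustment convention and is an assumption here, not necessarily the exact computation LEAN uses:

```python
def dividend_factor(dividend, close_before_ex_date):
    """Assumed convention: scale earlier prices by (close - dividend) / close."""
    return 1.0 - dividend / close_before_ex_date

def back_adjust(history, ex_date, factor):
    """Scale every price dated before ex_date; later prices stay as they are."""
    return {date: (price * factor if date < ex_date else price)
            for date, price in history.items()}

# Hypothetical raw closes keyed by ISO date (ISO strings sort chronologically).
history = {"2020-09-01": 27.50, "2020-10-02": 28.00, "2020-10-05": 27.90}

# A $0.14 dividend with a hypothetical close of $28.00 before the ex-date.
factor = dividend_factor(0.14, 28.00)
adjusted = back_adjust(history, "2020-10-05", factor)

for date in sorted(history):
    print(date, history[date], round(adjusted[date], 4))
```

A backtest run before the dividend sees the first set of prices for the earlier dates, and a backtest run after it sees the second set, even though the code and the date range are unchanged.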
I recommend both of you send in a support ticket (support@quantconnect.com or ideally via the support system). You likely don't have the same issue and discussing code issues with screenshots is not constructive. Support tickets have code timestamped and allow us to ensure no code changes have happened to cause differences. The most common cause for backtest changes is people changing the code! 😊
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.