I am training an LSTM model (the architecture is very simple). For whatever reason, seemingly at random and usually after 15 or so epochs, I get a message that the kernel has died and needs to restart.
So this isn't due to inactivity.
I have an R4 instance type.
What is the problem?
And when will TensorFlow 2.4 be available?
Jared Broad
It's always down to one of two things: out of memory, or TensorFlow explosion. Avoid posting +1's/bumps, please. The forum is supported by the community and responses can take 2-7 days.
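To tell which of the two it is, it can help to log the process's peak memory after every epoch and see whether it keeps climbing before the kernel dies. The following is only a minimal sketch using the standard-library `resource` module (Unix only); `train_one_epoch` is a hypothetical stand-in for the real training step, not part of any library.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def train_one_epoch(history):
    # Hypothetical stand-in for a real epoch: it just holds on to more memory.
    history.append([0.0] * 200_000)


history = []
readings = []
for epoch in range(5):
    train_one_epoch(history)
    readings.append(peak_rss_mb())
    print(f"epoch {epoch}: peak RSS {readings[-1]:.1f} MiB")
```

With Keras, the same reading could be taken from a callback's `on_epoch_end` hook, for example through `tf.keras.callbacks.LambdaCallback`. If the number approaches the instance's RAM shortly before the crash, memory is the likely culprit.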
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Nathan Hanks
When you say "TensorFlow explosion", are you referring to gradient explosion? If I'm not seeing that in the loss calculations by epoch, are you saying that I need a larger instance?
Shile Wen
Hi Nathan,
TensorFlow explosion would be gradient explosion. Although the loss may give reasonable values, if the weights are on the edge of being too small or too large, a slight nudge can trigger numerical instability and cause overflow/underflow. Here are some suggestions to address the out-of-memory issue:
Also, please keep in mind that if the model causes memory issues, then there could be overfitting.
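On the gradient explosion side, a common guard is to clip gradients by their norm and to stop as soon as a non-finite value shows up. Below is a rough, library-free sketch of that idea; the function names are made up for illustration, and the flat list of floats stands in for real gradient tensors.

```python
import math


def global_norm(grads):
    """Euclidean norm of a flat list of gradient values."""
    return math.sqrt(sum(g * g for g in grads))


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their norm is at most max_norm; fail on NaN/inf."""
    norm = global_norm(grads)
    if not math.isfinite(norm):
        # An overflow or NaN has already happened; clipping cannot repair it.
        raise FloatingPointError("non-finite gradient norm")
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


# The norm of [3, 4] is 5, so every component is scaled by 1/5.
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
print(clipped, global_norm(clipped))
```

In Keras, something similar is available without custom code: optimizers accept a `clipnorm` argument, and the `tf.keras.callbacks.TerminateOnNaN` callback stops training when the loss becomes NaN.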
Best,
Shile Wen
Valery T
Hi Shile,
>>if the model causes memory issues, then there could be overfitting
Could you explain?
Shile Wen
Hi Valery,
If the model needs so many resources that training causes memory issues, then the model could be overly complex, meaning it is heavily affected by slight noise. Given that the markets are very noisy, we'd prefer a biased model (see Bias-Variance Tradeoff) to capture general movements.
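One rough way to gauge model complexity is to count trainable parameters. For a standard LSTM layer with a bias term, each of the four gates has an input kernel, a recurrent kernel, and a bias, which gives the count sketched below. The layer sizes are made-up examples, and the memory figure assumes float32 weights only, ignoring optimizer state and activations.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters in one standard LSTM layer with bias."""
    # Four gates, each with an input kernel (input_dim x units),
    # a recurrent kernel (units x units), and a bias vector (units).
    return 4 * (units * (input_dim + units) + units)


def weights_mib(param_count, bytes_per_param=4):
    """Approximate size of the weights alone, assuming float32 by default."""
    return param_count * bytes_per_param / (1024 * 1024)


# Hypothetical layer sizes, chosen only to show how quickly the count grows.
small = lstm_param_count(input_dim=10, units=32)
large = lstm_param_count(input_dim=10, units=512)
print(small, large, f"{weights_mib(large):.2f} MiB")
```

The count grows roughly with the square of the number of units, so cutting the layer width is usually the quickest way to reduce both memory use and variance.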
Best,
Shile Wen
Nathan Hanks