Machine Learning: Runtime Error: Algorithm took longer than 10 minutes on a single time loop

I'm a bit of a newbie, so apologies if this is a silly question. I have a live deployment of a crypto machine learning algo, and it throws the error below when I try to trigger the algo's training function.

Runtime Error: Algorithm took longer than 10 minutes on a single time loop. CurrentTimeStepElapsed: 0.0 minutes
Stack Trace: System.TimeoutException: Algorithm took longer than 10 minutes on a single time loop. CurrentTimeStepElapsed: 0.0 minutes
  at QuantConnect.Isolator.MonitorTask (System.Threading.Tasks.Task task, System.TimeSpan timeSpan, System.Func`1[TResult] withinCustomLimits, System.Int64 memoryCap, System.Int32 sleepIntervalMillis) [0x002d3] in :0
  at QuantConnect.Isolator.ExecuteWithTimeLimit (System.TimeSpan timeSpan, System.Func`1[TResult] withinCustomLimits, System.Action codeBlock, System.Int64 memoryCap, System.Int32 sleepIntervalMillis, QuantConnect.Util.WorkerThread workerThread) [0x00092] in :0
  at QuantConnect.Lean.Engine.Engine.Run (QuantConnect.Packets.AlgorithmNodePacket job, QuantConnect.Lean.Engine.AlgorithmManager manager, System.String assemblyPath, QuantConnect.Util.WorkerThread workerThread) [0x009f0] in :0

User: 108607, Project: 5001223, Algorithm: L-886773484792a4b50dd08fc926320cd6

The code in question is triggered at 3 AM:

def NeuralNetworkTraining(self):
        '''Train the Neural Network and save the model in the ObjectStore'''        
        symbols = list(self.modelBySymbol.keys())
        
        if len(symbols) == 0: 
            self.Debug("no contracts found")
            return 
        
        for symbol in symbols:
            try: 
                # Hourly historical data is used to train the machine learning model
                history = self.History(symbol, (self.lookback + self.timesteps), Resolution.Hour)
                self.Debug(history)
            except Exception as e: 
                # Skip this symbol; otherwise `history` is undefined (or stale) below
                self.Debug("Failed to receive history for {0}: {1}".format(symbol, e))
                continue
            #history = self.x_scaler.fit_transform(history)

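            if 'open' in history and 'close' in history and 'high' in history and 'low' in history: 
                history = np.column_stack((history['open'], history['close'], history['high'], history['low']))
                #history = np.column_stack((history['open']))
            else:
                # Without all four OHLC columns the reshape to 4 features below cannot work
                self.Debug("History for {0} is missing OHLC columns".format(symbol))
                continue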

            if len(history) < self.lookback: 
                self.Debug("Error while collecting the training data")
                continue
            
            #history = list([i[0] for i in history])

            self.Debug("Start Training for symbol {0}".format(symbol))
            
            # First convert the data into a 3D array of shape (train samples, timesteps, 4 features)
            x_train = []
            y_train = []
            for i in range(self.timesteps, len(history)): 
                x_train.append(history[i - self.timesteps:i])
                y_train.append([history[i][0]])
            
            x_train, y_train = np.array(x_train), np.array(y_train)
            x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 4))
            y_train = np.reshape(y_train, (y_train.shape[0], 1))
            if np.any(np.isnan(x_train)): 
                self.Debug("Error in Training Data")
                continue
            if np.any(np.isnan(y_train)): 
                self.Debug("Error in Validation Data")
                continue
            
            x_scaler = MinMaxScaler(feature_range=(0, 1))
            y_scaler = MinMaxScaler(feature_range=(0, 1))

            
            x_train = x_scaler.fit_transform(x_train.reshape(-1, x_train.shape[-1])).reshape(x_train.shape)
            #x_train = self.x_scaler.fit_transform(x_train)
            y_train = y_scaler.fit_transform(y_train)
            
            
            #self.Debug(x_train.shape)
            
            #self.Debug(y_train.shape)
            #self.Debug(y_train)
            # build a neural network from the 1st layer to the last layer
            '''
            model = Sequential()

            model.add(Dense(10, input_dim = 1))
            model.add(Activation('relu'))
            model.add(Dense(1))

            sgd = SGD(lr = 0.01)   # learning rate = 0.01

            # choose loss function and optimizing method
            model.compile(loss='mse', optimizer=sgd)
            '''
            if symbol in self.modelBySymbol and self.modelBySymbol[symbol] is not None: 
                model = self.modelBySymbol[symbol]
                iterations = 1
            else: 
                #If Model not exist for symbol then create one
                opt_cells = 5
                model = Sequential()
                
                model.add(LSTM(units = opt_cells, return_sequences = True, input_shape = (x_train.shape[1], 4)))
                
                model.add(Dropout(0.2))
                
                model.add(LSTM(units = opt_cells, return_sequences = True))
                model.add(Dropout(0.2))
                
                model.add(LSTM(units = opt_cells, return_sequences = True))
                model.add(Dropout(0.2))
                
                model.add(LSTM(units = opt_cells, return_sequences = False))
                model.add(Dropout(0.2))
                
                model.add(Dense(1, activation='linear'))
        
                adam = Adam(lr=0.001, clipnorm=1.0)
                model.compile(loss='mean_squared_error', optimizer=adam, metrics=['accuracy'])
                
                iterations = 50
            # pick an iteration number large enough for convergence 
            for step in range(iterations):
                # training the model
                #cost = model.train_on_batch(predictor, predictand)
                hist = model.fit(x_train, y_train,  epochs = 1) #verbose=0,
                acc = list(hist.history['accuracy'])[-1]
                loss = list(hist.history['loss'])[-1]

            
            self.scalersBySymbol[symbol] = (x_scaler, y_scaler)
            self.modelBySymbol[symbol] = model
            self.Debug("End Training for symbol {0} with accuracy {1}".format(symbol, acc))

According to the log, my algo pulls the data and then starts the training. The runtime error is then generated.

Is this simply because I'm on the $20/month plan with limited ML time? Do I need to move the algo to LEAN deployed on a beefier server?

Appreciate your time.


How often is this function being called? For the $20/mo plan, there's a leaky bucket allocation of 60 min + 10 min/24 hrs. Based on the number of iterations, training this model every week during live deployment should be okay - any more frequently and you'll likely hit the limit. Keep in mind though that training resources are shared across all calls to scheduled functions.
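To make the arithmetic concrete, here is a minimal sketch of a leaky bucket quota. It assumes the allocation above means a bucket that holds at most 60 minutes and refills by 10 minutes every 24 hours; that reading, the function name `simulate_training_quota`, and the 25 minute session length are all illustrative assumptions, not QuantConnect's actual accounting.

```python
def simulate_training_quota(session_minutes, interval_days, horizon_days,
                            capacity=60.0, refill_per_day=10.0):
    """Return the first day a training session cannot be paid for, or None.

    Hypothetical model: the bucket starts full, gains `refill_per_day`
    minutes each day (capped at `capacity`), and each training session
    spends `session_minutes` from it.
    """
    balance = capacity
    for day in range(horizon_days):
        if day > 0:
            # Daily refill, never exceeding the bucket capacity
            balance = min(capacity, balance + refill_per_day)
        if day % interval_days == 0:
            if balance < session_minutes:
                return day
            balance -= session_minutes
    return None


# Assumed 25 minute training session, checked over 60 days
nightly = simulate_training_quota(25, interval_days=1, horizon_days=60)
weekly = simulate_training_quota(25, interval_days=7, horizon_days=60)
print("nightly runs out on day:", nightly)
print("weekly runs out on day:", weekly)
```

Under these assumptions, nightly training drains the bucket within a few days, while weekly training always finds it refilled. Other scheduled training calls would draw from the same balance.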

That being said, Jared mentioned there are some exciting changes today or tomorrow to make cloud resources more scalable which should get rid of these limits.

Thanks Adam. Was calling it every night, but will dial it back to once/week.

Look forward to seeing what Jared announces, wasn't aware this was coming up.

Adam W

Ryan McMullan

Jared Broad (STAFF)

=D It's very exciting for us too! We're working on it. We're pushing for this week.

@Ryan are you using a Train() method? It looks like you're just using a scheduled event. Please attach the code, or representative code so we can give you better assistance.

Great news Jared, looking forward to the announcement

I was calling my ML training function (the code in the original post) through the Train method.

Would it be more efficient to actually place it inside the Train() method?

self.Train(self.DateRules.Every(DayOfWeek.Friday), self.TimeRules.At(3, 0), self.NeuralNetworkTraining)

Shile Wen

Hi Ryan,

self.Train is just as efficient as other methods, but the main reason to do it is so that the engine knows to allow a certain method to take longer than 10 minutes to finish its computations.

For more details, please read this page.

Best,
Shile Wen
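Since the limit in question is wall-clock time, it may help to log how long the training function actually takes before deciding how to schedule it. The following is a minimal sketch in plain Python; the `timed` decorator and `fake_training` function are hypothetical names used only for illustration, and the logger is a simple list rather than anything from the platform.

```python
import functools
import time


def timed(log):
    """Wrap a function so that its wall-clock duration is passed to `log`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Report the duration even if the wrapped function raises
                minutes = (time.perf_counter() - start) / 60.0
                log("{0} took {1:.2f} minutes".format(func.__name__, minutes))
        return wrapper
    return decorator


messages = []


@timed(messages.append)
def fake_training():
    # Stand-in for a real training routine
    time.sleep(0.01)
    return "done"


result = fake_training()
print(messages[0])
```

Inside an algorithm, the same idea could report through self.Debug instead of a list, so the training duration shows up in the live log next to the existing "Start Training" and "End Training" messages.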

Thanks Shile for your answer.

Looking at my code and the documentation, it appears I'm calling it correctly. I'll wait to see what happens with the scalability of the QC hardware options.

Ryan McMullan INVESTOR
