Machine Learning
Key Concepts
Supported Libraries
LEAN supports several machine learning libraries. You can import these packages and use them in your algorithms.
Name | Version | Language | Import Statement
---|---|---|---
TensorFlow | 2.16.1 | Python | import tensorflow
scikit-learn | 1.4.2 | Python | import sklearn
PyTorch | 2.2.1 | Python | import torch
Keras | 3.3.3 | Python | import keras
gplearn | 0.4.2 | Python | import gplearn
hmmlearn | 0.3.2 | Python | import hmmlearn
tsfresh | 0.20.2 | Python | import tsfresh
Stable-Baselines3 | 2.3.2 | Python | from stable_baselines3 import *
fastai | 2.7.14 | Python | import fastai
DEAP | 1.4.1 | Python | import deap
XGBoost | 2.0.3 | Python | import xgboost
mlfinlab | 1.6.0 | Python | import mlfinlab
Accord | 3.6.0 | C# | using Accord.MachineLearning;
Add New Libraries
To request a new library, contact us. We will add the library to the queue for review and deployment. Since the libraries run on our servers, we need to ensure they are secure and won't cause harm. The process of adding new libraries takes 2-4 weeks to complete. View the list of libraries currently under review on the Issues list of the Lean GitHub repository.
Save Models
After you train a model, you can save it into the Object Store. In QuantConnect Cloud, we back up your Object Store data on QuantConnect servers. In local algorithms, your local machine saves the Object Store data. If you save models in live algorithms, save them at the end of the training method so you can access the trained model again if your algorithm stops executing. If you save models in backtests, save them during the on_end_of_algorithm event handler so that saving multiple times doesn't slow down your backtest.
To view examples of storing library-specific models, see Popular Libraries.
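As a minimal sketch of the save step described above, the following serializes a trained model to bytes with pickle and writes it once, at the end of training. The InMemoryStore class, its method names, the "tiny_model" key, and the mean-of-targets "model" are all hypothetical stand-ins so the example runs anywhere; they are not LEAN's Object Store API, so refer to the Object Store documentation for the actual calls.

```python
import pickle


class InMemoryStore:
    """Hypothetical key -> bytes store (NOT the LEAN Object Store API)."""

    def __init__(self):
        self._data = {}

    def save_bytes(self, key, payload):
        self._data[key] = bytes(payload)

    def read_bytes(self, key):
        return self._data[key]


def train(targets):
    """Hypothetical 'training': the fitted model is just the mean of the targets."""
    return {"mean": sum(targets) / len(targets)}


def train_and_save(store, key, targets):
    """Train the model, then save it a single time at the end of training."""
    model = train(targets)
    store.save_bytes(key, pickle.dumps(model))
    return model


store = InMemoryStore()
trained = train_and_save(store, "tiny_model", [1.0, 2.0, 3.0])

# Reading the bytes back and unpickling restores an equal model.
restored = pickle.loads(store.read_bytes("tiny_model"))
```

Any picklable model object can take the place of the dictionary here. Saving once after training, rather than inside a loop, is what keeps the write from slowing down a backtest.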
Load Models
You can load machine learning models from the Object Store or a custom data file like pickle. If you load models from the Object Store, check in the initialize method whether the Object Store already contains the model before you load it into your algorithm. To avoid look-ahead bias in backtests, don't train your model on the same data you use to test the model.
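The pattern above (check for a stored model before loading, and keep training data separate from test data) can be sketched as follows. This is an illustration under stated assumptions: a plain dict stands in for the Object Store, and the "mean_model" key, the helper names, and the mean-of-rows "model" are hypothetical.

```python
import pickle


def chronological_split(rows, train_fraction=0.75):
    """Split time-ordered rows so every test row comes after every training row."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def load_or_train(store, key, rows):
    """Return (model, test_rows, loaded_from_store)."""
    train_rows, test_rows = chronological_split(rows)
    if key in store:
        # The store already holds a model, so load it instead of retraining.
        return pickle.loads(store[key]), test_rows, True
    # Hypothetical 'training' that only ever sees the earlier rows.
    model = {"mean": sum(train_rows) / len(train_rows)}
    store[key] = pickle.dumps(model)
    return model, test_rows, False


store = {}  # hypothetical stand-in for the Object Store: key -> bytes
rows = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # assumed to be in time order

first_model, test_rows, first_loaded = load_or_train(store, "mean_model", rows)
second_model, _, second_loaded = load_or_train(store, "mean_model", rows)
```

The first call trains and stores the model; the second call finds the key and loads it. Splitting by position rather than shuffling keeps later observations out of the training set.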
Library Errors
LEAN gracefully handles runtime errors that your code raises. It generally reports the location and cause of the error before it terminates the algorithm.
Some errors, however, are critical failures from third-party libraries, such as machine learning models, that LEAN doesn't handle. Most QuantConnect members experience the invalid memory access error or the division by zero error, marked as "7 Killed" and "8 Killed" in the error message, respectively. The following sections explain some common errors and what causes them.
Memory Issues
You may experience the following memory issues:
- Out of Memory (OOM): The process requires more memory (RAM or GPU memory) than is available, leading to a crash.
- Memory Leaks: If the program fails to properly release memory, it can eventually exhaust available resources.
- Invalid Memory Access: Bugs in custom C/C++ extensions, kernels, or libraries (for example, CUDA kernels for GPU acceleration) can cause misaligned memory access or invalid memory operations.
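One way to look for a leak like the one described above is to compare traced memory before and after a repeated step. The sketch below uses Python's standard tracemalloc module; the leaky_step and clean_step functions are hypothetical examples, and tracemalloc only sees allocations made through Python's allocator, so memory held by native extensions or on a GPU may not show up.

```python
import tracemalloc


def measure_growth(step, repeats=5):
    """Run `step` several times and return the growth in traced memory, in bytes."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(repeats):
        step()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


leaky_cache = []  # hypothetical leak: batches are appended and never released


def leaky_step():
    leaky_cache.append([0.0] * 10_000)


def clean_step():
    batch = [0.0] * 10_000  # released when the function returns
    return len(batch)


leak_growth = measure_growth(leaky_step)
clean_growth = measure_growth(clean_step)
print(f"leaky: {leak_growth} bytes, clean: {clean_growth} bytes")
```

Growth that keeps rising with the number of repeats suggests that something is holding references it should have released.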
Numerical Instability
You may experience the following numerical instability errors:
- Division by Zero: This can occur due to input data with zeroes or during calculations, especially in custom loss functions or optimization algorithms.
- NaN or Infinity Values: These can arise due to data with NaN values, improper initialization, exploding gradients, or unstable mathematical operations.
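A minimal sketch of guarding against both problems in plain Python follows. The helper names safe_divide and drop_non_finite are hypothetical, and returning a default value is only one possible policy; depending on the model, you may prefer to skip the data point or stop training instead.

```python
import math


def safe_divide(numerator, denominator, default=0.0):
    """Return numerator / denominator, or `default` when the denominator is zero
    or the result is not finite."""
    if denominator == 0:
        return default
    result = numerator / denominator
    return result if math.isfinite(result) else default


def drop_non_finite(values):
    """Remove NaN and infinity values before they reach a model."""
    return [v for v in values if math.isfinite(v)]


returns = [0.01, float("nan"), -0.02, float("inf"), 0.03]
clean_returns = drop_non_finite(returns)

mean_return = safe_divide(sum(clean_returns), len(clean_returns))
empty_mean = safe_divide(0.0, 0)  # falls back to the default instead of raising
```

Filtering inputs before training and guarding each division in custom loss functions addresses the two causes listed above at their source.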
Software Bugs
You may experience the following software bugs:
- Code Errors: Bugs in the implementation of the model, data preprocessing, or training loop can cause crashes.
- Library Incompatibility: Using incompatible versions of libraries (for example, TensorFlow or PyTorch) or dependencies can lead to fatal errors.
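When you run LEAN locally, a quick way to spot a version mismatch is to compare the installed versions against the Supported Libraries table. The sketch below uses the standard importlib.metadata module; the versions are copied from that table, while the distribution names used as keys and the helper names are assumptions of this example.

```python
from importlib import metadata

# Versions copied from the Supported Libraries table. The keys are assumed
# to be the package distribution names.
EXPECTED_VERSIONS = {
    "tensorflow": "2.16.1",
    "scikit-learn": "1.4.2",
    "torch": "2.2.1",
    "keras": "3.3.3",
    "xgboost": "2.0.3",
}


def installed_version(distribution):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected):
    """Map each mismatched distribution to a (expected, installed) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in find_mismatches(EXPECTED_VERSIONS).items():
    print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

An empty result means every listed package matches the table; any printed line points at a dependency to align before you look for deeper causes of a fatal error.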