Key Concepts

Debugging Tools

Introduction

When your algorithm throws errors or doesn't operate how you expect, use debugging tools to diagnose and resolve the issue. The tools to debug LEAN algorithms include a built-in debugger, logs, charts, and the Object Store.

Debugger

The debugger is a built-in tool to help you debug coding errors while backtesting. The debugger enables you to slow down the code execution, step through the program line by line, and inspect the variables to understand the internal state of the program. To learn how to start and operate the backtest debugger in your development environment, see the Debugging page in the Cloud Platform, Local Platform, or CLI documentation.

Logs

Algorithms can record string messages ('log statements') to a file for analysis after a backtest is complete, or while a live algorithm is running. These records can assist in debugging logical flow errors in the project code or record key decision moments. Backtests can process millions of data points, so a log statement in the wrong place can generate gigabytes of logs that are difficult to sift through. When you add log statements, ensure they are readable.

// Logging records messages to a file so you can debug errors and record key decisions.
Log($"Signal triggered. Placing limit order: {limit}");
# Logging records messages to a file so you can debug errors and record key decisions.
self.log(f"Signal triggered. Placing limit order: {limit}")

Debug statements are the same as log statements, but debug statements are displayed in orange in the Cloud Terminal to provide more emphasis. Debug statements are streamed to the Cloud Terminal while the backtest is running, which can be helpful for real-time updates or for monitoring a live-trading algorithm.

// Debugging streams messages to the Cloud Terminal for real-time analysis.
Debug($"Exposure to the {sectorName} sector is above 50%!");
# Debugging streams messages to the Cloud Terminal for real-time analysis.
self.debug("Exposure to the {sector_name} sector is above 50%!")

For more information about log statements, see Logging.

Charts

Plotting values over time can help you understand their range, identify outliers in the data, and see times when your algorithm generates unusual values. Before you view a plot, develop a theory about what you expect to see; comparing the result against your expectation gives you a deeper understanding of your algorithm. For example, a plot of an asset's price standard deviations should stay between -4 and +4, with any outliers occurring during extreme market conditions like the COVID-19 pandemic in March 2020. LEAN provides a powerful charting API, so you can create time series plots with a single line of code.

// Create time series plots to understand the asset range and identify outliers.
Plot("Deviations", "aapl", assetDeviation);
# Create time series plots to understand the asset range and identify outliers.
self.plot("Deviations", "aapl", asset_deviation)

For more information about creating charts, see Charting.

Object Store

The Object Store is a key-value data store for low-latency information storage and retrieval. During a backtest, you can build large objects and write them to the Object Store for later analysis. This workflow is helpful when the objects are too large to plot or when you want to perform analysis across many backtests.

// The Object Store provides a key-value data store optimized for fast access and retrieval of information.
ObjectStore.Save("key", value);
# The Object Store provides a key-value data store optimized for fast access and retrieval of information.
self.object_store.save("key", value)
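
As a sketch of that workflow, the following hypothetical algorithm accumulates closing prices during a backtest and saves them as a JSON string at the end; the class name and key name are illustrative:

import json

class ObjectStoreDemoAlgorithm(QCAlgorithm):
    def initialize(self):
        self.set_start_date(2022, 1, 1)
        self._symbol = self.add_equity("SPY").symbol
        self._closes = []

    def on_data(self, data):
        # Accumulate values during the backtest instead of plotting them.
        if data.bars.contains_key(self._symbol):
            self._closes.append(data.bars[self._symbol].close)

    def on_end_of_algorithm(self):
        # Serialize the collected values and save them under a key you can
        # read later in the Research Environment.
        self.object_store.save("demo_closes", json.dumps(self._closes))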

For more information about the Object Store, see Object Store. For a specific example of saving indicator values during a backtest into the Object Store and then plotting them in the Research Environment, see Example for Plotting.

Profiling Speed

The Python Profilers from the Python Standard Library enable you to find the functions and methods in your algorithm that consume the most runtime. To analyze your algorithm's efficiency with the Python Profilers, follow these steps:

  1. Open the project in the Cloud Platform, Local Platform, or a local IDE if you use the CLI.

  2. At the top of the main.py file, import the classes you need, then create and enable a Profile.

    from cProfile import Profile
    from pstats import Stats
    from io import StringIO

    # Create the profile and start collecting profiling data.
    profile = Profile()
    profile.enable()

  3. Define or extend the on_end_of_algorithm method to disable the Profile and save information on the most time-consuming functions to the Object Store.

    class FunctionTimeConsumptionAlgorithm(QCAlgorithm):
        def initialize(self):
            pass

        def on_end_of_algorithm(self):
            # Stop collecting profiling data.
            profile.disable()
            stream = StringIO()
            # Write the top 20 time-consuming functions to the stream.
            Stats(profile, stream=stream).sort_stats('cumulative').print_stats(20)
            # Save the profiling data to the Object Store using the algorithm ID.
            self.object_store.save(f"{self.algorithm_id}_profile", stream.getvalue())

  4. Open the Research Environment and create a QuantBook.

    qb = QuantBook()

  5. Get the backtest Id and read the profiling data from the Object Store. The process to get the backtest Id depends on whether you use the Cloud Platform, Local Platform, or CLI.

    backtest_id = "8b16cec0c44f75188d82f9eadb310e17"
    profile_output = qb.object_store.read(f"{backtest_id}_profile")
    print(profile_output)

JIT Compilation

To increase the speed of some methods, you can use decorators like @jit or @lru_cache. The @jit decorator from Numba is for Just-In-Time (JIT) compilation. It compiles Python code to machine code at runtime to increase the speed of loops and mathematical operations. However, Numba can't compile all Python code. It performs best with NumPy arrays. If you add the @jit decorator to your methods, it can make the debugging process more challenging since the code is compiled to machine code. The following code snippet shows an example of using the @jit decorator:

import numpy as np
from numba import jit
import time

# Without JIT
def slow_function(arr):
    result = 0
    for i in range(len(arr)):
        result += np.sin(arr[i]) * np.cos(arr[i])
    return result

# With JIT
@jit(nopython=True)
def fast_function(arr):
    result = 0
    for i in range(len(arr)):
        result += np.sin(arr[i]) * np.cos(arr[i])
    return result

# Example usage
arr = np.random.rand(1000000) 

# Test without JIT.
start = time.time()
slow_function(arr)
print(f"Slow function took: {time.time() - start} seconds")

# Test with JIT.
start = time.time()
fast_function(arr)  # This will compile first, then run.
print(f"Fast function took: {time.time() - start} seconds")

Slow function took: 1.1557221412658691 seconds

Fast function took: 0.32973551750183105 seconds
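
Note that the timing above for fast_function includes Numba's one-time compilation on the first call. To measure the steady-state speed, you could time a second call, which reuses the compiled machine code; for example, by appending these lines to the script above:

# Time a second call; the function is already compiled, so this reflects
# the pure execution speed.
start = time.time()
fast_function(arr)
print(f"Fast function (already compiled) took: {time.time() - start} seconds")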


The @lru_cache decorator from the functools module provides Least Recently Used (LRU) caching. It caches the result of a function based on its arguments, so the function body doesn't re-execute when you call the function again with the same inputs. The maxsize argument of the decorator defines the number of results to cache. For example, @lru_cache(maxsize=512) caches a maximum of 512 results. If you add the @lru_cache decorator to your methods, make sure the function depends only on its arguments (not on any global state) or the caching may return stale, incorrect results. The following code snippet shows an example of using the @lru_cache decorator:

from functools import lru_cache
import time

# Without caching
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

# With caching
@lru_cache(maxsize=512)
def fibonacci_cached(n):
    if n < 2:
        return n
    return fibonacci_cached(n-1) + fibonacci_cached(n-2)

n = 30

# Test without caching.
start = time.time()
for _ in range(n):
    fibonacci(n)
print(f"Fibonacci without cache took: {time.time() - start} seconds")

# Test with caching.
start = time.time()
for _ in range(n):
    fibonacci_cached(n)
print(f"Fibonacci with cache took: {time.time() - start} seconds")

Fibonacci without cache took: 4.387492418289185 seconds

Fibonacci with cache took: 9.322166442871094e-05 seconds
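
To illustrate why cached functions must depend only on their arguments, consider this short sketch (the function and variable names are hypothetical): once a result is cached, @lru_cache returns it even after the global state changes.

from functools import lru_cache

multiplier = 2

@lru_cache(maxsize=None)
def scale(x):
    # Incorrectly depends on a global variable.
    return x * multiplier

print(scale(10))  # 20, computed and cached.
multiplier = 3
print(scale(10))  # Still 20: the stale cached result is returned.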
