Historical Data

Custom Data

Introduction

You can import external datasets into your algorithm to use alongside other datasets from the Dataset Market. This page explains how to get historical data for custom datasets. Before you can get historical data for a custom dataset, define the get_source (C#: GetSource) and reader (C#: Reader) methods of the custom data class. For examples of custom dataset implementations, see Key Concepts.
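As a rough picture of what these two methods are responsible for, here is a standalone sketch in plain Python. It does not use the LEAN API: the MyPoint class, the file name, and the "YYYY-MM-DD,value" line layout are all hypothetical, and a real custom data class has additional requirements that Key Concepts covers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class MyPoint:
    """Hypothetical stand-in for a custom data point."""
    time: datetime
    end_time: datetime
    value: float


def get_source() -> str:
    """Role of get_source: say where the raw data lives (hypothetical path)."""
    return "my_custom_data.csv"


def reader(line: str) -> Optional[MyPoint]:
    """Role of reader: turn one raw line ("YYYY-MM-DD,value") into a data point."""
    line = line.strip()
    if not line or not line[0].isdigit():
        return None  # Skip blank lines and header rows.
    day, value = line.split(",")
    start = datetime.strptime(day, "%Y-%m-%d")
    # Assume daily samples, so the sampling period ends one day after it starts.
    return MyPoint(time=start, end_time=start + timedelta(days=1), value=float(value))


print(reader("2014-07-09,101.5"))
print(reader("date,value"))
```

Returning None for lines that can't be parsed keeps header rows and blank lines out of the results.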

Slices

In C#, to request Slice objects of historical data, call the History method. If you pass a list of Symbol objects, it returns data for all the custom datasets that the Symbol objects reference.

// Get the latest 3 data points of some custom datasets, packaged into Slice objects.
var history = History(datasetSymbols, 3);

If you don't pass any Symbol objects to the C# History method, it returns data for all the data subscriptions in your notebook, so the result may include more than just custom data.

In Python, to request Slice objects of historical data, call the history method without providing any Symbol objects. It also returns data for all the data subscriptions in your notebook.

// Get the latest 3 data points of all the securities/datasets in the notebook, packaged into Slice objects.
var history = History(3);
# Get the latest 3 data points of all the securities/datasets in the notebook, packaged into Slice objects.
history = self.history(3)

When your history request returns Slice objects, the time (C#: Time) properties of these objects are based on the notebook time zone, but the end_time (C#: EndTime) properties of the individual data objects are based on the data time zone. The end_time is the end of the sampling period and when the data is actually available.
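To make the two clocks concrete, the following standalone sketch uses only the Python standard library, not the LEAN API. It assumes a hypothetical daily data point recorded in UTC (the data time zone) and a notebook that runs at a fixed UTC-5 offset. Both time zones and all variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical time zones, for illustration only.
DATA_TZ = timezone.utc                       # assumed data time zone
NOTEBOOK_TZ = timezone(timedelta(hours=-5))  # assumed notebook time zone (fixed offset)

# A daily sample that covers 2014-07-09 in the data time zone.
period = timedelta(days=1)
start = datetime(2014, 7, 9, tzinfo=DATA_TZ)
end_time = start + period                    # end of the sampling period

# The same instant, expressed on the notebook clock.
notebook_time = end_time.astimezone(NOTEBOOK_TZ)

print(end_time.isoformat())       # 2014-07-10T00:00:00+00:00
print(notebook_time.isoformat())  # 2014-07-09T19:00:00-05:00
```

The two timestamps describe the same moment, so comparing them without accounting for the time zone difference can make a data point look several hours early or late.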

Data Points

In C#, to get a list of historical data points for a custom dataset, call the History<customDatasetClass> method with the dataset Symbol. In Python, call the history method with the dataset Symbol. The Python method returns a DataFrame that contains the data point attributes of the dataset class. For an example definition of a custom data class, see the CSV Format Example.

public class CustomSecurityHistoryAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        SetStartDate(2014, 7, 10);
        // Add a custom dataset and save a reference to its Symbol.
        var datasetSymbol = AddData<MyCustomDataType>("MyCustomDataType", Resolution.Daily).Symbol;
        // Get the trailing 5 days of MyCustomDataType data.
        var history = History<MyCustomDataType>(datasetSymbol, 5, Resolution.Daily);
    }
}
class CustomSecurityHistoryAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2014, 7, 10)
        # Add a custom dataset and save a reference to its Symbol.
        dataset_symbol = self.add_data(MyCustomDataType, "MyCustomDataType", Resolution.DAILY).symbol
        # Get the trailing 5 days of MyCustomDataType data in DataFrame format.
        history = self.history(dataset_symbol, 5, Resolution.DAILY)
DataFrame of MyCustomDataType data.

If you request a DataFrame, LEAN unpacks the data from Slice objects to populate the DataFrame. If you only intend to convert the DataFrame rows back into customDatasetClass objects, request the dataset objects directly. Otherwise, LEAN spends computational resources populating a DataFrame you don't need. To get a list of dataset objects instead of a DataFrame, call the history[customDatasetClass] method.

# Get the trailing 5 days of MyCustomDataType data for an asset in MyCustomDataType format. 
history = self.history[MyCustomDataType](dataset_symbol, 5, Resolution.DAILY)

If the dataset provides multiple entries per time step, return a SubscriptionDataSource that uses FileFormat.UNFOLDING_COLLECTION (C#: FileFormat.UnfoldingCollection) from the get_source (C#: GetSource) method of your custom data class. To get the historical data of this custom data type in a DataFrame, set the flatten argument to True.

history = self.history(dataset_symbol, 1, Resolution.DAILY, flatten=True)
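As a rough mental model of what flattening means, the sketch below uses plain Python rather than LEAN or pandas. The sample records and the flatten_rows helper are hypothetical, and the column layout that LEAN actually produces may differ.

```python
from datetime import date

# Hypothetical collection data: each time step holds several entries.
collections = {
    date(2017, 7, 7): [{"symbol": "AAA", "value": 1.0}, {"symbol": "BBB", "value": 2.0}],
    date(2017, 7, 8): [{"symbol": "AAA", "value": 1.5}],
}


def flatten_rows(by_time):
    """Return one row per (time, entry) pair instead of one row per time step."""
    rows = []
    for time, entries in sorted(by_time.items()):
        for entry in entries:
            rows.append({"time": time, **entry})
    return rows


rows = flatten_rows(collections)
for row in rows:
    print(row["time"], row["symbol"], row["value"])
```

Two time steps with three entries in total become three rows, each carrying its own time value.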

Universes

To get historical data for a custom data universe, call the history method (C#: History) with the Universe object. For an example definition of a custom data universe class, see the CSV Format Example.

public class CustomDataUniverseHistoryAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        SetStartDate(2017, 7, 9);
        // Add a universe from a custom data source and save a reference to it.
        var universe = AddUniverse<StockDataSource>(
            "myStockDataSource", Resolution.Daily, data => data.Select(x => x.Symbol)
        );
        // Get the historical universe data over the last 5 days.
        var history = History(universe, TimeSpan.FromDays(5)).Cast<StockDataSource>().ToList();
        // Iterate through each day in the universe history and count the number of constituents.
        foreach (var stockData in history)
        {
            var t = stockData.Time;
            var size = stockData.Symbols.Count;
        }
    }
}
class CustomDataUniverseHistoryAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2017, 7, 9)
        # Add a universe from a custom data source and save a reference to it.
        universe = self.add_universe(
            StockDataSource, "my-stock-data-source", Resolution.DAILY, lambda data: [x.symbol for x in data]
        )
        # Get the historical universe data over the last 5 days in DataFrame format.
        history = self.history(universe, timedelta(5))
DataFrame of universe data for a custom dataset.
# Count the number of assets in the universe each day.
universe_size_by_day = history.apply(lambda row: len(row['symbols']), axis=1)
time
2017-07-05    5
2017-07-06    5
2017-07-07    5
2017-07-08    5
2017-07-09    5
Name: symbols, dtype: int64

Missing Data Points

History requests for a trailing number of data samples return data based on the market hours of assets. By default, the market for custom securities is always open, so a trailing request also counts periods such as weekends and holidays. If your dataset has no entries for some of those periods, the request may return fewer samples than you expect. To set the market hours of the dataset, see Market Hours.
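The following standalone sketch illustrates one way this can happen. It is a simplified model, not LEAN's implementation: it assumes a hypothetical dataset that only publishes on weekdays and a trailing request for 5 daily samples that, because the market is treated as always open, spans the previous 5 calendar days.

```python
from datetime import date, timedelta


def trailing_calendar_days(end, count):
    """The `count` calendar days before `end` (always-open market assumption)."""
    return [end - timedelta(days=i) for i in range(count, 0, -1)]


def available_points(days):
    """Hypothetical dataset: entries exist only Monday through Friday."""
    return [d for d in days if d.weekday() < 5]


end = date(2014, 7, 10)                  # a Thursday
window = trailing_calendar_days(end, 5)  # 2014-07-05 through 2014-07-09
points = available_points(window)

print(len(window), len(points))  # 5 3
```

The window contains a Saturday and a Sunday, so only 3 of the 5 requested periods have data in this model.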
