Historical Data
Custom Data
Introduction
You can import external datasets into your algorithm to use alongside other datasets from the Dataset Market.
This page explains how to get historical data for custom datasets.
Before you can get historical data for the dataset, define the get_source (GetSource in C#) and reader (Reader in C#) methods of the custom data class.
For examples of custom dataset implementations, see Key Concepts.
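To give a rough idea of what a reader-style method does, the following sketch parses one line of a source file into a data point. It is not LEAN's API: the DataPoint class, the parse_line helper, and the "YYYY-MM-DD,value" line layout are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DataPoint:
    """Hypothetical stand-in for a custom data object (not LEAN's base class)."""
    time: datetime       # start of the sampling period
    end_time: datetime   # end of the sampling period
    value: float


def parse_line(line: str, period: timedelta = timedelta(days=1)) -> Optional[DataPoint]:
    """Turn one 'YYYY-MM-DD,value' line into a DataPoint.

    Returns None for blank lines and for lines that don't start with a digit
    (for example, a header row).
    """
    line = line.strip()
    if not line or not line[0].isdigit():
        return None
    date_text, value_text = line.split(",")
    start = datetime.strptime(date_text, "%Y-%m-%d")
    return DataPoint(time=start, end_time=start + period, value=float(value_text))


point = parse_line("2014-07-09,101.5")
header = parse_line("date,value")
```

In this sketch, the header row is skipped and the data row yields a point whose end_time is one period after its time.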
Slices
To request Slice objects of historical data, call the History method. If you pass a list of Symbol objects, it returns data for all the custom datasets that the Symbol objects reference.
// Get the latest 3 data points of some custom datasets, packaged into Slice objects.
var history = History(datasetSymbols, 3);
If you don't pass any Symbol objects, it returns data for all the data subscriptions in your notebook, so the result may include more than just custom data. In Python, call the history method without providing any Symbol objects to request Slice objects.
// Get the latest 3 data points of all the securities/datasets in the notebook, packaged into Slice objects.
var history = History(3);

# Get the latest 3 data points of all the securities/datasets in the notebook, packaged into Slice objects.
history = self.history(3)
When your history request returns Slice objects, the Time (time in Python) properties of these objects are based on the notebook time zone, but the EndTime (end_time in Python) properties of the individual data objects are based on the data time zone. The EndTime (end_time) is the end of the sampling period and when the data is actually available.
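Because the two timestamps are expressed in different time zones, comparing them directly can be misleading. A minimal sketch of converting an end_time to the notebook time zone follows. The fixed UTC offsets, the constant names, and the to_notebook_time helper are assumptions for illustration, and the sketch assumes the timestamp already carries its zone information.

```python
from datetime import datetime, timedelta, timezone

# Assumptions for illustration only: fixed UTC offsets stand in for real time zones.
DATA_TZ = timezone.utc                        # hypothetical data time zone
NOTEBOOK_TZ = timezone(timedelta(hours=-4))   # hypothetical notebook time zone (UTC-4)


def to_notebook_time(end_time: datetime) -> datetime:
    """Express a zone-aware end_time in the notebook time zone."""
    return end_time.astimezone(NOTEBOOK_TZ)


# A daily data point that ends at midnight in the data time zone.
end_time = datetime(2014, 7, 10, 0, 0, tzinfo=DATA_TZ)
notebook_time = to_notebook_time(end_time)
```

Both values refer to the same instant. Only the wall-clock representation changes, which here moves the timestamp to the evening of the previous calendar day.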
Data Points
To get a list of historical data points for a custom dataset in C#, call the History<customDatasetClass> method with the dataset Symbol. In Python, call the history method with the dataset Symbol. The Python method returns a DataFrame that contains the data point attributes of the dataset class. For an example definition of a custom data class, see the CSV Format Example.
public class CustomSecurityHistoryAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        SetStartDate(2014, 7, 10);
        // Add a custom dataset and save a reference to its Symbol.
        var datasetSymbol = AddData<MyCustomDataType>("MyCustomDataType", Resolution.Daily).Symbol;
        // Get the trailing 5 days of MyCustomDataType data.
        var history = History<MyCustomDataType>(datasetSymbol, 5, Resolution.Daily);
    }
}

# Standard LEAN imports (QCAlgorithm, Resolution, and so on).
from AlgorithmImports import *

class CustomSecurityHistoryAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2014, 7, 10)
        # Add a custom dataset and save a reference to its Symbol.
        dataset_symbol = self.add_data(MyCustomDataType, "MyCustomDataType", Resolution.DAILY).symbol
        # Get the trailing 5 days of MyCustomDataType data in DataFrame format.
        history = self.history(dataset_symbol, 5, Resolution.DAILY)
If you request a DataFrame, LEAN unpacks the data from Slice objects to populate the DataFrame. If you intend to use the data in the DataFrame to create customDatasetClass objects, request the data type you need directly. Otherwise, LEAN consumes computational resources populating a DataFrame that you then convert back into objects.
To get a list of dataset objects instead of a DataFrame, call the history[customDatasetClass] method.
# Get the trailing 5 days of MyCustomDataType data for an asset in MyCustomDataType format.
history = self.history[MyCustomDataType](dataset_symbol, 5, Resolution.DAILY)
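When the result is a list of typed objects, you can read attributes directly instead of going through DataFrame rows. The sketch below illustrates that access pattern with a hypothetical FakeDataPoint class and a hand-built list standing in for the history result. Neither is part of LEAN, and the attribute names are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FakeDataPoint:
    """Hypothetical stand-in for one custom data object in the returned list."""
    end_time: datetime
    value: float


# Hypothetical stand-in for the list that a typed history request returns.
history = [
    FakeDataPoint(datetime(2014, 7, 8), 100.0),
    FakeDataPoint(datetime(2014, 7, 9), 102.0),
    FakeDataPoint(datetime(2014, 7, 10), 101.0),
]

# Work with the objects directly; no DataFrame is built or unpacked.
latest = max(history, key=lambda point: point.end_time)
average_value = sum(point.value for point in history) / len(history)
```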
If the dataset provides multiple entries per time step, in the get_source (GetSource in C#) method of your custom data class, return a SubscriptionDataSource that uses FileFormat.UNFOLDING_COLLECTION (FileFormat.UnfoldingCollection in C#). To get the historical data of this custom data type in a DataFrame, set the flatten argument to True.
history = self.history(dataset_symbol, 1, Resolution.DAILY, flatten=True)
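Conceptually, flattening turns "one collection per time step" into "one row per entry". The sketch below shows only that reshaping idea with plain Python structures. The sample data, field names, and flatten helper are hypothetical, and the sketch does not reproduce the exact index or columns of the DataFrame that LEAN returns.

```python
from datetime import date

# Hypothetical stand-in: each time step maps to several entries.
collection_by_day = {
    date(2017, 7, 7): [
        {"symbol": "AAA", "weight": 0.6},
        {"symbol": "BBB", "weight": 0.4},
    ],
    date(2017, 7, 8): [
        {"symbol": "AAA", "weight": 1.0},
    ],
}


def flatten(collections: dict) -> list:
    """Expand each time step's collection into one row per entry, ordered by time."""
    rows = []
    for time, entries in sorted(collections.items()):
        for entry in entries:
            rows.append({"time": time, **entry})
    return rows


rows = flatten(collection_by_day)
```

Here two time steps holding three entries in total become three rows, each tagged with its time.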
Universes
To get historical data for a custom data universe, call the History method (history in Python) with the Universe object. For an example definition of a custom data universe class, see the CSV Format Example.
public class CustomDataUniverseHistoryAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        SetStartDate(2017, 7, 9);
        // Add a universe from a custom data source and save a reference to it.
        var universe = AddUniverse<StockDataSource>(
            "myStockDataSource", Resolution.Daily, data => data.Select(x => x.Symbol)
        );
        // Get the historical universe data over the last 5 days.
        var history = History(universe, TimeSpan.FromDays(5)).Cast<StockDataSource>().ToList();
        // Iterate through each day in the universe history and count the number of constituents.
        foreach (var stockData in history)
        {
            var t = stockData.Time;
            var size = stockData.Symbols.Count;
        }
    }
}
# Standard LEAN imports (QCAlgorithm, Resolution, timedelta, and so on).
from AlgorithmImports import *

class CustomDataUniverseHistoryAlgorithm(QCAlgorithm):

    def initialize(self) -> None:
        self.set_start_date(2017, 7, 9)
        # Add a universe from a custom data source and save a reference to it.
        universe = self.add_universe(
            StockDataSource, "my-stock-data-source", Resolution.DAILY, lambda data: [x.symbol for x in data]
        )
        # Get the historical universe data over the last 5 days in DataFrame format.
        history = self.history(universe, timedelta(5))
        # Count the number of assets in the universe each day.
        universe_size_by_day = history.apply(lambda row: len(row['symbols']), axis=1)
time
2017-07-05    5
2017-07-06    5
2017-07-07    5
2017-07-08    5
2017-07-09    5
Name: symbols, dtype: int64
Missing Data Points
History requests for a trailing number of data samples return data based on the market hours of assets. By default, custom securities are always open. Therefore, if your dataset doesn't have a data point for every period (for example, no data on weekends), history requests for a trailing number of data samples may return fewer samples than you expect. To set the market hours of the dataset, see Market Hours.
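The effect can be illustrated with a simplified model. It assumes that, with an always-open calendar, a request for N trailing daily samples looks back N calendar days, and that the hypothetical dataset only has data on weekdays. The helper names are made up for this sketch and are not LEAN functions.

```python
from datetime import date, timedelta


def trailing_window(end: date, periods: int) -> list:
    """Return the `periods` calendar days before `end`, oldest first.

    Models an always-open calendar, where every day counts as a period.
    """
    return [end - timedelta(days=i) for i in range(periods, 0, -1)]


def is_weekday(day: date) -> bool:
    """Hypothetical dataset rule: data exists only Monday through Friday."""
    return day.weekday() < 5


# Request 5 trailing daily samples as of Thursday, 2014-07-10.
window = trailing_window(date(2014, 7, 10), 5)
samples = [day for day in window if is_weekday(day)]
```

In this model, the five-day window runs from Saturday, July 5 to Wednesday, July 9, so only the three weekdays contribute samples.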