Custom Universes

CSV Format Example

Introduction

This page explains how to import custom universe selection data from a CSV file.

Data Format

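Create a file with data in CSV format and make sure its rows are in chronological order. In the following example, each row starts with a date in yyyyMMdd format, followed by the tickers to select on that date.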

20170704,SPY,QQQ,FB,AAPL,IWM
20170706,QQQ,AAPL,IWM,FB,GOOGL
20170707,IWM,AAPL,FB,BAC,GOOGL
...
20170729,SPY,QQQ,FB,AAPL,IWM
20170801,QQQ,FB,AAPL,IWM,GOOGL
20170802,QQQ,IWM,FB,BAC,GOOGL
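
As a quick sanity check before you use a file in this layout, the following minimal Python sketch parses each row into a date and a ticker list and verifies that the dates never go backwards. The names SAMPLE, parse_row, and is_chronological are hypothetical helpers for illustration only; they are not part of LEAN.

```python
from datetime import datetime
from typing import List, Tuple

# A few rows copied from the example above (hypothetical in-memory stand-in for the file).
SAMPLE = """20170704,SPY,QQQ,FB,AAPL,IWM
20170706,QQQ,AAPL,IWM,FB,GOOGL
20170707,IWM,AAPL,FB,BAC,GOOGL"""


def parse_row(line: str) -> Tuple[datetime, List[str]]:
    """Split one CSV row into its date (first field) and tickers (remaining fields)."""
    fields = line.strip().split(",")
    return datetime.strptime(fields[0], "%Y%m%d"), fields[1:]


def is_chronological(text: str) -> bool:
    """Return True if the row dates are in non-decreasing order."""
    dates = [parse_row(row)[0] for row in text.splitlines() if row.strip()]
    return all(earlier <= later for earlier, later in zip(dates, dates[1:]))


if __name__ == "__main__":
    print(is_chronological(SAMPLE))
```

The check skips blank lines but raises a ValueError on a row whose first field is not a yyyyMMdd date, which makes a malformed row visible early.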

Define Custom Types

To define a custom data type, inherit the BaseData (C#) or PythonData (Python) class and override the GetSource (C#) or get_source (Python) method and the Reader (C#) or reader (Python) method.

public class StockDataSource : BaseData
{
    public List<string> Symbols { get; set; } = new();
    public override DateTime EndTime => Time.AddDays(1);

    public override SubscriptionDataSource GetSource(
        SubscriptionDataConfig config,
        DateTime date,
        bool isLiveMode)
    {
        if (!isLiveMode)
        {
            return new SubscriptionDataSource("<CustomUniverseKey>", SubscriptionTransportMedium.ObjectStore, FileFormat.Csv);
        }
        return new SubscriptionDataSource("https://raw.githubusercontent.com/QuantConnect/Documentation/master/Resources/datasets/custom-data/csv-universe-example.csv", SubscriptionTransportMedium.RemoteFile, FileFormat.Csv);
    }

    public override BaseData Reader(
        SubscriptionDataConfig config,
        string line,
        DateTime date,
        bool isLiveMode)
    {
        var stocks = new StockDataSource();

        try
        {
            var csv = line.Split(',');
            stocks.Time = DateTime.ParseExact(csv[0], "yyyyMMdd", null);
            stocks.Symbols.AddRange(csv.Skip(1));
        }
        catch { return null; }

        return stocks;
    }
}

class StockDataSource(PythonData):

    def get_source(self,
         config: SubscriptionDataConfig,
         date: datetime,
         is_live: bool) -> SubscriptionDataSource:
        if not is_live:
            return SubscriptionDataSource("<custom_universe_key>", SubscriptionTransportMedium.OBJECT_STORE, FileFormat.CSV)
        return SubscriptionDataSource("https://raw.githubusercontent.com/QuantConnect/Documentation/master/Resources/datasets/custom-data/csv-universe-example.csv", SubscriptionTransportMedium.REMOTE_FILE, FileFormat.CSV)

    def reader(self,
         config: SubscriptionDataConfig,
         line: str,
         date: datetime,
         is_live: bool) -> BaseData:

        if not (line.strip() and line[0].isdigit()): return None
        
        stocks = StockDataSource()
        stocks.symbol = config.symbol

        try:
            csv = line.split(',')
            stocks.time = datetime.strptime(csv[0], "%Y%m%d")
            stocks.end_time = stocks.time + timedelta(days=1)
            stocks["Symbols"] = csv[1:]

        except ValueError:
            # Skip rows whose date field fails to parse
            return None

        return stocks

Initialize Universe

To perform universe selection with custom data, call the AddUniverse (C#) or add_universe (Python) method in the Initialize (C#) or initialize (Python) method.

public class MyAlgorithm : QCAlgorithm
{
    public override void Initialize()
    {
        AddUniverse<StockDataSource>("myStockDataSource", Resolution.Daily, FilterFunction);
    }
}

class MyAlgorithm(QCAlgorithm):
    def initialize(self) -> None:
        self.add_universe(StockDataSource, "my-stock-data-source", Resolution.DAILY, self._filter_function)
    

Receive Custom Data

As your data reader reads your custom data file, LEAN adds the data points to an IEnumerable<StockDataSource> (C#) or List[StockDataSource] (Python) object that it passes to your algorithm's filter function. Your filter function must return a list of Symbol objects or of string (C#) or str (Python) objects. LEAN automatically subscribes to these new assets and adds them to your algorithm.

public class MyAlgorithm : QCAlgorithm
{
    private IEnumerable<string> FilterFunction(IEnumerable<StockDataSource> stockDataSource)
    {
        return stockDataSource.SelectMany(x => x.Symbols);
    }
}

class MyAlgorithm(QCAlgorithm):
    def _filter_function(self, data: List[StockDataSource]) -> List[str]:
        symbols = []
        for item in data:
            for symbol in item["Symbols"]:
                symbols.append(symbol)
        return symbols
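
If the same ticker appears in more than one data point, the flattened list built by the filter function above contains it more than once. This page does not say whether LEAN needs duplicates removed, so treat the following as an optional sketch rather than a requirement. It runs outside LEAN: plain dictionaries stand in for StockDataSource objects, and flatten_unique is a hypothetical helper name.

```python
from typing import Dict, Iterable, List


def flatten_unique(data: Iterable[Dict[str, List[str]]]) -> List[str]:
    """Flatten the "Symbols" lists of all data points, keeping first-seen order."""
    seen = set()
    tickers: List[str] = []
    for item in data:
        for ticker in item["Symbols"]:
            if ticker not in seen:
                seen.add(ticker)
                tickers.append(ticker)
    return tickers


if __name__ == "__main__":
    # Two stand-in data points that share one ticker.
    points = [{"Symbols": ["SPY", "QQQ"]}, {"Symbols": ["QQQ", "IWM"]}]
    print(flatten_unique(points))
```

Because the stand-in items are indexed with item["Symbols"], the same loop body could be dropped into _filter_function, assuming the data objects support that lookup as they do in the example above.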
    

If you add custom properties to your data object in the Reader (C#) or reader (Python) method, LEAN adds them as members to the data object in your filter method. To ensure the property names follow the convention of member names, LEAN applies the following changes to the names you provide:

  1. - and . characters are replaced with whitespace.
  2. The first letter is capitalized.
  3. Whitespace characters are removed.

For example, if you set a property name in the Reader (C#) or reader (Python) method to ['some-property.name'], you can access it in your filter method through the Somepropertyname member of your data object.
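
The three renaming rules can be sketched as a small Python function. This is an illustration of the rules as stated on this page, applied literally in the listed order; it is not LEAN's actual implementation, and normalize_property_name is a hypothetical name.

```python
def normalize_property_name(name: str) -> str:
    """Apply the three renaming rules in the order they are listed."""
    # 1. Replace "-" and "." characters with whitespace.
    spaced = name.replace("-", " ").replace(".", " ")
    # 2. Capitalize the first letter, leaving the remaining characters unchanged.
    capitalized = spaced[:1].upper() + spaced[1:]
    # 3. Remove all whitespace characters.
    return "".join(capitalized.split())


if __name__ == "__main__":
    print(normalize_property_name("some-property.name"))
```

The sketch uses slicing instead of str.capitalize() because str.capitalize() also lowercases the rest of the string, which the rules do not call for.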
