Datasets

Custom Data

Introduction

This page explains how to request, manipulate, and visualize historical user-defined custom data.

Define Custom Data

You must format the data file into chronological order before you define the custom data class.
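
As a minimal sketch of that preparation step, assuming the file has a header row and an ISO-formatted (YYYY-MM-DD) date in its first column, the following standard-library snippet sorts the rows oldest-first. The function name sort_rows_chronologically and the sample rows are hypothetical and not part of the QuantConnect API.

```python
import csv
from datetime import datetime
from io import StringIO


def sort_rows_chronologically(csv_text: str, date_format: str = "%Y-%m-%d") -> str:
    """Return csv_text with its data rows sorted oldest-first by the first column."""
    rows = list(csv.reader(StringIO(csv_text.strip())))
    header, body = rows[0], rows[1:]
    body.sort(key=lambda row: datetime.strptime(row[0], date_format))
    out = StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(body)
    return out.getvalue()


# Hypothetical sample whose rows are not in chronological order.
unsorted_text = (
    "Date,Open,High,Low,Close\n"
    "2011-09-15,7800.0,7850.0,7780.0,7840.0\n"
    "2011-09-13,7792.9,7799.9,7722.65,7748.7\n"
    "2011-09-14,7750.0,7810.0,7740.0,7795.0\n"
)
sorted_text = sort_rows_chronologically(unsorted_text)
sorted_dates = [line.split(",")[0] for line in sorted_text.splitlines()[1:]]
```

Sorting on the parsed datetime rather than on the raw string keeps the approach valid if you pass a different date_format.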

To define a custom data class, extend the PythonData class and override the get_source and reader methods.

class Nifty(PythonData):
    '''NIFTY Custom Data Class'''
    def get_source(self, config: SubscriptionDataConfig, date: datetime, is_live_mode: bool) -> SubscriptionDataSource:
        url = "http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv"
        return SubscriptionDataSource(url, SubscriptionTransportMedium.REMOTE_FILE)

    def reader(self, config: SubscriptionDataConfig, line: str, date: datetime, is_live_mode: bool) -> BaseData:
        if not (line.strip() and line[0].isdigit()): return None

        # New Nifty object
        index = Nifty()
        index.symbol = config.symbol

        try:
            # Example File Format:
            # Date,       Open       High        Low       Close     Volume      Turnover
            # 2011-09-13  7792.9    7799.9     7722.65    7748.7    116534670    6107.78
            data = line.split(',')
            index.time = datetime.strptime(data[0], "%Y-%m-%d")
            index.end_time = index.time + timedelta(days=1)
            index.value = float(data[4])
            index["Open"] = float(data[1])
            index["High"] = float(data[2])
            index["Low"] = float(data[3])
            index["Close"] = float(data[4])

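        except (ValueError, IndexError):
            # Skip malformed rows instead of returning a partially filled object
            return None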

        return index
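
To see what the parsing logic does without running it inside the platform, here is a standalone sketch of the same steps that returns a plain dict. The helper name parse_nifty_line is hypothetical, and the sample row assumes the file is comma-separated, as the split(',') call in the class implies.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_nifty_line(line: str) -> Optional[dict]:
    """Parse one CSV line into a dict, or return None for lines to skip."""
    # Skip blank lines and the header, which does not start with a digit.
    if not (line.strip() and line[0].isdigit()):
        return None
    fields = line.split(",")
    try:
        start = datetime.strptime(fields[0], "%Y-%m-%d")
        close = float(fields[4])
        return {
            "time": start,
            "end_time": start + timedelta(days=1),  # a daily sample ends one day later
            "value": close,
            "open": float(fields[1]),
            "high": float(fields[2]),
            "low": float(fields[3]),
            "close": close,
        }
    except (ValueError, IndexError):
        return None  # malformed row


header = "Date,Open,High,Low,Close,Volume,Turnover"
row = "2011-09-13,7792.9,7799.9,7722.65,7748.7,116534670,6107.78"
parsed = parse_nifty_line(row)
```

The header and blank lines yield None, which mirrors how the reader method signals that a line carries no data.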

Create Subscriptions

You need to define a custom data class before you can subscribe to it.

Follow these steps to subscribe to a custom dataset:

  1. Create a QuantBook.
    qb = QuantBook()
  2. Call the add_data method with your custom data class and a ticker, and then save a reference to the data Symbol.
    symbol = qb.add_data(Nifty, "NIFTY").symbol

    Custom data has its own resolution, so you don't need to specify it.

Get Historical Data

You need a subscription before you can request historical data for a security. You can request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined period of time.

Before you request data, call the set_start_date method with a date to reduce the risk of look-ahead bias.

qb.set_start_date(2014, 7, 29)

If you call the set_start_date method, the date that you pass to the method is the latest date for which your history requests will return data.
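
The effect of that cap can be illustrated with plain Python. This is only a sketch of the idea, not how QuantBook implements it: the sample values are made up, and treating the boundary date as inclusive is an assumption.

```python
from datetime import datetime

# Hypothetical daily samples: (end_time, value).
samples = [
    (datetime(2014, 7, 25), 7790.45),
    (datetime(2014, 7, 28), 7748.70),
    (datetime(2014, 7, 29), 7791.40),
    (datetime(2014, 7, 30), 7721.30),
]

start_date = datetime(2014, 7, 29)

# Keep only samples that were already known at the start date.
visible = [(end_time, value) for end_time, value in samples if end_time <= start_date]
```

The sample that ends on 2014-07-30 is excluded, so analysis that runs "as of" the start date cannot see it.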

Trailing Number of Bars

Call the history method with a symbol, an integer, and optionally a resolution to request historical data based on the given number of trailing bars. Because custom data has its own resolution, the example omits the resolution argument.

history = qb.history(symbol, 10)

This method returns the most recent bars, excluding periods of time when the exchange was closed.

Trailing Period of Time

Call the history method with a symbol, a timedelta, and optionally a resolution to request historical data based on the given trailing period of time. Because custom data has its own resolution, the example omits the resolution argument.

history = qb.history(symbol, timedelta(days=10))

This method returns the most recent bars, excluding periods of time when the exchange was closed.
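
A request for 10 trailing bars and a request for 10 trailing days usually return different amounts of data. The sketch below shows why, using hypothetical weekday-only timestamps and ignoring holidays; whether the cutoff itself is included is an assumption.

```python
from datetime import datetime, timedelta

# Hypothetical daily timestamps on weekdays only, ending 2014-07-29.
end = datetime(2014, 7, 29)
timestamps = []
day = end - timedelta(days=30)
while day <= end:
    if day.weekday() < 5:  # Monday=0 ... Friday=4
        timestamps.append(day)
    day += timedelta(days=1)

# Trailing number of bars: always the last 10 samples.
last_10_bars = timestamps[-10:]

# Trailing period of time: only the samples inside the last 10 calendar days.
cutoff = end - timedelta(days=10)
last_10_days = [t for t in timestamps if t > cutoff]
```

Here the bar-count request reaches back to 2014-07-16, while the 10-day window contains only 7 samples because it spans a weekend.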

Defined Period of Time

Call the history method with a symbol, a start datetime, an end datetime, and optionally a resolution to request historical data based on the defined period of time. The start and end times you provide are interpreted in the notebook time zone.

start_time = datetime(2013, 7, 29)
end_time = datetime(2014, 7, 29)
history = qb.history(symbol, start_time, end_time)

This method returns the bars that are timestamped within the defined period of time.

In all of the cases above, the history method returns a DataFrame with a MultiIndex.

Dataframe of custom NIFTY data from history call

Download Method

To download the data directly from the remote file location instead of using your custom data class, call the download method with the data URL.

content = qb.download("http://cdn.quantconnect.com.s3.us-east-1.amazonaws.com/uploads/CNXNIFTY.csv")

Follow these steps to convert the content to a DataFrame:

  1. Import StringIO from the io library.
    from io import StringIO
  2. Create a StringIO.
    data = StringIO(content)
  3. Call the read_csv method.
    dataframe = pd.read_csv(data, index_col=0)

Dataframe of custom NIFTY data from io read csv
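
The same conversion can be tried offline. In this sketch the content string is a hypothetical stand-in for what the download method returns, and parse_dates=True is an optional addition (not used in the steps above) that turns the index into datetimes.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the string that qb.download returns.
content = (
    "Date,Open,High,Low,Close\n"
    "2011-09-13,7792.9,7799.9,7722.65,7748.7\n"
    "2011-09-14,7750.0,7810.0,7740.0,7795.0\n"
)

# index_col=0 uses the Date column as the index; parse_dates=True converts it to datetimes.
dataframe = pd.read_csv(StringIO(content), index_col=0, parse_dates=True)
close_on_first_day = dataframe.loc[pd.Timestamp("2011-09-13"), "Close"]
```

Unlike the history method, this DataFrame has a single-level date index and keeps the column names exactly as they appear in the file.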

Wrangle Data

You need some historical data to perform wrangling operations. To display pandas objects, run a cell in a notebook with the pandas object as the last line. To display other data formats, call the print function.

The DataFrame that the history method returns has the following index levels:

  1. Dataset Symbol
  2. The end_time of the data sample

The columns of the DataFrame are the data properties.

Dataframe of custom NIFTY data

To select the data of a single dataset, index the loc property of the DataFrame with the data Symbol.

history.loc[symbol]

Dataframe of custom NIFTY data with symbol selected

To select a column of the DataFrame, index it with the column name.

history.loc[symbol]['close']

Close price series of custom NIFTY data
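
The two selections above can be reproduced on a small hand-built frame. This is a sketch under stated assumptions: the values are made up, and the string "NIFTY" stands in for the Symbol object that the real index holds.

```python
import pandas as pd

# Hypothetical two-level index: (symbol, end time), mirroring the layout described above.
index = pd.MultiIndex.from_tuples(
    [
        ("NIFTY", pd.Timestamp("2014-07-25")),
        ("NIFTY", pd.Timestamp("2014-07-28")),
        ("NIFTY", pd.Timestamp("2014-07-29")),
    ],
    names=["symbol", "time"],
)
history = pd.DataFrame(
    {"close": [7790.45, 7748.70, 7791.40], "open": [7800.0, 7760.0, 7755.0]},
    index=index,
)

single = history.loc["NIFTY"]  # drops the symbol level, leaving a time index
close = single["close"]        # a Series of close values indexed by time
```

Selecting with loc on the first level returns a DataFrame indexed only by time, which is the shape most plotting functions expect.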

Plot Data

You need some historical custom data to produce plots. You can use many of the supported plotting libraries to visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

Follow these steps to plot candlestick charts:

  1. Get some historical data.
    history = qb.history(Nifty, datetime(2013, 7, 1), datetime(2014, 7, 31)).loc[symbol]
  2. Import the plotly library.
    import plotly.graph_objects as go
  3. Create a Candlestick.
    candlestick = go.Candlestick(x=history.index,
                                 open=history['open'],
                                 high=history['high'],
                                 low=history['low'],
                                 close=history['close'])
  4. Create a Layout.
    layout = go.Layout(title=go.layout.Title(text=f'{symbol} OHLC'),
                       xaxis_title='Date',
                       yaxis_title='Price',
                       xaxis_rangeslider_visible=False)
  5. Create a Figure.
    fig = go.Figure(data=[candlestick], layout=layout)
  6. Show the Figure.
    fig.show()

    Candlestick charts display the open, high, low, and close prices of the security.

Candlestick plot of custom NIFTY OHLC

Line Chart

Follow these steps to plot line charts using built-in methods:

  1. Select data to plot. The history DataFrame must still have its MultiIndex, so request it again if you already reduced it with loc[symbol].
    values = history['value'].unstack(level=0)
  2. Call the plot method on the pandas object.
    values.plot(title="Value", figsize=(15, 10))
  3. Show the plot.
    plt.show()

    Line charts display the value of the property you selected in a time series.

Line chart of custom NIFTY value
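
The unstack step in the line chart is easier to follow on a tiny example. In this sketch the frame is hand-built with made-up values, and "OTHER" is a hypothetical second dataset added only to show the resulting columns.

```python
import pandas as pd

# Hypothetical (symbol, time) frame; "OTHER" is a made-up second dataset.
index = pd.MultiIndex.from_product(
    [["NIFTY", "OTHER"], pd.to_datetime(["2014-07-28", "2014-07-29"])],
    names=["symbol", "time"],
)
history = pd.DataFrame({"value": [7748.70, 7791.40, 100.0, 101.5]}, index=index)

# Move the symbol level into the columns: one column per dataset, one row per timestamp.
values = history["value"].unstack(level=0)
```

Calling plot on a frame shaped like values draws one line per column, so each dataset gets its own series on a shared time axis.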
