Alternative Data

Introduction

This page explains how to request, manipulate, and visualize historical alternative data. This tutorial uses the VIX Daily Price dataset from the CBOE as the example dataset.

Create Subscriptions

Follow these steps to subscribe to an alternative dataset from the Dataset Market:

  1. Create a QuantBook.
    qb = QuantBook()
  2. Call the add_data method with the dataset class, a ticker, and a resolution, and then save a reference to the alternative data Symbol.
    vix = qb.add_data(CBOE, "VIX", Resolution.DAILY).symbol
    v3m = qb.add_data(CBOE, "VIX3M", Resolution.DAILY).symbol

    To view the arguments that the add_data method accepts for each dataset, see the dataset listing.

    If you don't pass a resolution argument, the add_data method uses the default resolution of the dataset. To view the supported resolutions and the default resolution of each dataset, see the dataset listing.
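
As a rough sketch of the steps above, the hypothetical helper below subscribes to several tickers of one dataset class and keeps the returned Symbol objects in a dictionary. The name subscribe_all and its parameters are illustrative and not part of the QuantConnect API. The helper relies only on the add_data call and the symbol property shown above, and it assumes that omitting the resolution falls back to the dataset default.

```python
def subscribe_all(qb, dataset_class, tickers, resolution=None):
    """Subscribe to each ticker and return a {ticker: Symbol} dictionary.

    Hypothetical helper: `qb` is expected to be a QuantBook (or any object
    with a compatible add_data method).
    """
    symbols = {}
    for ticker in tickers:
        if resolution is None:
            # Let the dataset fall back to its default resolution.
            subscription = qb.add_data(dataset_class, ticker)
        else:
            subscription = qb.add_data(dataset_class, ticker, resolution)
        symbols[ticker] = subscription.symbol
    return symbols

# Example usage inside a research notebook (not run here):
# symbols = subscribe_all(QuantBook(), CBOE, ["VIX", "VIX3M"], Resolution.DAILY)
# vix, v3m = symbols["VIX"], symbols["VIX3M"]
```

Keeping the Symbol objects in one dictionary makes it easy to pass a subset of them to the history method later.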

Get Historical Data

You need a subscription before you can request historical data for a dataset. On the time dimension, you can request an amount of historical data based on a trailing number of bars, a trailing period of time, or a defined period of time. On the dataset dimension, you can request historical data for a single dataset subscription, a subset of the dataset subscriptions you created in your notebook, or all of the dataset subscriptions in your notebook.

Trailing Number of Bars

To get historical data for a number of trailing bars, call the history method with the Symbol object(s) and an integer.

# DataFrame
single_history_df = qb.History(vix, 10)
subset_history_df = qb.History([vix, v3m], 10)
all_history_df = qb.History(qb.Securities.Keys, 10)

# Slice objects
all_history_slice = qb.History(10)

# CBOE objects
single_history_data_objects = qb.History[CBOE](vix, 10)
subset_history_data_objects = qb.History[CBOE]([vix, v3m], 10)
all_history_data_objects = qb.History[CBOE](qb.Securities.Keys, 10)

The preceding calls return the most recent bars, excluding periods of time when the exchange was closed.

Trailing Period of Time

To get historical data for a trailing period of time, call the history method with the Symbol object(s) and a timedelta.

# DataFrame
single_history_df = qb.History(vix, timedelta(days=3))
subset_history_df = qb.History([vix, v3m], timedelta(days=3))
all_history_df = qb.History(qb.Securities.Keys, timedelta(days=3))

# Slice objects
all_history_slice = qb.History(timedelta(days=3))

# CBOE objects
single_history_data_objects = qb.History[CBOE](vix, timedelta(days=3))
subset_history_data_objects = qb.History[CBOE]([vix, v3m], timedelta(days=3))
all_history_data_objects = qb.History[CBOE](qb.Securities.Keys, timedelta(days=3))

The preceding calls return the most recent bars or ticks, excluding periods of time when the exchange was closed.
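
The difference between a trailing bar count and a trailing period of time is easiest to see around a weekend. The following standard-library sketch is a simplified model, not the logic the history method actually uses: it assumes daily bars and a Monday-to-Friday calendar with no holidays, and the function names are hypothetical.

```python
from datetime import date, timedelta

def is_trading_day(day):
    # Simplifying assumption: Monday-Friday, no holidays.
    return day.weekday() < 5

def trailing_bars(end, count):
    """Return the `count` most recent trading days strictly before `end`."""
    days = []
    day = end - timedelta(days=1)
    while len(days) < count:
        if is_trading_day(day):
            days.append(day)
        day -= timedelta(days=1)
    return sorted(days)

def trailing_period(end, span):
    """Return the trading days in the window [end - span, end)."""
    start = end - span
    candidates = (start + timedelta(days=i) for i in range(span.days))
    return [day for day in candidates if is_trading_day(day)]

end = date(2021, 3, 1)  # a Monday
by_count = trailing_bars(end, 3)                     # Wed, Thu, Fri of the prior week
by_period = trailing_period(end, timedelta(days=3))  # only the Friday
print(len(by_count), len(by_period))
```

Under these assumptions, asking for three bars always yields three samples, while a three-day window that spans a weekend yields only one.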

Defined Period of Time

To get historical data for a specific period of time, call the history method with the Symbol object(s), a start datetime, and an end datetime. The start and end times you provide are interpreted in the notebook time zone.

start_time = datetime(2021, 1, 1)
end_time = datetime(2021, 3, 1)

# DataFrame
single_history_df = qb.History(vix, start_time, end_time)
subset_history_df = qb.History([vix, v3m], start_time, end_time)
all_history_df = qb.History(qb.Securities.Keys, start_time, end_time)

# Slice objects
all_history_slice = qb.History(start_time, end_time)

# CBOE objects
single_history_data_objects = qb.History[CBOE](vix, start_time, end_time)
subset_history_data_objects = qb.History[CBOE]([vix, v3m], start_time, end_time)
all_history_data_objects = qb.History[CBOE](qb.Securities.Keys, start_time, end_time)

The preceding calls return the bars or ticks that have a timestamp within the defined period of time.

If you don't pass a resolution to the history method, it uses the resolution of the subscription you created with the add_data method.

Wrangle Data

You need some historical data to perform wrangling operations. The process to manipulate the historical data depends on its data type. To display pandas objects, run a cell in a notebook with the pandas object as the last line. To display other data formats, call the print function.

DataFrame Objects

If the history method returns a DataFrame, the first level of the DataFrame index is the encoded dataset Symbol and the second level is the end_time of the data sample. The columns of the DataFrame are the data properties.

DataFrame of two tickers

To select the historical data of a single dataset, index the loc property of the DataFrame with the dataset Symbol.

all_history_df.loc[vix]  # or all_history_df.loc['VIX']

DataFrame of one dataset

To select a column of the DataFrame, index it with the column name.

all_history_df.loc[vix]['close']

Series of close values

If you request historical data for multiple tickers, you can transform the DataFrame so that it's a time series of close values for all of the tickers. To transform the DataFrame, select the column you want to display for each ticker and then call the unstack method.

all_history_df['close'].unstack(level=0)

The DataFrame is transformed so that the column indices are the Symbol of each ticker and each row contains the close value.
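
To experiment with these operations without a live subscription, the following sketch builds a small stand-in DataFrame with the same two-level index layout. It is only an approximation: plain strings replace the Symbol objects, the index level names are assumed, and the open and close values are placeholders rather than real VIX data.

```python
import pandas as pd

# Stand-in for the DataFrame that the history method returns: level 0 of
# the index is the ticker, level 1 is the end time of each sample.
# Plain strings replace Symbol objects, and the values are placeholders.
times = pd.to_datetime(["2021-01-04", "2021-01-05"])
index = pd.MultiIndex.from_product(
    [["VIX", "VIX3M"], times], names=["symbol", "time"]
)
history_df = pd.DataFrame(
    {"open": [19.0, 20.0, 22.0, 23.0], "close": [20.0, 21.0, 23.0, 24.0]},
    index=index,
)

# Select all rows of one dataset; the result is indexed by time only.
vix_df = history_df.loc["VIX"]

# Select one column of that dataset as a Series.
vix_close = history_df.loc["VIX"]["close"]

# Pivot the first index level into columns: one close column per ticker.
close_by_ticker = history_df["close"].unstack(level=0)
print(close_by_ticker)
```

After the unstack call, each row is one timestamp and each column is one ticker, which is the shape that the plot method expects for a multi-line chart.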

DataFrame of one dataset
df["VIX close"]

Slice Objects

If the history method returns Slice objects, iterate through the Slice objects to get each one. The Slice objects may not have data for all of your dataset subscriptions. To avoid issues, check if the Slice contains data for your ticker before you index it with the dataset Symbol.
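
The check-before-index pattern described above can be sketched as follows. The helper name collect_closes is hypothetical. The sketch assumes that each Slice exposes a contains_key method and that the data object for the ticker has end_time and close properties, as the CBOE bars in this tutorial are expected to; adjust the property names for datasets with different fields.

```python
def collect_closes(slices, symbol):
    """Return (end_time, close) pairs for the slices that contain `symbol`."""
    rows = []
    for data_slice in slices:
        # Skip slices that have no data for this subscription.
        if not data_slice.contains_key(symbol):
            continue
        data = data_slice[symbol]
        rows.append((data.end_time, data.close))
    return rows

# Example usage inside a research notebook (not run here):
# vix_closes = collect_closes(all_history_slice, vix)
```

Skipping a Slice that lacks the Symbol avoids a lookup error when one dataset has a sample at a timestamp and another doesn't.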

Plot Data

You need some historical alternative data to produce plots. You can use many of the supported plotting libraries to visualize data in various formats. For example, you can plot candlestick and line charts.

Candlestick Chart

You can only create candlestick charts for alternative datasets that have open, high, low, and close properties.

Follow these steps to plot candlestick charts:

  1. Get some historical data.
    history = qb.history(vix, datetime(2021, 1, 1), datetime(2021, 2, 1)).loc[vix]
  2. Import the plotly library.
    import plotly.graph_objects as go
  3. Create a Candlestick.
    candlestick = go.Candlestick(x=history.index,
                                 open=history['open'],
                                 high=history['high'],
                                 low=history['low'],
                                 close=history['close'])
  4. Create a Layout.
    layout = go.Layout(title=go.layout.Title(text='VIX from CBOE OHLC'),
                       xaxis_title='Date',
                       yaxis_title='Price',
                       xaxis_rangeslider_visible=False)
  5. Create a Figure.
    fig = go.Figure(data=[candlestick], layout=layout)
  6. Show the Figure.
    fig.show()

    Candlestick charts display the open, high, low, and close prices of the alternative data.

Candlestick plot of VIX from CBOE dataset OHLC
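
Because a candlestick chart needs all four price properties, a small guard can check a history DataFrame before plotting. The following is a minimal sketch; the helper name missing_ohlc_columns is hypothetical, and it only assumes that the column names are lowercase, as in the examples above.

```python
OHLC_COLUMNS = ("open", "high", "low", "close")

def missing_ohlc_columns(columns):
    """Return the OHLC column names that are absent from `columns`."""
    available = set(columns)
    return [name for name in OHLC_COLUMNS if name not in available]

# Example usage with a history DataFrame (not run here):
# missing = missing_ohlc_columns(history.columns)
# if missing:
#     print(f"Cannot draw a candlestick chart; missing columns: {missing}")
```

For datasets that fail this check, a line chart of a single property is usually the simpler choice.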

Line Chart

Follow these steps to plot line charts using built-in methods:

  1. Get some historical data.
    history = qb.history([vix, v3m], datetime(2021, 1, 1), datetime(2021, 2, 1))
  2. Select the data to plot.
    values = history['close'].unstack(0)
  3. Call the plot method on the pandas object.
    values.plot(title='Close', figsize=(15, 10))
  4. Show the plot.
    import matplotlib.pyplot as plt  # only needed if plt isn't already imported in the notebook
    plt.show()

    Line charts display the value of the property you selected in a time series.

Line chart of VIX closes from CBOE dataset

Examples

The following example demonstrates a common practice for applying alternative datasets.

Example 1: 10-Year Treasury Yield Line Chart

The following example uses a line chart to study the trend of the 10-year US Treasury yield.

# Create a QuantBook
qb = QuantBook()

# Request one year of US Treasury yield curve historical data.
symbol = qb.add_data(USTreasuryYieldCurveRate, "USTYCR").symbol
history = qb.history(
    USTreasuryYieldCurveRate,
    symbol, 
    start=qb.time - timedelta(days=365), 
    end=qb.time, 
    resolution=Resolution.DAILY,
    flatten=True
)
    
# Select the 10-year US Treasury yield to study.
ten_year_yield = history.droplevel([0]).tenyear
# Plot the 10-year US Treasury yield as a line chart.
ten_year_yield.plot(title=f"10-Year US Treasury Yield Over Time ({symbol})", ylabel="Yield Rate")
