Problem with data extraction from Quandl

Hello,

I can't run this algorithm because the crude oil time series is shifted by almost one year.

Can you help me fix this problem, please?

Thank you


Wawes23


Wawes23


import numpy as np
import pandas as pd
import statsmodels.api as sm
import decimal

class PcaStatArbitrageAlgorithm(QCAlgorithm):

    def Initialize(self):
        self.SetStartDate(2006, 1, 1)       # Set Start Date
        self.SetCash(100000)                # Set Strategy Cash

        self.nextRebalance = self.Time      # Initialize next rebalance time
        self.rebalance_days = 7          # Rebalance every 7 days

        self.lookback = 252*3                  # Length(days) of historical data
        self.weights_long,self.weights_short = pd.DataFrame(),pd.DataFrame()       # Pandas data frame (index: symbol) that stores the weight
        self.Portfolio.MarginModel = PatternDayTradingMarginModel()
        self.AGG = self.AddEquity("AGG", Resolution.Daily).Symbol

        self.UniverseSettings.Resolution = Resolution.Daily   # Use daily resolution for speed
        self.AddUniverse(self.CoarseSelectionAndPCA)
        self.AddData(Oil, "oil", Resolution.Daily).Symbol
        self.oil = self.Securities["oil"].Symbol
    

    def CoarseSelectionAndPCA(self, coarse):

        # Before next rebalance time, just remain the current universe
        if self.Time < self.nextRebalance:
            return Universe.Unchanged

        ### Simple coarse selection first

        # Sort the equities by DollarVolume in descending order
        selected = sorted([x for x in coarse if x.HasFundamentalData and x.Price > 5],
                          key=lambda x: x.DollarVolume, reverse=True)

        symbols = [x.Symbol for x in selected[:250]]

        ### After coarse selection, we do PCA and linear regression to get our selected symbols

        # Get historical data of the selected symbols
        history = self.History(symbols, self.lookback, Resolution.Daily).close.unstack(level=0)
        #self.Debug(history.iloc[:,0])

        # Select the desired symbols and their weights for the portfolio from the coarse-selected symbols
        SP500_history = self.History(self.oil, self.lookback, Resolution.Daily).close.unstack(level=0)
        self.Debug(SP500_history.iloc[:2,0])
        
        
        self.weights_long,self.weights_short = self.GetWeights(history,SP500_history)

        # If there is no final selected symbols, return the unchanged universe
        if self.weights_long.empty or self.weights_short.empty :
            return Universe.Unchanged
        
        BTK =  (self.weights_long).append(self.weights_short)

        return [x for x in symbols if str(x) in BTK ]#or self.weights_short.index]


    def GetWeights(self, history,SP500_history):
 
        # Weekly percentage returns of the equities that have a complete price history
        sample = history.dropna(axis=1).resample('1W').last().pct_change().dropna()
        #self.Debug(sample.iloc[-10:,1])
        
        market_returns = SP500_history.resample('1W').last().pct_change().dropna()
    
        # Train Ordinary Least Squares linear model for each stock
        OLSmodels = {ticker: sm.OLS(sample[ticker], market_returns).fit() for ticker in sample.columns}
        
        Betas =  pd.DataFrame({ticker: model.params for ticker, model in OLSmodels.items()}).iloc[0,:]

        # Get the stocks far from mean (for mean reversion)
        Betas = Betas[Betas>=-0.10]
        
        # Lowest-beta decile for the long side, highest 5% of betas for the short side
        selected_long = Betas[Betas <= Betas.quantile(0.10)]
        
        selected_short = Betas[Betas >= Betas.quantile(0.95)]

        # Return the weights for each selected stock
        weights_long = selected_long * (1 / len(selected_long))/selected_long
        
        weights_short = selected_short* (-1 / len(selected_short))/selected_short
        
        return weights_long.sort_values() ,weights_short.sort_values() 


    def OnData(self, data):
        '''
        Rebalance every self.rebalance_days
        '''
        ### Do nothing until next rebalance
        if self.Time < self.nextRebalance:
            return

        ### Open positions
        for symbol, weight in self.weights_long.items():
            self.SetHoldings(symbol,0.40*weight)
            
        #for symbol, weight in self.weights_short.items():
        #    self.SetHoldings(symbol,0)
            
        self.SetHoldings('AGG', 0.60)

        ### Update next rebalance time
        self.nextRebalance = self.Time + timedelta(self.rebalance_days)


    def OnSecuritiesChanged(self, changes):
        '''
        Liquidate when the symbols are not in the universe
        '''
        for security in changes.RemovedSecurities:
            if security.Invested:
                self.Liquidate(security.Symbol, 'Removed from Universe')
                
class Oil(PythonData):
    def GetSource(self, config, date, isLiveMode):
         return SubscriptionDataSource("https://www.quandl.com/api/v3/datasets/OPEC/ORB.csv?order=asc", SubscriptionTransportMedium.RemoteFile)
    def Reader(self, config, line, date, isLiveMode):
        oil = Oil()
        oil.Symbol = config.Symbol
        if not (line.strip() and line[0].isdigit()): return None
        try:
            data = line.split(',')
            value = float(data[1])
            value = decimal.Decimal(value)
            if value == 0: return None
            oil.Time = datetime.strptime(data[0], "%Y-%m-%d")
            oil.Value = value
            oil["close"] = float(value)
            return oil
            
        except ValueError:
             return None

The code above is the attached algorithm.

Shile Wen


Hi Wawes,

I am unsure what you mean by the time series being shifted by one year, as the values seem to be correct for their dates. If you could elaborate on this, that would be greatly appreciated. Furthermore, QuantConnect has built-in support for Quandl data through the PythonQuandl class, which I've used in the attached backtest.

Best,
Shile Wen

Wawes23


import numpy as np
import pandas as pd
import statsmodels.api as sm

class Oilsensibiltiy(QCAlgorithm):

    def Initialize(self):
        
        self.SetStartDate( 2010 , 12, 7 )       # Set Start Date
        self.SetEndDate( 2020 , 10 , 5 )       # Set End Date
        
        self.SetCash(100000)                # Set Strategy Cash

        self.nextRebalance = self.Time      # Initialize next rebalance time
        self.rebalance_days = 30          # Rebalance every 30 days

        self.lookback = 252*3                  # Length(days) of historical data
        self.weights_long= pd.DataFrame()      # Pandas data frame (index: symbol) that stores the weight
        self.Portfolio.MarginModel = PatternDayTradingMarginModel()
        self.AGG = self.AddEquity("AGG", Resolution.Daily).Symbol

        self.UniverseSettings.Resolution = Resolution.Daily   # Use daily resolution for speed
        self.AddUniverse(self.CoarseSelection)
        self.oil = self.AddData(QuandlOil, 'OPEC/ORB', Resolution.Daily).Symbol
        self.selectedequity = 250
    

    def CoarseSelection(self, coarse):

        # Before next rebalance time, just remain the current universe
        if self.Time < self.nextRebalance:
            return Universe.Unchanged

        ### Simple coarse selection first

        # Sort the equities by DollarVolume in descending order
        selected = sorted([x for x in coarse if x.HasFundamentalData and x.Price > 5],
                          key=lambda x: x.DollarVolume, reverse=True)

        symbols = [x.Symbol for x in selected[: self.selectedequity ] ]

        # Get historical data of the selected symbols
        history = self.History(symbols, self.lookback, Resolution.Daily).close.unstack(level=0)
        
        self.Debug(history.index[0])

        # Get the crude oil price history
        crudeoil_history = self.History(QuandlOil, self.oil , self.lookback, Resolution.Daily).droplevel(level=0)
        
        crudeoil_history = crudeoil_history[~crudeoil_history.index.duplicated(keep='last')]
        
        self.Debug(crudeoil_history.index[0])

        self.weights_long = self.GetWeights(history,crudeoil_history)

        # If there is no final selected symbols, return the unchanged universe
        if self.weights_long.empty :
            return Universe.Unchanged
        
        #BTK =  (self.weights_long).append(self.weights_short)

        return [x for x in symbols if str(x) in self.weights_long ]#or self.weights_short.index]


    def GetWeights(self, history,crudeoil_history):
 
        # Weekly percentage returns of the equities that have a complete price history
        sample = history.dropna(axis=1).resample('1W').last().pct_change().dropna()
        
        crudeoil_history = crudeoil_history.resample('1W').last().pct_change().dropna()

        # Train Ordinary Least Squares linear model for each stock
        OLSmodels = {ticker: sm.OLS(sample[ticker], crudeoil_history).fit() for ticker in sample.columns}
        
        Betas =  pd.DataFrame({ticker: model.params for ticker, model in OLSmodels.items()}).iloc[0,:]
        
        # We want stocks with little sensitivity to oil, so rank by absolute beta
        Betas = abs(Betas)
        
        selected_long = Betas[Betas <= Betas.quantile(0.10) ].drop(columns = self.oil)
        
        #selected_short = Betas[Betas >= Betas.quantile(0.95) ].drop(columns = self.SPY)

        # Return the weights for each selected stock
        weights_long = selected_long * (1 / len(selected_long))/selected_long
        
        #weights_short = selected_short* (-1 / len(selected_short))/selected_short
        
        return weights_long.sort_values() #,weights_short.sort_values() 


    def OnData(self, data):
        
        ### Do nothing until next rebalance
        if self.Time < self.nextRebalance:
            return

        ### Open positions
        for symbol, weight in self.weights_long.items():
            self.SetHoldings(symbol,0.40*weight)
            
        #for symbol, weight in self.weights_short.items():
        #    self.SetHoldings(symbol,0)
            
        self.SetHoldings('AGG', 0.60)

        ### Update next rebalance time
        self.nextRebalance = self.Time + timedelta(self.rebalance_days)

    def OnSecuritiesChanged(self, changes):
        '''
        Liquidate when the symbols are not in the universe
        '''
        for security in changes.RemovedSecurities:
            if security.Invested:
                self.Liquidate(security.Symbol, 'Removed from Universe')
                
class QuandlOil(PythonQuandl):
    def __init__(self):
        self.ValueColumnName = 'Value'

Thank you for your answer, but this time there is a discrepancy between the time frames of the crude oil and the equities.

For example, with the code above:

The first crude oil entry is dated 2007-12-07 00:00:00, while the first entry for the equities is dated 2008-11-12 00:00:00.

Laurent Crouzet


The issue is probably that your data does not all come from the same data provider, so the "history" lookback is not handled the same way for each source.

You could try the following method:
self.lookbackEquities = 252*3 # Length(days) of historical data for Equities
self.lookbackOil = 365*3 + 1 # Length(days) of historical data for Oil (3 years + 1 day for leap year)

Then, wherever your code calls History with self.lookback, pass self.lookbackEquities for the equities and self.lookbackOil for the oil instead.

That far-from-perfect solution should work for a few years of backtesting, until a leap year or a difference in the number of bank holidays brings back the same error: "Runtime Error: ValueError : The indices for endog and exog are not aligned"
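One possible way to sidestep that error, however each History call counts its lookback, is to trim both return series to the dates they share before fitting the regression. The following is only a sketch on synthetic prices: the helper name align_weekly_returns and the sample data are hypothetical, not part of the algorithm above, and it assumes both inputs carry a DatetimeIndex.

```python
import pandas as pd

def align_weekly_returns(equity_prices: pd.DataFrame, oil_prices: pd.Series):
    """Return weekly returns of both inputs, restricted to their shared dates."""
    equity_returns = equity_prices.resample("1W").last().pct_change().dropna()
    oil_returns = oil_prices.resample("1W").last().pct_change().dropna()
    # Keep only the weekly timestamps present in both series
    shared = equity_returns.index.intersection(oil_returns.index)
    return equity_returns.loc[shared], oil_returns.loc[shared]

# Synthetic data: the oil series starts about 11 months before the equities,
# mirroring the discrepancy reported in this thread
equity_idx = pd.date_range("2008-11-12", "2010-12-07", freq="B")
oil_idx = pd.date_range("2007-12-07", "2010-12-07", freq="B")
equity_prices = pd.DataFrame(
    {"AAA": [100.0 + i for i in range(len(equity_idx))]}, index=equity_idx
)
oil_prices = pd.Series([50.0 + i for i in range(len(oil_idx))], index=oil_idx)

equity_returns, oil_returns = align_weekly_returns(equity_prices, oil_prices)
```

In GetWeights, the two aligned outputs would take the place of sample and crudeoil_history before the call to sm.OLS, so the endog and exog indices match by construction.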

Maybe you could handle this much better either by writing a function that covers all the edge cases, or by working in reverse: request the history for the equities, check the first date of the data received, then set self.lookbackOil to the difference between the current date and that oldest date, so that both data sources always cover the same dates.
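The reverse method could be sketched roughly as below. This is a minimal illustration, not tested against the platform: the helper name calendar_lookback_days is hypothetical, and it assumes, as suggested above, that the oil lookback is counted in calendar days. The dates are the ones reported in this thread; inside the algorithm the first date would come from history.index[0] and the current date from self.Time.

```python
from datetime import datetime

def calendar_lookback_days(first_equity_bar: datetime, now: datetime) -> int:
    """Calendar days between the oldest equity bar received and the current time."""
    if first_equity_bar > now:
        raise ValueError("first_equity_bar must not be later than now")
    return (now.date() - first_equity_bar.date()).days

# Dates taken from the thread: oldest equity bar and the backtest start date
first_equity_bar = datetime(2008, 11, 12)
now = datetime(2010, 12, 7)

# Value that would replace a fixed self.lookbackOil on each rebalance
oil_lookback = calendar_lookback_days(first_equity_bar, now)
```

Recomputing the value at every rebalance avoids hard-coding a leap-year or holiday correction, since the oil request is always derived from what the equity request actually returned.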

Hope this helps!
