Abstract
This post implements a strategy, based on a paper by Boyer, Mitton and Vorkink (2009, hereafter BMV) published in The Review of Financial Studies, that trades stocks with low expected idiosyncratic skewness. Our implementation narrows the initial universe down to liquid assets by selecting 200 stocks based on daily trading volume, price, and whether the stock has fundamental data in our data library. At the end of each month, we calculate the expected idiosyncratic skewness of every stock and sort the universe by it. The implementation goes long the bottom 5%, holds for the next month, and rebalances the portfolio monthly. The strategy's Sharpe ratio is 1.03, compared with 1.00 for the S&P 500 (SPY), over the period of July 1, 2009 to July 30, 2019.
Data Description
To execute our algorithm, we use daily data from Kenneth French's Data Library that captures the Fama-French three factors for the period July 1, 2009 to June 30, 2019. The raw data is delivered in a zip file, which is not directly importable into LEAN, so we need to unzip the file and upload the CSV to a GitHub repository. All other data used for this algorithm, including stock price, volume, and market capitalization, are from QuantConnect's Data Library. In the original paper, BMV also include firm-specific variables such as momentum, turnover, and dummy variables for properties including Nasdaq listing, small size, medium size, and industry. We can refer to this online technical appendix for descriptions of these variables.
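As a rough illustration of the data preparation step, the sketch below parses a factor CSV with pandas. It assumes the file has already been cleaned so that it has a `date` column in YYYYMMDD form and factor columns named `Mkt-RF`, `SMB`, `HML`, and `RF`, with values quoted in percent. The column names, the `load_factors` helper, and the sample rows are all assumptions made for this example, and the numbers are made up rather than real factor data.

```python
import io

import pandas as pd

# Illustrative rows only: these values are made up, not real factor data.
SAMPLE_CSV = """date,Mkt-RF,SMB,HML,RF
20090701,0.45,0.10,-0.20,0.001
20090702,-2.80,-0.35,-0.60,0.001
20090706,0.05,-0.90,-0.15,0.001
"""


def load_factors(csv_text):
    """Parse a daily factor CSV into decimal returns indexed by date.

    Hypothetical helper: assumes a 'date' column in YYYYMMDD format and
    factor values expressed in percent.
    """
    factors = pd.read_csv(io.StringIO(csv_text), dtype={"date": str})
    factors["date"] = pd.to_datetime(factors["date"], format="%Y%m%d")
    factors = factors.set_index("date").sort_index()
    return factors / 100.0  # convert percent to decimal returns


factors = load_factors(SAMPLE_CSV)
print(factors)
```

Converting to decimals up front keeps the factors on the same scale as daily stock returns computed from prices.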
Method
We can develop a model of expected idiosyncratic skewness using the Fama-French three factors. Lower expected idiosyncratic skewness should predict higher alpha. We set the investment horizon, over which investors hope to experience an extremely positive outcome, to 1 month.
Step 1: Getting Fama-French three-factor regression residuals
We regress each stock's excess return on the Fama-French three factors: (1) market risk, (2) the outperformance of small versus big companies, and (3) the outperformance of high book/market versus low book/market companies. The regression coefficients are estimated using daily data for the trading days in the current month. We obtain the regression residuals for each stock on each trading day.
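A minimal sketch of this regression for a single stock and a single month, using ordinary least squares from NumPy. The function name `ff3_residuals` and the synthetic inputs are assumptions for illustration; this is not the code from the attached backtest.

```python
import numpy as np


def ff3_residuals(excess_returns, factors):
    """OLS of one stock's daily excess returns on the three factors.

    excess_returns: array of shape (n_days,)
    factors: array of shape (n_days, 3), columns [Mkt-RF, SMB, HML]
    Returns (coef, residuals), where coef = [alpha, b_mkt, b_smb, b_hml].
    """
    # Prepend a column of ones so the regression includes an intercept.
    X = np.column_stack([np.ones(len(excess_returns)), factors])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    residuals = excess_returns - X @ coef
    return coef, residuals


# Synthetic data standing in for one month (21 trading days) of one stock.
rng = np.random.default_rng(0)
n_days = 21
factors = rng.normal(0.0, 0.01, size=(n_days, 3))
true_beta = np.array([1.1, 0.3, -0.2])
noise = rng.normal(0.0, 0.005, size=n_days)
excess_returns = 0.0002 + factors @ true_beta + noise

coef, residuals = ff3_residuals(excess_returns, factors)
print("coefficients:", coef)
```

Because the regression includes an intercept, the residuals average to zero within the month and are uncorrelated with each factor in sample.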
Step 2: Estimating historical idiosyncratic moments
We calculate the historical estimates of idiosyncratic volatility and skewness as the monthly sample volatility and skewness of daily regression residuals for each stock.
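One way to compute these two moments from a month of residuals is sketched below. It assumes volatility is the root mean square of the residuals and skewness is the mean cubed residual scaled by the cube of that volatility. The residuals are not demeaned again, since residuals from a regression with an intercept already average to zero. This convention and the `idiosyncratic_moments` helper are assumptions for this example, and the attached backtest may use a different estimator.

```python
import numpy as np


def idiosyncratic_moments(residuals):
    """Return (volatility, skewness) of one stock's daily residuals.

    Assumed convention: moments are taken around zero.
    """
    residuals = np.asarray(residuals, dtype=float)
    iv = np.sqrt(np.mean(residuals ** 2))    # idiosyncratic volatility
    isk = np.mean(residuals ** 3) / iv ** 3  # idiosyncratic skewness
    return iv, isk


# A symmetric sample has zero skewness; one large gain gives positive skewness.
symmetric = np.array([-0.02, -0.01, 0.0, 0.01, 0.02])
right_tail = np.array([-0.01, -0.01, -0.01, -0.01, 0.04])

iv_sym, isk_sym = idiosyncratic_moments(symmetric)
iv_tail, isk_tail = idiosyncratic_moments(right_tail)
print(iv_sym, isk_sym)
print(iv_tail, isk_tail)
```

The second sample shows the lottery-like pattern the strategy avoids: many small losses and one large gain produce a positive skewness.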
Step 3: Estimating expected idiosyncratic skewness
We need measures of expected skewness over a horizon of 1 month for firm i at the end of month t, rather than the measures of historical skewness calculated above. To model investor perceptions of expected skewness in a feasible manner, we first estimate a cross-sectional regression of current idiosyncratic skewness on idiosyncratic skewness and volatility from the previous month. We then apply the estimated regression parameters to the information observable at the end of month t to estimate expected skewness for each firm.
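The two-stage procedure can be sketched as follows: fit the cross-sectional regression with one row per stock, then plug the current month's moments into the fitted equation. The function `expected_skewness` and the synthetic inputs are assumptions for illustration. The sample is generated without noise so that the recovered parameters are easy to check.

```python
import numpy as np


def expected_skewness(skew_prev, vol_prev, skew_curr, vol_curr):
    """Cross-sectional forecast of next month's idiosyncratic skewness.

    Fit:     skew_curr = b0 + b1 * skew_prev + b2 * vol_prev   (one row per stock)
    Predict: expected  = b0 + b1 * skew_curr + b2 * vol_curr
    Returns (params, expected), where params = [b0, b1, b2].
    """
    X_fit = np.column_stack([np.ones(len(skew_prev)), skew_prev, vol_prev])
    params, *_ = np.linalg.lstsq(X_fit, skew_curr, rcond=None)
    X_now = np.column_stack([np.ones(len(skew_curr)), skew_curr, vol_curr])
    return params, X_now @ params


# Synthetic cross-section of 200 stocks, matching the universe size.
rng = np.random.default_rng(1)
n_stocks = 200
skew_prev = rng.normal(0.0, 1.0, size=n_stocks)
vol_prev = rng.uniform(0.01, 0.05, size=n_stocks)
skew_curr = 0.1 + 0.2 * skew_prev + 3.0 * vol_prev  # noiseless by construction
vol_curr = rng.uniform(0.01, 0.05, size=n_stocks)

params, expected = expected_skewness(skew_prev, vol_prev, skew_curr, vol_curr)
print("estimated parameters:", params)
```

Each stock contributes one observation, so a universe of 200 stocks gives 200 rows for the fit even though only one pair of months is used.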
Step 4: Generating trading signals
At the end of each month, we sort stocks by the expected idiosyncratic skewness estimated in Step 3. We select the 5% of stocks with the lowest expected skewness and go long them in a value-weighted portfolio.
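A small sketch of the selection and weighting step with pandas, assuming value weighting means weights proportional to market capitalization. The helper `low_skew_weights`, the tickers, and the numbers are hypothetical.

```python
import pandas as pd


def low_skew_weights(expected_skew, market_cap, fraction=0.05):
    """Pick the lowest-skewness stocks and weight them by market cap.

    expected_skew, market_cap: Series indexed by ticker.
    Returns a Series of portfolio weights that sum to 1.
    """
    n_select = max(1, round(len(expected_skew) * fraction))
    selected = expected_skew.nsmallest(n_select).index
    caps = market_cap.loc[selected]
    return caps / caps.sum()


# Hypothetical universe of 40 tickers, so the bottom 5% is two stocks.
tickers = [f"S{i:02d}" for i in range(40)]
expected_skew = pd.Series([0.1 * i - 1.0 for i in range(40)], index=tickers)
market_cap = pd.Series(200.0, index=tickers)
market_cap["S00"] = 300.0
market_cap["S01"] = 100.0

weights = low_skew_weights(expected_skew, market_cap)
print(weights)
```

With the 200-stock universe used here, the same rule selects 10 stocks each month.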
Conclusion and Future Work
Before the BMV paper was published in 2009, a number of theories on the pricing premium for stocks with idiosyncratic skewness existed, but lacked supporting empirical evidence of the relationship between idiosyncratic skewness and returns. BMV fills this void by estimating a model of predicted skewness and using predicted skewness to explain the cross-section of returns. The paper finds that lagged idiosyncratic volatility is a stronger predictor of skewness than lagged idiosyncratic skewness.
In this implementation, we rely on idiosyncratic volatility and skewness to predict idiosyncratic skewness. Interested users can build from this implementation by trying the following extensions:
- Including a number of firm-specific variables to improve predictive power for expected idiosyncratic skewness;
- Using different investment horizons such as 3 months, 6 months, 1 year;
- Adding more lags in the time-series regression for both expected and historical idiosyncratic skewness.
Note: For additional information, please check out this tutorial page. Feel free to leave any questions or suggestions about our implementation here. Also, try out the extensions! We'd be happy to hear that you improved the strategy's Sharpe ratio!
Michael Boguslaw
When I try to run this code, I get the following error:
Runtime Error: ValueError : 'time' is both an index level and a column label, which is ambiguous.
at CoarseSelectionAndSkewnessSorting in main.py:line 62
:: symbol_and_skew = self.CalculateExpectedSkewness(high_volume_stocks)
at CalculateExpectedSkewness in main.py:line 125
ValueError : 'time' is both an index level and a column label, which is ambiguous.
Rahul Chowdhury
Hey Michael,
We can resolve this issue by renaming the index levels so that they are different from the column labels. Before we merge, let's rename 'time' to 'Time':
daily_returns.index.names = ['Time']
self.fama_french_factors_per_day.index.names = ['Time']
daily_returns = daily_returns.merge(self.fama_french_factors_per_day, left_on = 'Time', right_on = 'Time')
Best
Rahul
Phil Maier
Hey guys,
I was wondering, since we regress idiosyncratic skewness on lagged idiosyncratic skewness and lagged idiosyncratic volatility from last month, how can this regression actually produce intercepts or coefficients? We only have one observation in our specification.
Derek Melchin
Hi Phil,
When building the `X` DataFrame to fit the regression model, there is a row for each symbol in the universe for the given month. Since the algorithm limits the universe size to 200, we have 200 observations in the DataFrame. See the attached backtest's logs for reference.
Best,
Derek Melchin
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Xin Wei