Stock Selection Strategy Based On Fundamental Factors

Abstract

In recent years, factor investing has gained significant popularity among global institutional investors. In this tutorial, we first develop a factor selection model to test whether factors can differentiate potential winners and losers in the stock market. We then use the preselected factors to implement the factor-ranking stock selection algorithm based on Factor Based Stock Selection Model for Turkish Equities, 2015, Ayhan Yüksel.

Factor Selection

QuantConnect provides Morningstar fundamentals data for US Equities. Valuation ratios are daily data. Other data, such as operation ratios and financial statements, are available for multiple periods depending on the property. To view the fundamental factors that are available, see Data Point Attributes.

The algorithm is designed to test the significance of one factor at a time.

def initialize(self):
    self.set_start_date(2005, 1, 1)   # set start date
    self.set_end_date(2012, 3, 1)     # set end date
    self.set_cash(50000)              # set strategy cash
    self.universe_settings.resolution = Resolution.DAILY
    self.add_universe(self.coarse_selection_function, self.fine_selection_function)
    self.add_equity("SPY")  # add benchmark
    self.num_of_course_symbols = 200
    self.num_of_portfolio = 5
    self._changes = None
    self.flag1 = 1  # controls the monthly rebalance of the coarse and fine selection functions
    self.flag2 = 0  # controls the monthly rebalance of the OnData function
    self.flag3 = 0  # counts the number of rebalances
    # store the monthly returns of the different portfolios in a dataframe
    self.df_return = pd.DataFrame(index=range(self.num_of_portfolio + 1))
    # schedule an event to fire on the first trading day of each month for SPY
    self.schedule.on(self.date_rules.month_start("SPY"), self.time_rules.after_market_open("SPY"), Action(self.rebalancing))

Step 1: Ranking the stocks by factor values

First, we sort the stocks by daily dollar volume and take the stocks with the highest dollar volumes as our candidates. The universe selection API provides a convenient way to do this. Universes are refreshed every day by default. Here we use the Scheduled Events API to trigger code on the first trading day of each month, and three flag variables to control the rebalancing of the CoarseSelection, FineSelection and OnData functions.

Coarse universe selection is the built-in universe data provided by QuantConnect, which lets you perform rough filtering on a universe of over 16,000 symbols before your algorithm runs. Because the coarse selection function receives all Equities, including ETFs, which have no fundamental data, we use the property x.has_fundamental_data to exclude them from our candidate pool.

# sort the data by daily dollar volume and take the top entries
def coarse_selection_function(self, coarse):
    if self.flag1:
        coarse_with_fundamental = [x for x in coarse if x.has_fundamental_data]
        sorted_by_volume = sorted(coarse_with_fundamental, key=lambda x: x.dollar_volume, reverse=True)
        top = sorted_by_volume[:self.num_of_course_symbols]
        return [i.symbol for i in top]
    else:
        return []
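
The filtering logic above can be tried outside the platform. The following is a minimal sketch, not QuantConnect code: `CoarseRow` and `top_by_dollar_volume` are hypothetical names, and the rows are made-up data that only mimic the `dollar_volume` and `has_fundamental_data` properties used in the tutorial.

```python
from dataclasses import dataclass


@dataclass
class CoarseRow:
    """Hypothetical stand-in for one coarse-universe entry."""
    symbol: str
    dollar_volume: float
    has_fundamental_data: bool


def top_by_dollar_volume(rows, count):
    """Keep rows with fundamental data, then take the `count` most liquid."""
    with_fundamentals = [r for r in rows if r.has_fundamental_data]
    ranked = sorted(with_fundamentals, key=lambda r: r.dollar_volume, reverse=True)
    return [r.symbol for r in ranked[:count]]


rows = [
    CoarseRow("AAA", 5.0e8, True),
    CoarseRow("ETF1", 9.0e8, False),  # most liquid, but no fundamental data
    CoarseRow("BBB", 7.0e8, True),
    CoarseRow("CCC", 1.0e8, True),
]
selected = top_by_dollar_volume(rows, 2)
print(selected)  # ['BBB', 'AAA']
```

Note that the ETF-like row is dropped even though it has the highest dollar volume, because the fundamental-data filter runs before the sort.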

We extract the factor values of the candidate stocks at the beginning of each month and sort the stocks in descending order of their factor values. Here we use the 12-month total risk-based capital data

x.financial_statements.total_risk_based_capital.twelve_months

as an example. It is the sum of Tier 1 and Tier 2 capital. x.symbol.value gives the string symbol of a selected stock x. We then save the sorted symbols as self.symbol.

def fine_selection_function(self, fine):
    if self.flag1:
        self.flag1 = 0
        self.flag2 = 1
        # filter the fine by deleting equities with zero factor value
        filtered_fine = [x for x in fine if x.financial_statements.total_risk_based_capital.twelve_months != 0 ]
        # sort the fine by reverse order of factor value
        sorted_fine = sorted(filtered_fine, key=lambda x: x.financial_statements.total_risk_based_capital.twelve_months, reverse=True)
        self.symbol = [str(x.symbol.value) for x in sorted_fine]
        # factor_value = [x.valuation_ratios.pe_ratio for x in sorted_fine]
        self.flag3 = self.flag3 + 1
        return []
    else:
        return []
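
As an illustration of the filter-then-sort step, here is a small sketch on a plain dictionary of factor values. The function name `rank_by_factor` and the data are hypothetical. Dropping missing (`None` or NaN) values is an added assumption; the tutorial code only removes zeros.

```python
import math


def rank_by_factor(factor_by_symbol, descending=True):
    """Drop zero or missing factor values, then order symbols by factor value."""
    usable = {
        symbol: value
        for symbol, value in factor_by_symbol.items()
        # assumption: missing values are dropped along with zeros
        if value is not None and not math.isnan(value) and value != 0
    }
    return sorted(usable, key=usable.get, reverse=descending)


factors = {"AAA": 3.2, "BBB": 0.0, "CCC": 7.5, "DDD": float("nan"), "EEE": -1.0}
ranked = rank_by_factor(factors)
print(ranked)  # ['CCC', 'AAA', 'EEE']
```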

Step 2: Compute the monthly return of portfolios

At the end of each month, we extract the one-month history of close prices for each stock and compute the monthly returns.

sorted_symbol = self.symbol
self.add_equity("SPY") # add benchmark
for x in sorted_symbol:
    self.add_equity(x)
history = self.history(20,Resolution.DAILY)
monthly_return =[]
new_symbol_list =[]
for j in range(len(sorted_symbol)):
    try:
        daily_price = []
        for slice in history:
            bar = slice[sorted_symbol[j]]
            daily_price.append(float(bar.close))
        new_symbol_list.append(sorted_symbol[j])
        monthly_return.append(daily_price[-1] / daily_price[0] - 1)
    except:
        self.log("No history data for " + str(sorted_symbol[j]))
        del daily_price
# the length of monthly_return list should be divisible by the number of portfolios
monthly_return = monthly_return[:int(math.floor(len(monthly_return) / self.num_of_portfolio) * self.num_of_portfolio)]
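
The return used here is the last close in the window divided by the first close, minus one. A minimal pandas sketch of the same calculation follows. It assumes a hypothetical layout with one column of close prices per symbol, and it drops symbols with gaps in their history, which is a simplification of the try/except handling in the tutorial code.

```python
import pandas as pd


def window_returns(closes: pd.DataFrame) -> pd.Series:
    """Return last/first - 1 for each column, skipping columns with gaps."""
    complete = closes.dropna(axis=1)  # drop symbols with missing prices
    return complete.iloc[-1] / complete.iloc[0] - 1


# Made-up close prices, one column per symbol
closes = pd.DataFrame({
    "AAA": [100.0, 102.0, 110.0],
    "BBB": [50.0, 49.0, 45.0],
    "CCC": [20.0, None, 21.0],  # incomplete history, dropped
})
returns = window_returns(closes)
print(returns.round(4).to_dict())
```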

We divide the stocks into 5 portfolios and compute the average monthly return of each portfolio. Then we add the monthly return of the benchmark "SPY" as the last row of the data frame df_return.

reshape_return = np.reshape(monthly_return, (self.num_of_portfolio, len(monthly_return) // self.num_of_portfolio))
# calculate the average return of different portfolios
port_avg_return = np.mean(reshape_return,axis=1).tolist()
# add return of "SPY" as the benchmark  to the end of the return list
benchmark_syl = self.add_equity("SPY").symbol
history_benchmark = self.history(20,Resolution.DAILY)
benchmark_daily_price = [float(slice[benchmark_syl].close) for slice in history_benchmark]
benchmark_monthly_return = (benchmark_daily_price[-1]/benchmark_daily_price[0]) - 1
port_avg_return.append(benchmark_monthly_return)
self.df_return[str(self.flag3)] = port_avg_return
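
The grouping step can be checked on a small example. This sketch assumes the returns are already ordered by factor rank; `portfolio_average_returns` is a hypothetical helper, and the numbers are invented. Trailing entries that do not fill a complete group are discarded, mirroring the truncation in the tutorial code.

```python
import numpy as np


def portfolio_average_returns(sorted_returns, num_portfolios):
    """Split rank-ordered returns into equal groups and average each group."""
    group_size = len(sorted_returns) // num_portfolios
    # keep only as many entries as fill complete groups
    usable = np.asarray(sorted_returns[: group_size * num_portfolios], dtype=float)
    return usable.reshape(num_portfolios, group_size).mean(axis=1)


returns = [0.05, 0.03, 0.02, 0.00, -0.01, -0.03, 0.10]  # last value is dropped
averages = portfolio_average_returns(returns, 3)
print(averages)
```

With seven returns and three portfolios, each group holds two stocks, so the averages are taken over the pairs (0.05, 0.03), (0.02, 0.00) and (-0.01, -0.03).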

Step 3: Generate the metrics to test the factor significance

After getting the monthly returns of portfolios and the benchmark, we compute the average annual return and excess return over benchmark of each portfolio across the whole backtesting period. Then we generate three metrics to judge the significance of each factor.

The first metric is the correlation between the portfolios' returns and their ranks. The absolute value of the correlation coefficient should be larger than 0.8.
If the return of the first-ranked portfolio is larger than that of the portfolio at the bottom of the ranking, we call the former the win portfolio and the latter the loss portfolio; otherwise the roles are reversed. The win probability is the probability that the win portfolio's return outperforms the benchmark return. The loss probability is the probability that the loss portfolio's return underperforms the benchmark. If the factor is significant, both the win and the loss probability should be greater than 0.4.
The excess return of the win portfolio should be greater than 0.25, while the excess return of the loss portfolio should be lower than 0.05.

def calculate_criteria(self,df_port_return):
    total_return = (df_port_return + 1).T.cumprod().iloc[-1,:] - 1
    annual_return = (total_return+1)**(1./6)-1
    excess_return = annual_return - np.array(annual_return)[-1]
    correlation = annual_return[0:5].corr(pd.Series([5,4,3,2,1],index = annual_return[0:5].index))
    # higher factor with higher return
    if np.array(total_return)[0] > np.array(total_return)[-2]:
        loss_excess = df_port_return.iloc[-2,:] - df_port_return.iloc[-1,:]
        win_excess = df_port_return.iloc[0,:] - df_port_return.iloc[-1,:]
        loss_prob = loss_excess[loss_excess<0].count()/float(len(loss_excess))
        win_prob = win_excess[win_excess>0].count()/float(len(win_excess))
        win_port_excess_return = np.array(excess_return)[0]
        loss_port_excess_return = np.array(excess_return)[-2]
    # higher factor with lower return
    else:
        loss_excess = df_port_return.iloc[0,:] - df_port_return.iloc[-1,:]
        win_excess = df_port_return.iloc[-2,:] - df_port_return.iloc[-1,:]
        loss_prob = loss_excess[loss_excess<0].count()/float(len(loss_excess))
        win_prob = win_excess[win_excess>0].count()/float(len(win_excess))
        win_port_excess_return = np.array(excess_return)[-2]
        loss_port_excess_return = np.array(excess_return)[0]
    test_result = {}
    test_result["correlation"]=correlation
    test_result["win probability"]=win_prob
    test_result["loss probability"]=loss_prob
    test_result["win portfolio excess return"]=win_port_excess_return
    test_result["loss portfolio excess return"]=loss_port_excess_return
    return test_result
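
The three criteria described in this step can be expressed as a single check. The following is a sketch under the thresholds stated in the text (0.8, 0.4, 0.25 and 0.05). The function name `passes_significance_test` and the input values are hypothetical and are not results from the backtest.

```python
def passes_significance_test(correlation, win_prob, loss_prob,
                             win_excess, loss_excess):
    """Apply the three criteria from the text to one factor's metrics."""
    strong_rank_relation = abs(correlation) > 0.8
    reliable_probabilities = win_prob > 0.4 and loss_prob > 0.4
    separated_excess_returns = win_excess > 0.25 and loss_excess < 0.05
    return (strong_rank_relation
            and reliable_probabilities
            and separated_excess_returns)


# Made-up metric values for illustration only
example_pass = passes_significance_test(-0.90, 0.65, 0.45, 0.30, 0.02)
example_fail = passes_significance_test(0.50, 0.65, 0.45, 0.30, 0.02)
print(example_pass, example_fail)  # True False
```

The second call fails only because the absolute correlation is below 0.8; the sign of the correlation does not matter for the test, only its magnitude.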

The following table shows the factor significance test results:

Factor	FCFYield	BuyBackYield	PriceChange1M	TrailingDividendYield	EVToEBITDA	RevenueGrowth	BookValuePerShare
Correlation	-0.936	-0.987	0.918	-0.981	0.939	0.89	-0.92
Win Probability	0.630	0.639	1	0.667	0.722	0.69	0.69
Loss Probability	0.426	0.472	1	0.518	0.472	0.42	0.40
Excess Return (Win)	0.324	0.212	0.303	0.225	0.414	0.23	0.27
Excess Return (Loss)	0.060	0.037	-1.67	0.043	0.042	0.07	0.06

We choose 4 factors: FCFYield, PriceChange1M, BookValuePerShare and RevenueGrowth.

Stock Selection

Next we will select the stocks.

Step 1: Rank the stocks by factor values

First, we remove the stocks that have no fundamental data or a zero factor value. For each preselected factor, we rank the stocks by their factor values. The order is descending if the factor correlation is negative and ascending if it is positive.

Step 2: Calculate equally weighted composite factor scores

The second step uses the selected factor variables to calculate an equally weighted composite factor score for each stock.

First, according to the factor order, we place our universe of stocks into 5 distinct quintile portfolios, named P1, P2, P3, P4 and P5. The ranking of the portfolios reflects the preference of the factor model: the first portfolio (P1) holds the “most preferred” stocks, while the fifth (P5) holds the “least preferred” stocks. Suppose there are $n$ stocks in total. The stocks that fall into the first-ranked portfolio get score $p$, the stocks that fall into the second-ranked portfolio get score $p-1$, and so on, where $p$ is the highest score (presumably the number of portfolios, 5 here). This gives a score for every stock. We repeat the same calculation for each factor.
Second, we calculate a “Composite Factor Score” by combining the scores of the selected factors with an equal weighting scheme. This gives a composite factor score for each stock.
Third, we rank the stocks in our universe according to their Composite Factor Scores and choose the 20 highest-ranked stocks to construct our portfolio at the beginning of each month.
At the end of each month, we repeat the above steps to construct the new portfolio and adjust the holding stocks.
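
The scoring steps above can be sketched with pandas. This is an illustration under stated assumptions, not the tutorial's implementation: the top score is taken to be the number of portfolios (5), buckets are formed with `pd.qcut` on ranks, and the names `quintile_scores` and `select_top`, the factor names and the data are all hypothetical.

```python
import pandas as pd


def quintile_scores(factor_values, higher_is_better, groups=5):
    """Score each stock from `groups` (most preferred) down to 1."""
    # rank 1 is the most preferred stock for this factor
    ranks = factor_values.rank(ascending=not higher_is_better, method="first")
    # equal-sized buckets labelled 0 (most preferred) to groups - 1
    buckets = pd.qcut(ranks, groups, labels=False)
    return groups - buckets


def select_top(factors, higher_is_better, count):
    """Average the per-factor scores with equal weights and pick the top stocks."""
    scores = pd.DataFrame({
        name: quintile_scores(factors[name], higher_is_better[name])
        for name in factors.columns
    })
    composite = scores.mean(axis=1)  # equal weighting across factors
    ordered = composite.sort_values(ascending=False, kind="mergesort")
    return ordered.head(count).index.tolist()


# Made-up factor values for ten stocks
factors = pd.DataFrame(
    {
        "value": [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
        "risk": [5, 1, 2, 9, 3, 10, 4, 6, 7, 8],
    },
    index=[f"S{i}" for i in range(10)],
)
picked = select_top(factors, {"value": True, "risk": False}, 3)
print(picked)  # ['S1', 'S2', 'S0']
```

In this example S1 sits in the top bucket for both factors (composite score 5), S2 scores 4 and 5 (composite 4.5), and S0 scores 5 and 3 (composite 4), so they are the three selected stocks. In the tutorial's setting the `count` would be 20 and the direction flag for each factor would follow the sign of its correlation.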

Reference

Ayhan Yüksel, Factor Based Stock Selection Model for Turkish Equities, 2015.
