How do I get a percentile of dollarvolume universe?

LukeI | October 2017

I'm looking to get a result similar to this Quantopian screen:

pipe.set_screen(AverageDollarVolume(window_length=3).percentile_between(85, 100))

I found this in the LEAN documentation: https://www.quantconnect.com/lean/documentation/topic98.html, which seems to be what I want, but I don't know where to implement it in the coarse universe function.

Currently my coarse universe just has:

DollarVolume = sorted(selected, key=lambda x: x.DollarVolume, reverse=True)
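For reference, once the list is sorted by dollar volume in descending order, a rank-based percentile band is just a slice of it. The sketch below assumes a hypothetical `Coarse` tuple standing in for LEAN's coarse objects and a hypothetical helper named `percentile_between`. It works on ranks rather than interpolated values, so results near the boundaries may not match Quantopian's exactly.

```python
from collections import namedtuple

# Hypothetical stand-in for a coarse universe entry (not the LEAN class).
Coarse = namedtuple("Coarse", ["Symbol", "DollarVolume"])

def percentile_between(items, low, high):
    """Return items whose dollar-volume rank lies between low and high percent.

    Rank-based: the item with the largest dollar volume sits at the 100th percentile.
    """
    ranked = sorted(items, key=lambda x: x.DollarVolume, reverse=True)
    n = len(ranked)
    # In descending order the top (100 - high)% ends at `start`
    # and the top (100 - low)% ends at `stop`.
    start = n * (100 - high) // 100
    stop = n * (100 - low) // 100
    return ranked[start:stop]

# 20 made-up stocks with dollar volumes 1.0 .. 20.0
universe = [Coarse("S{}".format(i), float(i)) for i in range(1, 21)]
top = percentile_between(universe, 85, 100)
print([c.Symbol for c in top])
```

Inside a coarse selection function, the same slice could be applied to the sorted list and the symbols of the remaining entries returned.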

HanByul P INVESTOR

October 2017

LukeI, please take a look at the code in Yan Xiaowei's backtest. See here. His backtest generates error messages, but you can at least look at it to see how to filter your selection by some factors using CoarseSelectionFunction. I hope this helps. Thanks :)

Dan Whitnable INVESTOR

October 2017

I'm a fan of pandas and dataframes, so here's another approach. On Quantopian this was always fast and easy. There's some debate about that here at QC: being C# under the hood may make the dataframe approach slower. Anyway, here is my implementation:

# Use pandas methods to select the assets we want
# First find the values of dollar_volume at our desired bounds
lower_percent = data_df.dollar_volume.quantile(.85)
upper_percent = data_df.dollar_volume.quantile(1.0)
        
# Now simply query using those values
# Filter for has_fundamentals to remove ETFs
my_universe = (data_df.query('has_fundamentals & (dollar_volume >= @lower_percent) & (dollar_volume <= @upper_percent)'))

It's just one line of code once you have the upper and lower cut-off values. See the attached backtest, and check the logs to see how many stocks it filters and what the cut-offs are.
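To make the cut-offs concrete: pandas' `quantile` interpolates linearly between neighbouring values by default. The pure-Python sketch below mirrors that rule on made-up numbers. `linear_quantile` is a hypothetical helper written for illustration, not part of pandas or LEAN.

```python
import math

def linear_quantile(values, q):
    """Quantile with linear interpolation (the default rule in pandas)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    data = sorted(values)
    pos = (len(data) - 1) * q   # fractional index into the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo             # how far pos lies between the two neighbours
    return data[lo] + (data[hi] - data[lo]) * frac

dollar_volume = [10.0, 20.0, 30.0, 40.0, 50.0]  # made-up values
lower = linear_quantile(dollar_volume, 0.85)
upper = linear_quantile(dollar_volume, 1.0)

# Keep only the values inside the [lower, upper] band.
selected = [v for v in dollar_volume if lower <= v <= upper]
print(lower, upper, selected)
```

With `q = 1.0` the upper bound is simply the maximum, so the upper test only matters when a band below the top is wanted.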

One issue I have with QC is the data. I'm not entirely convinced it's reliable. For example, in the attached backtest I use a 'has_fundamentals' check to exclude ETFs. From what I see it does exclude ETFs, but in this case it also excludes AAPL. Ouch. I haven't looked closely into the issue, but beware. (You can check this by deleting the 'has_fundamentals' check from the query: AAPL magically appears in the results. BTW, it also takes a lot longer, because 900+ stocks are returned vs. a bit more than 100.)

Jared Broad INVESTOR

QuantConnect | October 2017

Will look into that tomorrow, Dan. 99% of the time it's a misunderstanding of what things mean, or poorly named variables =).

LukeI INVESTOR

October 2017

Dan,

Thank you so much! That was just what I was looking for.

This is the code I ended up with:

def CoarseSelectionFunction(self, coarse):
        # First narrow the original coarse universe down to a manageable level.
        init_select = [stock for stock in coarse if (stock.HasFundamentalData) 
                    and (float(stock.Price) >= self.MyLeastPrice)
                    and (float(stock.Price) <= self.MyMostPrice)]
        
        # second convert the initial selection to a pandas dataframe.
        stock_symbols = [stock.Symbol for stock in init_select]
                    
        stock_data = [(stock.DollarVolume,) for stock in init_select]
                
        column_names = ['dollar_volume']
        
        
        # Use coerce_float=True to convert non-float values (e.g. decimal.Decimal) to floats
        data_df = pd.DataFrame.from_records(
            stock_data, 
            index=stock_symbols, 
            columns=column_names, 
            coerce_float=True)
        
        # Use pandas methods to select the assets we want
        # First find the values of dollar_volume at our desired bounds
        lower_percent = data_df.dollar_volume.quantile(self.LowVar)
        upper_percent = data_df.dollar_volume.quantile(self.HighVar)
        
        # Now simply query using those values
        # (HasFundamentalData was already applied in the initial selection above)
        my_universe = (data_df.
            query('(dollar_volume >= @lower_percent) & (dollar_volume <= @upper_percent)'))
        
        # See how many securities are found in our universe
        self.data.Debug("{} securities found ".format(my_universe.shape[0]))
        
        # Expects a list of symbols returned
        return my_universe.index.tolist()

I think it's slightly more efficient than your code, especially if QC has difficulties with dataframes. Instead of adding 3 columns of data from all ~6000 stocks into a dataframe, it first narrows down the list of desired stocks by HasFundamentalData and Price, since those checks are easy to do without a dataframe, and only then adds the remaining stocks to one.

It actually sped up my backtest significantly, because I was able to narrow down my coarse universe much more before getting to the fine fundamental data. It's still running at about 4k data points per second, though.

Dan Whitnable INVESTOR

October 2017

LukeI, great approach. Narrowing down the selection with an 'if' condition in the initial iterator is nice. It gets rid of all the securities you don't want right away.

The other benefit of this approach is that the dollar volume range is calculated only across the stocks of interest and not the entire coarse universe. That was a flaw in my original logic.
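The effect is easy to see on made-up numbers: the 85th-percentile cut-off depends on which population it is computed over. The sketch below uses the standard library's `statistics.quantiles` with `method="inclusive"` (Python 3.8+). The prices, dollar volumes, and price bounds are all invented for illustration.

```python
from statistics import quantiles

# (price, dollar_volume) pairs; all values are made up.
coarse = [(2.0, 900.0), (3.0, 800.0), (4.0, 700.0), (5.0, 100.0),
          (6.0, 200.0), (7.0, 300.0), (8.0, 400.0), (9.0, 500.0),
          (50.0, 5000.0), (60.0, 6000.0), (70.0, 7000.0), (80.0, 8000.0)]

def cutoff_85(volumes):
    # n=20 yields cut points at 5% steps; index 16 is the 85% point.
    return quantiles(volumes, n=20, method="inclusive")[16]

all_volumes = [dv for _, dv in coarse]

# Hypothetical price bounds, playing the role of MyLeastPrice / MyMostPrice.
filtered_volumes = [dv for price, dv in coarse if 2.0 <= price <= 10.0]

print("cut-off over whole universe:", cutoff_85(all_volumes))
print("cut-off over price-filtered stocks:", cutoff_85(filtered_volumes))
```

In this toy data the high-priced stocks dominate the top of the dollar volume ranking, so a cut-off computed over the whole universe would exclude every stock that passes the price filter.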

You may want to look at the 'context.UniverseSettings.MinimumTimeInUniverse' property of the universe. It's set in the 'Initialize' section. I had it set to 0, but consider a higher value. This should keep securities from moving in and out so much; it makes them 'sticky'. This matters a bit more in QC than in Quantopian because the 'dollar_volume' in QC is for a single day and could vary a lot. Quantopian allowed one to take an average dollar_volume (you noted in your initial post that you had used a window_length of 3). Just a thought.

HanByul P INVESTOR

October 2017

LukeI, Dan, great work. @Dan, glad to see you here at QC. I hope we Quantopian migrants get used to QC as fast as we can and build what we want. Thanks, guys! :)

LukeI INVESTOR

October 2017

Dan,

Since my strategy isn't a long-term holder but more of a swing strategy, it needs a fresh list of stocks that meet the criteria every day. I just read about the change to dollar volume; I had no idea that it was originally a 30-day EMA. A premade EMA isn't very flexible, but it is preferable to a single-day dollar volume if you can't "warm up" the dollar volume and are stuck waiting 30 days to get an accurate representation of the stocks you would have wanted to purchase on day 1. If that's the case, I don't even know how you would go about storing the dollar volume data of 5000 stocks to detect when one eventually meets the long-term moving average criteria. My original dollar volume filter is MUCH longer than 3 days, but I don't know whether it's a very sensitive parameter. I guess I will find out as I go.
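One way to keep a per-symbol history, sketched under assumptions: hold a fixed-length `deque` for each symbol and update it on every coarse selection call. `RollingDollarVolume` and its method names are hypothetical, not LEAN APIs. The average only becomes available after `window` days of data, so this does not remove the warm-up wait.

```python
from collections import defaultdict, deque

class RollingDollarVolume:
    """Keeps the last `window` daily dollar volumes for each symbol."""

    def __init__(self, window):
        self.window = window
        self.history = defaultdict(lambda: deque(maxlen=window))

    def update(self, symbol, dollar_volume):
        # deque(maxlen=...) drops the oldest value automatically.
        self.history[symbol].append(float(dollar_volume))

    def average(self, symbol):
        """Mean over the window, or None until the window is full."""
        values = self.history.get(symbol)
        if values is None or len(values) < self.window:
            return None
        return sum(values) / len(values)

tracker = RollingDollarVolume(window=3)
for day_volume in (100.0, 200.0, 300.0, 400.0):  # four made-up days for "ABC"
    tracker.update("ABC", day_volume)
tracker.update("XYZ", 50.0)                      # only one day for "XYZ"

print(tracker.average("ABC"))
print(tracker.average("XYZ"))
```

For roughly 5,000 symbols and a 30-day window this holds about 150,000 floats, and symbols whose average is still `None` can simply be skipped when applying the percentile filter.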
