Hi Quants,

I'll be brief here.

I'd like to introduce you to vectorization in general and in Python.

What is vectorization?

In computer science, vectorization means applying the mathematics of vectors and their operations to whole arrays at once (Python programmers usually meet arrays as Lists), instead of writing “for loops over arrays” that perform the calculations element by element.
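To make that concrete, here is a minimal sketch in plain NumPy (nothing QC-specific, the prices/shares data is made up just for illustration) of the same element-wise calculation written both ways:

import numpy as np

prices = np.array([100.0, 101.5, 99.8, 102.3, 103.1])
shares = np.array([10, 20, 15, 5, 8])

# "For loop over arrays": one element at a time
values_loop = np.empty_like(prices)
for i in range(len(prices)):
    values_loop[i] = prices[i] * shares[i]

# Vectorized: the whole multiplication expressed as one vector operation
values_vec = prices * shares

assert np.allclose(values_loop, values_vec)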

Why use vectorization?

If you are using a computer from before 1995-ish, then fine, you do not need to care too much about vectorization, because your processor most likely supports neither true parallelism (multiple cores) nor super-fast SIMD instructions operating on blocks of memory.

If you are using a computer from after 2000-ish, which I bet you are, then vectorization will make use of those incredibly performant SIMD instructions in your processor AND of those cores you are proud to have. So use true parallelism while doing the calculations in your backtests/indicators. You can also use it on your NVIDIA cards, to run backtests at absolutely blazing speeds, but I do not want to wander into that in this post.

Note that if you use QC web/cloud, you are using computers from after 2000……………………………… 😊. So vectorization is in scope for you.

How to use vectorization?

This is a college-level subject, even a graduate-level one in many aspects, so I will not fully teach how to use it in a forum post. Google is your friend. Numpy and Pandas, which are both included in the QC cloud environment so you can use them, are also very close friends here: they have vectorization implemented for you. And, as you probably know, self.History() returns a Pandas DataFrame, so you can use vectorization right away when exploiting its results. Compared to using a for loop to iterate over the rows of the DataFrame, you'll gain 20-50-100-200-1000X…………
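As a small illustration of what that looks like inside a QC algorithm, here is a sketch (not a full algorithm) that computes the mean daily return of SPY from a History call, once with a for loop and once vectorized. The symbol, the 252-bar lookback and the Debug output are just for illustration; I am only assuming the usual History() DataFrame layout with a 'close' column:

from AlgorithmImports import *

class VectorizedExample(QCAlgorithm):

    def Initialize(self):
        self.SetStartDate(2021, 1, 1)
        self.symbol = self.AddEquity("SPY", Resolution.Daily).Symbol

        history = self.History(self.symbol, 252, Resolution.Daily)
        closes = history["close"]

        # Loop version: walk the Series row by row to average the daily returns
        total, count, previous = 0.0, 0, None
        for price in closes:
            if previous is not None:
                total += price / previous - 1
                count += 1
            previous = price
        mean_return_loop = total / count

        # Vectorized version: one expression over the whole Series
        mean_return_vec = closes.pct_change().dropna().mean()

        self.Debug(f"loop={mean_return_loop:.6f} vectorized={mean_return_vec:.6f}")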

You could have a read of this:

and this:

as starters, if you are more interested.

Pandas is incredibly vast and powerful. Even Numpy is orders of magnitude more powerful than the use most people make of it. Keep in mind that vectorization is one of the foundations of data analytics. And if backtesting and algo-trading are not data analytics…

SHOW ME CODE! SHOW ME COOOOOOOOOOOOOODE!!!!!!!!!!!!

Alright.

I did a very easy-to-understand comparison, to put things in perspective and hopefully spark your interest.

Have a look at this:

Timer unit: 1e-06 s

Total time: 226.546 s
File: test20.py
Function: f1 at line 6

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     6                                           @profile
     7                                           def f1():
     8      1000  186045965.8  186046.0     82.1      l1 = random.sample(range(1, 1_000_000), 50_000)
     9      1000       4287.7       4.3      0.0      _sum = 0
    10  50001000   18733153.4       0.4      8.3      for item in l1:
    11  50000000   21757529.3       0.4      9.6          _sum += item
    12      1000       4978.6       5.0      0.0      return _sum / len(l1)

Total time: 165.398 s
File: test20.py
Function: f2 at line 14

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    14                                           @profile
    15                                           def f2():
    16      1000  164171321.4  164171.3     99.3      l1 = random.sample(range(1, 1_000_000), 50_000)
    17      1000    1223668.1    1223.7      0.7      _sum = sum(l1)
    18      1000       3147.0       3.1      0.0      return _sum / len(l1)

Total time: 168.975 s
File: test20.py
Function: f3 at line 20

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    20                                           @profile
    21                                           def f3():
    22      1000  168338808.0  168338.8     99.6      l1 = random.sample(range(1, 1_000_000), 50_000)
    23      1000     632884.5     632.9      0.4      _sum = math.fsum(l1)
    24      1000       3317.4       3.3      0.0      return _sum / len(l1)

Total time: 268.41 s
File: test20.py
Function: f4 at line 26

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    26                                           @profile
    27                                           def f4():
    28      1000  167759822.9  167759.8     62.5      l1 = random.sample(range(1, 1_000_000), 50_000)
    29      1000  100650675.6  100650.7     37.5      return statistics.mean(l1)

Total time: 165.881 s
File: test20.py
Function: f5 at line 31

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    31                                           @profile
    32                                           def f5():
    33      1000  165213090.9  165213.1     99.6      l1 = random.sample(range(1, 1_000_000), 50_000)
    34      1000     668202.6     668.2      0.4      return statistics.fmean(l1)

Total time: 172.987 s
File: test20.py
Function: f6 at line 36

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    36                                           @profile
    37                                           def f6():
    38      1000  169385695.9  169385.7     97.9      l1 = random.sample(range(1, 1_000_000), 50_000)
    39      1000    3595975.9    3596.0      2.1      _sum = numpy.sum(l1)
    40      1000       4905.4       4.9      0.0      return _sum / len(l1)

Total time: 0.380018 s
File: test20.py
Function: f7 at line 42

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    42                                           @profile
    43                                           def f7():
    44      1000     338150.1     338.2     89.0      l1 = numpy.random.randint(1_000_000, size=50_000)
    45      1000      40276.7      40.3     10.6      _sum = numpy.sum(l1)
    46      1000       1591.3       1.6      0.4      return _sum / len(l1)

I compute a mean/average in many ways. Is computing averages something you do in your algos? 😊

f1 : The sum is calculated with a “for loop”. Then the sum is divided by the number of items. Very classic huh?

f2 : I use the built-in sum() Python function.

f3 : I use the math.fsum() function.

f4 : I use statistics.mean() directly.

f5 : I use statistics.fmean().

f6 : I use numpy.sum(), but on a plain Python List, so NumPy has to convert the list to an array on every call.

f7 : I fully use the vectorization implemented in Numpy: the data is generated directly as a NumPy array (numpy.random.randint) and summed with numpy.sum(), so everything stays vectorized from start to finish.

Time needed to execute each function 1000x:

f1 = 226s

f2 = 165s

f3 = 168s

f4 = 268s

f5 = 165s

f6 = 172s

f7 = ……………………………. drum roll, 0.38s
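If you want to reproduce the comparison yourself, here is a minimal, self-contained sketch of what test20.py looks like, reconstructed from the profiler output above. The original numbers came from line_profiler (hence the @profile decorators); in this sketch I simply time 1000 calls of each function with time.perf_counter so it runs as-is:

import math
import random
import statistics
import time

import numpy


def f1():
    # Classic "for loop" sum, then divide by the number of items
    l1 = random.sample(range(1, 1_000_000), 50_000)
    _sum = 0
    for item in l1:
        _sum += item
    return _sum / len(l1)


def f2():
    # Built-in sum()
    l1 = random.sample(range(1, 1_000_000), 50_000)
    _sum = sum(l1)
    return _sum / len(l1)


def f3():
    # math.fsum()
    l1 = random.sample(range(1, 1_000_000), 50_000)
    _sum = math.fsum(l1)
    return _sum / len(l1)


def f4():
    # statistics.mean() directly
    l1 = random.sample(range(1, 1_000_000), 50_000)
    return statistics.mean(l1)


def f5():
    # statistics.fmean() (Python 3.8+)
    l1 = random.sample(range(1, 1_000_000), 50_000)
    return statistics.fmean(l1)


def f6():
    # numpy.sum() on a Python list: NumPy must first convert the list to an array
    l1 = random.sample(range(1, 1_000_000), 50_000)
    _sum = numpy.sum(l1)
    return _sum / len(l1)


def f7():
    # Fully vectorized: the data is born as a NumPy array and summed by NumPy
    l1 = numpy.random.randint(1_000_000, size=50_000)
    _sum = numpy.sum(l1)
    return _sum / len(l1)


if __name__ == "__main__":
    for f in (f1, f2, f3, f4, f5, f6, f7):
        start = time.perf_counter()
        for _ in range(1000):
            f()
        print(f"{f.__name__}: {time.perf_counter() - start:.3f} s")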

Now, what if I told you that using vectorization over a Pandas DataFrame can give even bigger gains? There we are talking about operations over vectors, but also, of course, over matrices. Pandas (through Numpy underneath) ends up using specialized CPU SIMD instructions for those matrix operations, for which the gains over “for loops” are even bigger.
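Here is a quick sketch of that, on a purely hypothetical random price DataFrame (the symbol names and sizes are made up for illustration). The cell-by-cell loop and the one-line vectorized pct_change() produce the same daily returns, but the vectorized one is much faster on anything large:

import numpy as np
import pandas as pd

# Hypothetical price panel: 2_000 rows (days) x 50 columns (symbols)
rng = np.random.default_rng(0)
prices = pd.DataFrame(rng.uniform(10, 100, size=(2_000, 50)),
                      columns=[f"SYM{i}" for i in range(50)])

# Loop style: daily returns computed cell by cell
returns_loop = pd.DataFrame(index=prices.index[1:], columns=prices.columns, dtype=float)
for j in range(prices.shape[1]):
    for i in range(1, len(prices)):
        returns_loop.iat[i - 1, j] = prices.iat[i, j] / prices.iat[i - 1, j] - 1

# Vectorized style: the same calculation as one expression over the whole DataFrame
returns_vec = prices.pct_change().dropna()

assert np.allclose(returns_loop.values, returns_vec.values)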

By now, I guess you understand that vectorization means you can crunch a lot more data for a lot more symbols in a lot less time.

Hope you enjoyed reading this.

Fred

Fred Painchaud

March 2022