Hi Quants,
I'll be brief here.
I'd like to introduce you to vectorization in general and in Python.
What is vectorization?
In computer science, vectorization means applying the mathematics of vectors, and their operations, to whole arrays at once (Python programmers mostly know arrays as lists) instead of iterating over arrays with “for loops” to perform calculations element by element.
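In a nutshell, a tiny illustrative sketch:

import numpy as np

prices = [100.0, 101.5, 99.8, 102.3]

# The "for loop" way: one interpreted iteration per element.
doubled_loop = []
for p in prices:
    doubled_loop.append(p * 2)

# The vectorized way: one operation applied to the whole array at once.
doubled_vec = np.array(prices) * 2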
Why use vectorization?
If you are using a computer from before 1995-ish, then yes, you do not need to care too much about vectorization, because your processor most likely does not support true parallelism (multiple cores) or super-fast SIMD instructions over memory.
If you are using a computer from after 2000-ish, which I bet you are, then vectorization will make use of incredibly performant instructions in your processor AND of those cores you are proud to have. So use true parallelism while doing the calculations in your backtests/indicators. You can also use it on your NVIDIA cards to run backtests at amazing, blazing speeds, but I do not want to wander there in this post.
Note that if you use QC web/cloud, you are using computers from after 2000… 😊. So vectorization is in scope for you.
How to use vectorization?
This is a college-level subject, even a graduate-studies-level one in many aspects, so I will not fully teach how to use it in a forum post. Google is your friend. Numpy and Pandas, which are both included in the QC cloud environment so you can use them, are also very close friends here: they implement vectorization for you. Look on Google. And, as you probably know, self.History() returns a Pandas DataFrame, so you can use vectorization right away while exploiting its results. Compared to using a for loop to iterate over the rows of the DataFrame, you'll gain 20-50-100-200-1000X…
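For instance, here is a minimal sketch of what that looks like inside a QC algorithm (self.spy and the 252-day lookback are assumptions for the example; the exact columns depend on your subscription):

import numpy as np

# In Initialize: self.spy = self.AddEquity("SPY").Symbol
history = self.History(self.spy, 252, Resolution.Daily)  # a Pandas DataFrame
closes = history["close"]                                # a Pandas Series

# Vectorized: one year of log returns in one shot, no row loop needed.
log_returns = np.log(closes / closes.shift(1)).dropna()
mean_daily_return = log_returns.mean()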
You could have a read of this:
and this:
as starters, if you are more interested.
Pandas is incredibly vast and powerful. Even Numpy is orders of magnitude more powerful than what most people make of it. Keep in mind that vectorization is one of the foundations of data analytics. And if backtesting and algo-trading are not data analytics…
SHOW ME CODE! SHOW ME COOOOOOOOOOOOOODE!!!!!!!!!!!!
Alright.
I put together a very easy-to-understand comparison, to put things in perspective and hopefully spark your interest.
Have a look at this:
Timer unit: 1e-06 s
Total time: 226.546 s
File: test20.py
Function: f1 at line 6
Line # Hits Time Per Hit % Time Line Contents
==============================================================
6 @profile
7 def f1():
8 1000 186045965.8 186046.0 82.1 l1 = random.sample(range(1, 1_000_000), 50_000)
9 1000 4287.7 4.3 0.0 _sum = 0
10 50001000 18733153.4 0.4 8.3 for item in l1:
11 50000000 21757529.3 0.4 9.6 _sum += item
12 1000 4978.6 5.0 0.0 return _sum / len(l1)
Total time: 165.398 s
File: test20.py
Function: f2 at line 14
Line # Hits Time Per Hit % Time Line Contents
==============================================================
14 @profile
15 def f2():
16 1000 164171321.4 164171.3 99.3 l1 = random.sample(range(1, 1_000_000), 50_000)
17 1000 1223668.1 1223.7 0.7 _sum = sum(l1)
18 1000 3147.0 3.1 0.0 return _sum / len(l1)
Total time: 168.975 s
File: test20.py
Function: f3 at line 20
Line # Hits Time Per Hit % Time Line Contents
==============================================================
20 @profile
21 def f3():
22 1000 168338808.0 168338.8 99.6 l1 = random.sample(range(1, 1_000_000), 50_000)
23 1000 632884.5 632.9 0.4 _sum = math.fsum(l1)
24 1000 3317.4 3.3 0.0 return _sum / len(l1)
Total time: 268.41 s
File: test20.py
Function: f4 at line 26
Line # Hits Time Per Hit % Time Line Contents
==============================================================
26 @profile
27 def f4():
28 1000 167759822.9 167759.8 62.5 l1 = random.sample(range(1, 1_000_000), 50_000)
29 1000 100650675.6 100650.7 37.5 return statistics.mean(l1)
Total time: 165.881 s
File: test20.py
Function: f5 at line 31
Line # Hits Time Per Hit % Time Line Contents
==============================================================
31 @profile
32 def f5():
33 1000 165213090.9 165213.1 99.6 l1 = random.sample(range(1, 1_000_000), 50_000)
34 1000 668202.6 668.2 0.4 return statistics.fmean(l1)
Total time: 172.987 s
File: test20.py
Function: f6 at line 36
Line # Hits Time Per Hit % Time Line Contents
==============================================================
36 @profile
37 def f6():
38 1000 169385695.9 169385.7 97.9 l1 = random.sample(range(1, 1_000_000), 50_000)
39 1000 3595975.9 3596.0 2.1 _sum = numpy.sum(l1)
40 1000 4905.4 4.9 0.0 return _sum / len(l1)
Total time: 0.380018 s
File: test20.py
Function: f7 at line 42
Line # Hits Time Per Hit % Time Line Contents
==============================================================
42 @profile
43 def f7():
44 1000 338150.1 338.2 89.0 l1 = numpy.random.randint(1_000_000, size=50_000)
45 1000 40276.7 40.3 10.6 _sum = numpy.sum(l1)
46 1000 1591.3 1.6 0.4 return _sum / len(l1)
I compute a mean/average in many ways. Is computing averages something you do in your algos? 😊
f1 : The sum is calculated with a “for loop”. Then the sum is divided by the number of items. Very classic huh?
f2 : I use the built-in sum() Python function.
f3 : I use the math.fsum() function.
f4 : I use statistics.mean() directly.
f5 : I use statistics.fmean().
f6 : I use numpy.sum() to compute the sum on a Python List.
f7 : I fully use vectorization implemented into Numpy.
Time needed to execute each function 1000x:
f1 = 226s
f2 = 165s
f3 = 168s
f4 = 268s
f5 = 165s
f6 = 172s
f7 = … drum roll … 0.38s
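If you want to reproduce this at home, here is a minimal, runnable reconstruction of test20.py based on the profiled lines above, with the @profile decorators (from the line_profiler package used for the timings) stripped so it runs standalone. Expect it to take several minutes with the full 1,000 repetitions:

import math
import random
import statistics
import numpy

def f1():
    # Classic "for loop" over a Python list.
    l1 = random.sample(range(1, 1_000_000), 50_000)
    _sum = 0
    for item in l1:
        _sum += item
    return _sum / len(l1)

def f2():
    # Built-in sum().
    l1 = random.sample(range(1, 1_000_000), 50_000)
    _sum = sum(l1)
    return _sum / len(l1)

def f3():
    # math.fsum().
    l1 = random.sample(range(1, 1_000_000), 50_000)
    _sum = math.fsum(l1)
    return _sum / len(l1)

def f4():
    # statistics.mean() directly.
    l1 = random.sample(range(1, 1_000_000), 50_000)
    return statistics.mean(l1)

def f5():
    # statistics.fmean() (Python 3.8+).
    l1 = random.sample(range(1, 1_000_000), 50_000)
    return statistics.fmean(l1)

def f6():
    # numpy.sum() over a Python list: the list still has to be converted.
    l1 = random.sample(range(1, 1_000_000), 50_000)
    _sum = numpy.sum(l1)
    return _sum / len(l1)

def f7():
    # Full vectorization: the data is born as an ndarray, no conversion.
    l1 = numpy.random.randint(1_000_000, size=50_000)
    _sum = numpy.sum(l1)
    return _sum / len(l1)

if __name__ == "__main__":
    for f in (f1, f2, f3, f4, f5, f6, f7):
        for _ in range(1_000):
            f()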
Now, what if I told you that vectorization over a Pandas DataFrame brings even bigger gains? There, we're talking about operations over vectors, but also, of course, over matrices. Pandas (through Numpy) uses specialized CPU SIMD instructions for operations over matrices, where the gains over “for loops” are even bigger.
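A quick hedged sketch of the DataFrame case (the synthetic data and the "close" column name are just for the illustration):

import numpy as np
import pandas as pd

# One million synthetic "close" prices.
df = pd.DataFrame({"close": np.random.random(1_000_000) * 100})

# Row by row with a for loop: one interpreted iteration per row.
returns_loop = [float("nan")]
prev = df["close"].iloc[0]
for price in df["close"].iloc[1:]:
    returns_loop.append(price / prev - 1)
    prev = price

# Vectorized: one call, executed in optimized compiled code.
returns_vec = df["close"].pct_change()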
By now, I guess you understand that vectorization means you can crunch a lot more data for a lot more symbols in a lot less time.
Hope you enjoyed reading this.
Fred
Alexandre Catarino
Hi Fred Painchaud,
Mind-blowing!
I would have assumed that statistics uses numpy under the hood, since numpy is known/designed to be fast. If it does, it adds an incredible amount of overhead.
Fred Painchaud
Hi Alex,
I did not look at statistics' implementation, but from the CPython implementation and the other libraries I've seen, I would bet it's based on classic iteration over objects (linked lists, to be specific). CPython is very, very orthogonal in that respect - which is all perfectly fine. That assessment is also based on its timings, which are close to those of the standard for loop (the first one). But I was not looking at statistics, I was looking at numpy. And that post of mine is actually already outdated relative to my research: I'm now even far faster than that on my standard laptop with, yes, numpy + proper vectorization, but also with numba and ray.
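To give you a taste of numba (a minimal sketch, not my actual research code): you write the plain loop and @njit compiles it to machine code.

import numpy as np
from numba import njit

@njit(cache=True)
def mean_njit(a):
    # A plain Python-style loop, but compiled to machine code by numba,
    # so it runs at C-like speed over a float64 ndarray.
    s = 0.0
    for x in a:
        s += x
    return s / a.size

data = np.random.random(50_000)
print(mean_njit(data))  # the first call pays a one-time compilation cost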
Fred
Fred Painchaud
Hi All,
I forgot to mention it in my post because it is rule #0 (yeah, even before rule #1). It was not obvious to me at all when I started to study Numpy et al. and vectorization as a whole, but it is now totally implicit to me (hence me forgetting to mention it):
If you use Numpy to store Python objects, so that your numpy array (ndarray) has dtype “object”, then 1) you are not using vectorization, and 2) you are actually slowing down your algo even more.
All the above applies to numpy arrays of scalars (numbers) - hence, to plain ints and floats. Since we mostly work with prices and volumes here, floats are more common than ints (use floats even for your volumes - crypto “volumes” are floats anyway).
So, in other words, don't expect to throw TradeBars into Numpy arrays and see a gain in performance. You should see a decrease in performance instead.
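A quick illustration of rule #0 (a sketch; exact numbers will vary by machine):

import numpy as np
from timeit import timeit

floats = np.random.random(1_000_000)  # dtype=float64: the vectorized fast path
objects = floats.astype(object)       # dtype=object: boxed Python floats, no SIMD

print(timeit(floats.sum, number=100))   # tight compiled loop over a contiguous buffer
print(timeit(objects.sum, number=100))  # Python-level addition per element: much slower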
Now, working with all that takes more experience with programming and computer science in general. It also means you will use much less of what is built into LEAN for you. For instance, I coded a function to resample one time series into another - i.e., to consolidate bars of, say, 1s into 5s, or 1m, or 1h, or 2.5h, or 1m into 2m, etc. - so I don't use LEAN's consolidators anymore. Just to give you one example.
The gain is that consolidating, say, 1 year of 1s bars (that's millions of bars, of course, especially with crypto) into 5s bars is almost instantaneous (say 1 second max) on a very, very standard laptop. I'm not including downloading the 1s data or getting it from History (at least in the cloud - I'll eventually have a look at how it is implemented for Python because, even accounting for the conversion to a Pandas DF, it looks slow - anyway). This means that for one symbol, generating 25 different resolutions from 1s bar data takes approximately 6-7 seconds on my laptop. So I can do it for portfolios/universes of dozens and dozens of symbols, and then trade them with very solid MTF (multi-timeframe) logic, which means the code knows much better where the symbols are at, and that code can also use vectorization et al. The strategy that was at first impossible to do in LEAN (at least in Python) because of the sheer amount of data to process fast enough (my initial test took hours just to consolidate one symbol, and I stopped it before that consolidation was done) now takes just below 10 minutes per symbol. And that is the long scenario, in fact, because that's when the symbol is new and needs to be primed. Afterwards, the tracking/trading of the symbol is instantaneous (well, once the data comes in… daily bars are daily bars). It's been a game changer in what I've been able to do.
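I won't post my own resampling function, but as a hedged sketch of the idea, Pandas can express that kind of consolidation in a vectorized way (the synthetic bars and the 5s rule are just for the illustration):

import numpy as np
import pandas as pd

# One day of synthetic 1-second OHLCV bars.
idx = pd.date_range("2023-01-01", periods=86_400, freq="s")
rnd = np.random.random(len(idx))
bars_1s = pd.DataFrame({
    "open": rnd + 100, "high": rnd + 101,
    "low": rnd + 99, "close": rnd + 100,
    "volume": rnd * 10,
}, index=idx)

# Consolidate 1s bars into 5s bars - vectorized under the hood.
bars_5s = bars_1s.resample("5s").agg({
    "open": "first", "high": "max",
    "low": "min", "close": "last", "volume": "sum",
})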
A-n-y-w-a-y. All this is simply about giving you awareness of those libraries/technologies/etc.
Fred
Newoptionz
Fred Painchaud, it sounds like you do not code in the default browser coding environment. You say you are using a standard computer, so you are probably coding in Visual Studio using the LEAN engine?
Fred Painchaud
Hi Newoptionz,
Don't worry too much about that since it does not really matter, it is mostly a question of preferences. But yes, I code in Visual Studio 2022 (for LEAN), PyCharm (for Python algos) and even IDLE (for Python algos) atm.
But you can use vectorization in the browser IDE. The libraries I am using are installed there; yes, they are not up-to-date, but the basics still work fine. Using updated libraries will be easier once the related architectural changes are done in LEAN.
Fred