Hi,
This is a post that I believe can be helpful to fellow Quants here.
Many of you certainly use Lists in their Python algos. It's all fine. However, while working on a new data structure for my own Python algos, I realized something that I wanted to share:
When you slice Lists only to read the result (the slice), it is MUCH MORE efficient to use itertools.islice() instead of usual Python slicing, such as my_list[start:stop:step]. This is because slicing in Python creates a new List with the sliced elements in it. Indeed, a new List must be created (which takes longer than most people think) and the elements of the slice must be copied to the newly created List, which is then returned (this is only the references to the objects - so a shallow copy, not a deep copy - which is not too long). However, with itertools.islice(), an iterator over the slice is returned, and thus iteration is done on the original list, not a copy. This is very very fast (even if it is in O(n) - islice() only has to skip and count elements until it reaches your start index). Of course, as stated at the beginning of this paragraph, if you create the slice on purpose, because you need to work on a copy, for instance, if you want to alter the slice without altering the original List, then, slicing is the correct option. But if you slice only to read the resulting data in the slice, use itertools.islice() instead.
To give some examples, I am talking about doing this:
from itertools import islice
#Here, s is an iterator over the specified fragment of the original list 'l'
s = islice(l, start, stop, step)
for i in s:
#Do something
#You cannot write to s, doing something like:
#s[index] = ...
#because s is an iterator, not a List.
Instead of this:
s = l[start:stop:step]
for i in s:
#Do something.
#But in your algo, you are not writing to s / changing its contents,
#like with s[index] = ...
#You are only slicing to read the slice.
Here are some profiling results for convincing the doubtful:
Wrote profile results to test15.py.lprof
Timer unit: 1e-06 s
Total time: 84.6607 s
File: test15.py
Function: f1 at line 7
Line # Hits Time Per Hit % Time Line Contents
==============================================================
7 @profile
8 def f1():
9 1000001 506510.9 0.5 0.6 for _ in range(1_000_000):
10 1000000 5016767.1 5.0 5.9 i = random.randint(0, 49_999)
11 1000000 4095157.0 4.1 4.8 j = random.randint(0, 49_999)
12 #Here, s is a List, a copy of the specified fragment of the original list
13 1000000 75042277.5 75.0 88.6 s = l[min(i, j):max(i, j)]
Total time: 10.2587 s
File: test15.py
Function: f2 at line 15
Line # Hits Time Per Hit % Time Line Contents
==============================================================
15 @profile
16 def f2():
17 1000001 381841.1 0.4 3.7 for _ in range(1_000_000):
18 1000000 4451683.4 4.5 43.4 i = random.randint(0, 49_999)
19 1000000 4424653.7 4.4 43.1 j = random.randint(0, 49_999)
20 #Here, s is an iterator over the specified fragment of the original list
21 1000000 1000567.7 1.0 9.8 s = islice(l, min(i, j), max(i, j))
1 microsecond per islice() call vs 75 microseconds per slice.
Please note:
1- itertools.islice() does not handle negative indexes. Here is a replacement that I coded that handles them:
def islice_neg(iterable, *args):
start, stop, step = slice(*args).indices(len(iterable))
return itertools.islice(iterable, start, stop, step)
You can include it in your algos. You use it just like islice() if you want to include negative indexes (starting from the end of the structure).
2- This does not only apply to Lists but any other Python data structures that you slice only to read the result… islice them, or islice_neg them, instead.
Fred
Mak K
Thanks!
Very easy to understand and saves a lot of runtime🚀
Fred Painchaud
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
To unlock posting to the community forums please complete at least 30% of Boot Camp.
You can continue your Boot Camp training progress from the terminal. We hope to see you in the community soon!