Since the algorithm I want to implement uses indices 1..n, and shifting every index by one is very error-prone, I decided to be clever and inserted a dummy element at the beginning of each list, so that I can use the original formulas from the article unchanged.
For brevity, consider this toy example:
```python
def calc(N):
    nums = [0] + range(1, N + 1)  # dummy element at index 0
    return sum(nums[1:])
```
However, I was worried that my results might be wrong because I could accidentally access the 0th element without noticing. So I became even smarter and used `None` instead of `0` as the first element: any arithmetic operation on it raises a runtime error:
```python
def calc_safe(N):
    nums = [None] + range(1, N + 1)  # None poisons accidental access to index 0
    return sum(nums[1:])
```
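With this guard, any off-by-one bug that touches index 0 fails loudly instead of silently contributing a `0` to the result. A toy illustration (Python 2, matching the snippets above):

```python
>>> nums = [None] + range(1, 4)
>>> nums[0] + nums[1]  # an accidental access to the dummy element
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
```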
Surprisingly, this small change resulted in a huge performance hit for PyPy (even with the current version 5.8): the code became about 10 times slower! Here are the timings on my machine:
```
                  pypy-5.8    cpython
calc(10**8)       0.5 sec     5.5 sec
calc_safe(10**8)  7.5 sec     5.5 sec
```
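For reference, a minimal sketch of how such numbers can be measured (the use of `time.time()` here is my assumption; only `calc` and `calc_safe` come from above):

```python
import time

for f in (calc, calc_safe):
    start = time.time()
    f(10**8)
    print f.__name__, "%.1f sec" % (time.time() - start)
```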
As a side note: CPython does not care whether the dummy element is `None` or not.
So my question is twofold:
- Obviously, using `None` is not a good idea, but why?
- Is it possible to keep the safety of the `None` approach and maintain performance?
Edit: As Armin explained, not all lists are equal; we can see which strategy a list uses via:
```python
import __pypy__
print __pypy__.strategy(nums)
```
In the first case it is `IntegerListStrategy`, and in the second `ObjectListStrategy`. The same thing happens if we use a large integer value (e.g. `2**100`) instead of `None`.
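For illustration, a quick check in a PyPy shell (the list contents here are arbitrary) confirms all three cases:

```python
import __pypy__

print __pypy__.strategy([0] + range(1, 10))       # IntegerListStrategy
print __pypy__.strategy([None] + range(1, 10))    # ObjectListStrategy
print __pypy__.strategy([2**100] + range(1, 10))  # ObjectListStrategy as well
```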