As an answer to my question about finding the 1-based position up to which two lists are the same, I got a hint to use the C library itertools to speed things up.
To verify this, I wrote the following test using cProfile:
    from itertools import takewhile, izip

    def match_iter(self, other):
        return sum(1 for x in takewhile(lambda x: x[0] == x[1],
                                        izip(self, other)))

    def match_loop(self, other):
        element = -1
        for element in range(min(len(self), len(other))):
            if self[element] != other[element]:
                element -= 1
                break
        return element +1

    def test():
        a = [0, 1, 2, 3, 4]
        b = [0, 1, 2, 3, 4, 0]

        print("match_loop a=%s, b=%s, result=%s" % (a, b, match_loop(a, b)))
        print("match_iter a=%s, b=%s, result=%s" % (a, b, match_iter(a, b)))

        i = 10000
        while i > 0:
            i -= 1
            match_loop(a, b)
            match_iter(a, b)

    def profile_test():
        import cProfile
        cProfile.run('test()')

    if __name__ == '__main__':
        profile_test()
The match_iter() function uses itertools, and the match_loop() function is the one I had implemented before in plain Python.
The test() function defines two lists and prints them together with the results of both functions to check that they work. Both functions return the expected value of 5, which is the length up to which the lists are equal. It then calls both functions 10,000 times in a loop.
Finally, the whole thing is profiled with profile_test().
I found out that izip is not available in itertools in Python 3, at least not on the Debian wheezy that I use, so I ran the test with Python 2.7.
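For what it is worth, izip is gone from itertools in Python 3 because the built-in zip already returns a lazy iterator there. A minimal sketch of a version-independent variant of match_iter(), assuming nothing else in the script depends on Python 2 (the try/except fallback is an addition for illustration and is not part of the profiled script):

```python
from itertools import takewhile

try:
    # Python 2: izip is the lazy variant of zip.
    from itertools import izip
except ImportError:
    # Python 3: the built-in zip is already lazy, so use it as a stand-in.
    izip = zip


def match_iter(self, other):
    # Count the leading pairs whose two elements are equal.
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1],
                                    izip(self, other)))
```

With this fallback the same function body should run unchanged on both interpreter lines.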
Here are the results:
    python2.7 match_test.py
    match_loop a=[0, 1, 2, 3, 4], b=[0, 1, 2, 3, 4, 0], result=5
    match_iter a=[0, 1, 2, 3, 4], b=[0, 1, 2, 3, 4, 0], result=5
             180021 function calls in 0.636 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.636    0.636 <string>:1(<module>)
            1    0.039    0.039    0.636    0.636 match_test.py:15(test)
        10001    0.048    0.000    0.434    0.000 match_test.py:3(match_iter)
        60006    0.188    0.000    0.275    0.000 match_test.py:4(<genexpr>)
        50005    0.087    0.000    0.087    0.000 match_test.py:4(<lambda>)
        10001    0.099    0.000    0.162    0.000 match_test.py:7(match_loop)
        20002    0.028    0.000    0.028    0.000 {len}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        10001    0.018    0.000    0.018    0.000 {min}
        10001    0.018    0.000    0.018    0.000 {range}
        10001    0.111    0.000    0.387    0.000 {sum}
What makes me wonder is the cumtime column: my plain Python version, match_loop, takes 0.162 seconds for the 10,000 cycles, while the match_iter version takes 0.434 seconds.
On the one hand, Python is very fast, which is great, so I don't have to worry. But can it be right that the C library takes more than twice as long to do the same job as plain Python code? Or am I making a fatal mistake?
To verify, I also ran the test with Python 2.6, which seems to be even faster, but shows the same difference between the loop and itertools.
Does anyone with experience want to help?
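One way to cross-check these numbers would be the timeit module, which the Python documentation recommends for benchmarking in place of the profiler. cProfile adds its own overhead to every function call, and match_iter() makes many more Python-level calls (the lambda and the generator expression) than match_loop(). The following is only a sketch under that assumption: the helper name time_both and the izip fallback are hypothetical additions, and no timings are claimed here.

```python
import timeit
from itertools import takewhile

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: built-in zip is lazy


def match_iter(self, other):
    # itertools variant: count the leading equal pairs.
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1],
                                    izip(self, other)))


def match_loop(self, other):
    # Plain loop variant, same logic as in the profiled script.
    element = -1
    for element in range(min(len(self), len(other))):
        if self[element] != other[element]:
            element -= 1
            break
    return element + 1


def time_both(number=10000):
    # Hypothetical helper: time each function without profiler hooks.
    a = [0, 1, 2, 3, 4]
    b = [0, 1, 2, 3, 4, 0]
    loop_time = timeit.timeit(lambda: match_loop(a, b), number=number)
    iter_time = timeit.timeit(lambda: match_iter(a, b), number=number)
    return loop_time, iter_time


if __name__ == '__main__':
    loop_time, iter_time = time_both()
    print("match_loop: %.3f s, match_iter: %.3f s" % (loop_time, iter_time))
```

If the gap between the two functions shrinks or persists under timeit, that would indicate how much of the difference in the profile comes from the profiler itself.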