I was under the impression that startswith should be faster than in, for the simple reason that in has to do more checks (it allows the word being searched for to be anywhere in the string). But I had my doubts, so I decided to time both with timeit. The code for the timings is given below and, as you will probably notice, I have not done much timing before; the code is pretty simple.
    import timeit

    setup1 = '''
    def in_test(sent, word):
        if word in sent:
            return True
        else:
            return False
    '''

    setup2 = '''
    def startswith_test(sent, word):
        if sent.startswith(word):
            return True
        else:
            return False
    '''

    print(timeit.timeit('in_test("this is a standard sentence", "this")', setup=setup1))
    print(timeit.timeit('startswith_test("this is a standard sentence", "this")', setup=setup2))
Results:
    >> in:         0.11912814951705597
    >> startswith: 0.22812353561129417
So startswith is twice as slow! I find this behavior very perplexing, considering what I said above. Am I doing something wrong in timing the two, or is in really faster? If so, why?
Note that the results are very similar even when both return False (in which case in has to actually traverse the whole sentence, so it cannot simply be short-circuiting early in the first test):
    print(timeit.timeit('in_test("another standard sentence, would be that", "this")', setup=setup1))
    print(timeit.timeit('startswith_test("another standard sentence, would be that", "this")', setup=setup2))

    >> in:         0.12854891578786237
    >> startswith: 0.2233201940338861
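To rule out noise from a single run, the measurement could also be repeated and the fastest run kept. The following is only a minimal sketch: the helper name best_time, the lambda wrapper, and the run counts are my own choices for illustration. The lambda adds the same call overhead to both tests, and the smaller number of calls means the figures are not directly comparable with the ones above.

```python
import timeit


def in_test(sent, word):
    return word in sent


def startswith_test(sent, word):
    return sent.startswith(word)


def best_time(func, sent, word, number=100_000, repeat=3):
    """Return the fastest total time in seconds over `repeat` runs of `number` calls."""
    # The lambda adds identical overhead to every measurement,
    # so the comparison between the two functions stays fair.
    timer = timeit.Timer(lambda: func(sent, word))
    return min(timer.repeat(repeat=repeat, number=number))


hit = "this is a standard sentence"
miss = "another standard sentence, would be that"

for label, sent in (("match at start", hit), ("no match", miss)):
    print(label, "in:", best_time(in_test, sent, "this"))
    print(label, "startswith:", best_time(startswith_test, sent, "this"))
```

Taking the minimum rather than the mean is the usual advice for timeit, since slower runs mostly reflect interference from other processes.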
If I had to implement the two functions from scratch, they would look something like this (pseudocode):
startswith : start comparing the letters of the word with the letters of the sentence one by one until either a) the word is exhausted (return True) or b) a comparison fails (return False)
in : call startswith for every position in the sentence where the initial letter of the word can be found.
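In Python, the pseudocode above could be sketched roughly as follows. This is only my mental model, not how CPython actually implements either operation; the names naive_startswith and naive_in, and the start parameter, are made up for illustration.

```python
def naive_startswith(sent, word, start=0):
    """Compare word with sent character by character, beginning at index start."""
    if start + len(word) > len(sent):
        return False  # not enough characters left for word to fit
    for offset, ch in enumerate(word):
        if sent[start + offset] != ch:
            return False  # b) a comparison failed
    return True  # a) the word was exhausted


def naive_in(sent, word):
    """Try naive_startswith at every position where word's first letter occurs."""
    if not word:
        return True  # the empty string is contained in every string
    for start, ch in enumerate(sent):
        if ch == word[0] and naive_startswith(sent, word, start):
            return True
    return False


sentence = "this is a standard sentence"
print(naive_startswith(sentence, "this"), sentence.startswith("this"))
print(naive_in(sentence, "standard"), "standard" in sentence)
```

With this model, naive_in does strictly more work than naive_startswith whenever the word is at the very start, which is why the timings surprise me.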
I just do not understand.
Just to make it clear: in and startswith are not equivalent. I am only talking about the case where the word being searched for is expected to be at the very start of the string.