The OP's example shows close to the largest possible performance difference between slicing and not slicing.
If the processing step actually does something that takes a considerable amount of time, the slicing cost will hardly matter.
The fact is that the OP needs to tell us what the processing actually is. The most likely scenario is that it does something significant, in which case he should profile his code.
Adapted from the OP's example:
    # slice_time.py
    # Python 2 only: relies on string.letters, buffer() and the print statement.
    import random
    import re
    import string
    import time

    text = string.letters * 1000

    # Visit every start offset exactly once, in random order.
    indices = range(len(text))
    random.shuffle(indices)

    # Four processors, from most to least work per call.
    def greater_processing(a_string):
        return re.findall('m', a_string)

    def medium_processing(a_string):
        return re.search('m.*?m', a_string)

    def lesser_processing(a_string):
        return re.match('m', a_string)

    def least_processing(a_string):
        return a_string

    def timeit(fn, processor):
        # Time one pass over all offsets, using 1000-character windows.
        t1 = time.time()
        for i in indices:
            fn(i, i + 1000, processor)
        t2 = time.time()
        print '%s took %0.3f ms %s' % (fn.func_name, (t2-t1) * 1000, processor.__name__)

    def test_part_slice(i, j, processor):
        # Copies the 1000-character substring.
        return processor(text[i:j])

    def test_copy(i, j, processor):
        # "Copies" the whole string; processes all of text.
        return processor(text[:])

    def test_text(i, j, processor):
        # No slicing at all; processes all of text.
        return processor(text)

    def test_buffer(i, j, processor):
        # buffer(object, offset, size): a view of the substring without copying.
        return processor(buffer(text, i, j - i))

    if __name__ == '__main__':
        processors = [least_processing, lesser_processing, medium_processing, greater_processing]
        tests = [test_part_slice, test_copy, test_text, test_buffer]
        for processor in processors:
            for test in tests:
                timeit(test, processor)
And then run it:

    In [494]: run slice_time.py
    test_part_slice took 68.264 ms least_processing
    test_copy took 42.988 ms least_processing
    test_text took 33.075 ms least_processing
    test_buffer took 76.770 ms least_processing
    test_part_slice took 270.038 ms lesser_processing
    test_copy took 197.681 ms lesser_processing
    test_text took 196.716 ms lesser_processing
    test_buffer took 262.288 ms lesser_processing
    test_part_slice took 416.072 ms medium_processing
    test_copy took 352.254 ms medium_processing
    test_text took 337.971 ms medium_processing
    test_buffer took 438.683 ms medium_processing
    test_part_slice took 502.069 ms greater_processing
    test_copy took 8149.231 ms greater_processing
    test_text took 8292.333 ms greater_processing
    test_buffer took 563.009 ms greater_processing
Notes:
Yes, I tried the OP's original test_1 with the [i:] slice, and it is much slower, which makes his test even less representative.
Interestingly, buffer almost always performs slightly slower than slicing. This time there is one case (lesser_processing) where it does better! The real test is further down, though: there, buffer seems to do better for larger substrings, while slicing does better for smaller substrings.
And yes, there is some randomness in this test, so run it yourself and look at the different results :). It may also be interesting to change the two 1000s (the string.letters multiplier and the slice length).
So maybe some others believe you, but I don't, so I would like to know what the processing does and how you came to the conclusion that "slicing is a problem."
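For anyone who wants to vary the slice length as suggested in the notes, here is a minimal sketch. It is not the script above: it assumes Python 3, where buffer() no longer exists, and uses bytes plus memoryview as the closest stand-in. The names run_slices, run_views and compare are made up for this illustration, and the numbers it prints will differ from the Python 2 timings in this answer.

```python
import random
import re
import string
import timeit

# Assumption: Python 3. bytes + memoryview stand in for Python 2's str + buffer.
text = string.ascii_letters.encode("ascii") * 1000
view = memoryview(text)
pattern = re.compile(b"m.*?m")  # same regex as medium_processing, on bytes

random.seed(0)  # fixed seed so repeated runs use the same start offsets
starts = [random.randrange(len(text)) for _ in range(2000)]


def run_slices(size):
    """Search a copied slice of up to `size` bytes at each start offset."""
    return sum(1 for i in starts if pattern.search(text[i:i + size]))


def run_views(size):
    """Search a zero-copy memoryview window covering the same bytes."""
    return sum(1 for i in starts if pattern.search(view[i:i + size]))


def compare(sizes=(100, 1000, 10000), repeat=3):
    """Return {size: (slice_seconds, view_seconds)}, best of `repeat` runs."""
    results = {}
    for size in sizes:
        t_slice = min(timeit.repeat(lambda: run_slices(size), number=1, repeat=repeat))
        t_view = min(timeit.repeat(lambda: run_views(size), number=1, repeat=repeat))
        results[size] = (t_slice, t_view)
    return results


if __name__ == "__main__":
    for size, (t_slice, t_view) in compare().items():
        print("size %6d: slice %.4f s, memoryview %.4f s" % (size, t_slice, t_view))
```

Both functions search exactly the same bytes, so they return the same match count; only the cost of producing the window differs. Whether the copy or the view comes out ahead at a given size is something to measure on your own machine rather than assume.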
I profiled medium_processing in my example, increasing the string.letters multiplier to 100000 and the length of the slices to 10000. Further down is the same run with slices of length 100. I used cProfile for these (much less overhead than profile!).
    test_part_slice took 77338.285 ms medium_processing
             31200019 function calls in 77.338 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000   77.338   77.338 <string>:1(<module>)
            2    0.000    0.000    0.000    0.000 iostream.py:63(write)
      5200000    8.208    0.000   43.823    0.000 re.py:139(search)
      5200000    9.205    0.000   12.897    0.000 re.py:228(_compile)
      5200000    5.651    0.000   49.475    0.000 slice_time.py:15(medium_processing)
            1    7.901    7.901   77.338   77.338 slice_time.py:24(timeit)
      5200000   19.963    0.000   69.438    0.000 slice_time.py:31(test_part_slice)
            2    0.000    0.000    0.000    0.000 utf_8.py:15(decode)
            2    0.000    0.000    0.000    0.000 {_codecs.utf_8_decode}
            2    0.000    0.000    0.000    0.000 {isinstance}
            2    0.000    0.000    0.000    0.000 {method 'decode' of 'str' objects}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      5200000    3.692    0.000    3.692    0.000 {method 'get' of 'dict' objects}
      5200000   22.718    0.000   22.718    0.000 {method 'search' of '_sre.SRE_Pattern' objects}
            2    0.000    0.000    0.000    0.000 {method 'write' of '_io.StringIO' objects}
            4    0.000    0.000    0.000    0.000 {time.time}

    test_buffer took 58067.440 ms medium_processing
             31200103 function calls in 58.068 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000   58.068   58.068 <string>:1(<module>)
            3    0.000    0.000    0.000    0.000 __init__.py:185(dumps)
            3    0.000    0.000    0.000    0.000 encoder.py:102(__init__)
            3    0.000    0.000    0.000    0.000 encoder.py:180(encode)
            3    0.000    0.000    0.000    0.000 encoder.py:206(iterencode)
            1    0.000    0.000    0.001    0.001 iostream.py:37(flush)
            2    0.000    0.000    0.001    0.000 iostream.py:63(write)
            1    0.000    0.000    0.000    0.000 iostream.py:86(_new_buffer)
            3    0.000    0.000    0.000    0.000 jsonapi.py:57(_squash_unicode)
            3    0.000    0.000    0.000    0.000 jsonapi.py:69(dumps)
            2    0.000    0.000    0.000    0.000 jsonutil.py:78(date_default)
            1    0.000    0.000    0.000    0.000 os.py:743(urandom)
      5200000    6.814    0.000   39.110    0.000 re.py:139(search)
      5200000    7.853    0.000   10.878    0.000 re.py:228(_compile)
            1    0.000    0.000    0.000    0.000 session.py:149(msg_header)
            1    0.000    0.000    0.000    0.000 session.py:153(extract_header)
            1    0.000    0.000    0.000    0.000 session.py:315(msg_id)
            1    0.000    0.000    0.000    0.000 session.py:350(msg_header)
            1    0.000    0.000    0.000    0.000 session.py:353(msg)
            1    0.000    0.000    0.000    0.000 session.py:370(sign)
            1    0.000    0.000    0.000    0.000 session.py:385(serialize)
            1    0.000    0.000    0.001    0.001 session.py:437(send)
            3    0.000    0.000    0.000    0.000 session.py:75(<lambda>)
      5200000    4.732    0.000   43.842    0.000 slice_time.py:15(medium_processing)
            1    5.423    5.423   58.068   58.068 slice_time.py:24(timeit)
      5200000    8.802    0.000   52.645    0.000 slice_time.py:40(test_buffer)
            7    0.000    0.000    0.000    0.000 traitlets.py:268(__get__)
            2    0.000    0.000    0.000    0.000 utf_8.py:15(decode)
            1    0.000    0.000    0.000    0.000 uuid.py:101(__init__)
            1    0.000    0.000    0.000    0.000 uuid.py:197(__str__)
            1    0.000    0.000    0.000    0.000 uuid.py:531(uuid4)
            2    0.000    0.000    0.000    0.000 {_codecs.utf_8_decode}
            1    0.000    0.000    0.000    0.000 {built-in method now}
           18    0.000    0.000    0.000    0.000 {isinstance}
            4    0.000    0.000    0.000    0.000 {len}
            1    0.000    0.000    0.000    0.000 {locals}
            1    0.000    0.000    0.000    0.000 {map}
            2    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
            1    0.000    0.000    0.000    0.000 {method 'close' of '_io.StringIO' objects}
            1    0.000    0.000    0.000    0.000 {method 'count' of 'list' objects}
            2    0.000    0.000    0.000    0.000 {method 'decode' of 'str' objects}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
            1    0.000    0.000    0.000    0.000 {method 'extend' of 'list' objects}
      5200001    3.025    0.000    3.025    0.000 {method 'get' of 'dict' objects}
            1    0.000    0.000    0.000    0.000 {method 'getvalue' of '_io.StringIO' objects}
            3    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
      5200000   21.418    0.000   21.418    0.000 {method 'search' of '_sre.SRE_Pattern' objects}
            1    0.000    0.000    0.000    0.000 {method 'send_multipart' of 'zmq.core.socket.Socket' objects}
            2    0.000    0.000    0.000    0.000 {method 'strftime' of 'datetime.date' objects}
            1    0.000    0.000    0.000    0.000 {method 'update' of 'dict' objects}
            2    0.000    0.000    0.000    0.000 {method 'write' of '_io.StringIO' objects}
            1    0.000    0.000    0.000    0.000 {posix.close}
            1    0.000    0.000    0.000    0.000 {posix.open}
            1    0.000    0.000    0.000    0.000 {posix.read}
            4    0.000    0.000    0.000    0.000 {time.time}
Smaller slices (length 100):
    test_part_slice took 54916.153 ms medium_processing
             31200019 function calls in 54.916 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000   54.916   54.916 <string>:1(<module>)
            2    0.000    0.000    0.000    0.000 iostream.py:63(write)
      5200000    6.788    0.000   38.312    0.000 re.py:139(search)
      5200000    8.014    0.000   11.257    0.000 re.py:228(_compile)
      5200000    4.722    0.000   43.034    0.000 slice_time.py:15(medium_processing)
            1    5.594    5.594   54.916   54.916 slice_time.py:24(timeit)
      5200000    6.288    0.000   49.322    0.000 slice_time.py:31(test_part_slice)
            2    0.000    0.000    0.000    0.000 utf_8.py:15(decode)
            2    0.000    0.000    0.000    0.000 {_codecs.utf_8_decode}
            2    0.000    0.000    0.000    0.000 {isinstance}
            2    0.000    0.000    0.000    0.000 {method 'decode' of 'str' objects}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      5200000    3.242    0.000    3.242    0.000 {method 'get' of 'dict' objects}
      5200000   20.268    0.000   20.268    0.000 {method 'search' of '_sre.SRE_Pattern' objects}
            2    0.000    0.000    0.000    0.000 {method 'write' of '_io.StringIO' objects}
            4    0.000    0.000    0.000    0.000 {time.time}

    test_buffer took 62019.684 ms medium_processing
             31200103 function calls in 62.020 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000   62.020   62.020 <string>:1(<module>)
            3    0.000    0.000    0.000    0.000 __init__.py:185(dumps)
            3    0.000    0.000    0.000    0.000 encoder.py:102(__init__)
            3    0.000    0.000    0.000    0.000 encoder.py:180(encode)
            3    0.000    0.000    0.000    0.000 encoder.py:206(iterencode)
            1    0.000    0.000    0.001    0.001 iostream.py:37(flush)
            2    0.000    0.000    0.001    0.000 iostream.py:63(write)
            1    0.000    0.000    0.000    0.000 iostream.py:86(_new_buffer)
            3    0.000    0.000    0.000    0.000 jsonapi.py:57(_squash_unicode)
            3    0.000    0.000    0.000    0.000 jsonapi.py:69(dumps)
            2    0.000    0.000    0.000    0.000 jsonutil.py:78(date_default)
            1    0.000    0.000    0.000    0.000 os.py:743(urandom)
      5200000    7.426    0.000   41.152    0.000 re.py:139(search)
      5200000    8.470    0.000   11.628    0.000 re.py:228(_compile)
            1    0.000    0.000    0.000    0.000 session.py:149(msg_header)
            1    0.000    0.000    0.000    0.000 session.py:153(extract_header)
            1    0.000    0.000    0.000    0.000 session.py:315(msg_id)
            1    0.000    0.000    0.000    0.000 session.py:350(msg_header)
            1    0.000    0.000    0.000    0.000 session.py:353(msg)
            1    0.000    0.000    0.000    0.000 session.py:370(sign)
            1    0.000    0.000    0.000    0.000 session.py:385(serialize)
            1    0.000    0.000    0.001    0.001 session.py:437(send)
            3    0.000    0.000    0.000    0.000 session.py:75(<lambda>)
      5200000    5.399    0.000   46.551    0.000 slice_time.py:15(medium_processing)
            1    5.958    5.958   62.020   62.020 slice_time.py:24(timeit)
      5200000    9.510    0.000   56.061    0.000 slice_time.py:40(test_buffer)
            7    0.000    0.000    0.000    0.000 traitlets.py:268(__get__)
            2    0.000    0.000    0.000    0.000 utf_8.py:15(decode)
            1    0.000    0.000    0.000    0.000 uuid.py:101(__init__)
            1    0.000    0.000    0.000    0.000 uuid.py:197(__str__)
            1    0.000    0.000    0.000    0.000 uuid.py:531(uuid4)
            2    0.000    0.000    0.000    0.000 {_codecs.utf_8_decode}
            1    0.000    0.000    0.000    0.000 {built-in method now}
           18    0.000    0.000    0.000    0.000 {isinstance}
            4    0.000    0.000    0.000    0.000 {len}
            1    0.000    0.000    0.000    0.000 {locals}
            1    0.000    0.000    0.000    0.000 {map}
            2    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
            1    0.000    0.000    0.000    0.000 {method 'close' of '_io.StringIO' objects}
            1    0.000    0.000    0.000    0.000 {method 'count' of 'list' objects}
            2    0.000    0.000    0.000    0.000 {method 'decode' of 'str' objects}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
            1    0.000    0.000    0.000    0.000 {method 'extend' of 'list' objects}
      5200001    3.158    0.000    3.158    0.000 {method 'get' of 'dict' objects}
            1    0.000    0.000    0.000    0.000 {method 'getvalue' of '_io.StringIO' objects}
            3    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
      5200000   22.097    0.000   22.097    0.000 {method 'search' of '_sre.SRE_Pattern' objects}
            1    0.000    0.000    0.000    0.000 {method 'send_multipart' of 'zmq.core.socket.Socket' objects}
            2    0.000    0.000    0.000    0.000 {method 'strftime' of 'datetime.date' objects}
            1    0.000    0.000    0.000    0.000 {method 'update' of 'dict' objects}
            2    0.000    0.000    0.000    0.000 {method 'write' of '_io.StringIO' objects}
            1    0.000    0.000    0.000    0.000 {posix.close}
            1    0.000    0.000    0.000    0.000 {posix.open}
            1    0.000    0.000    0.000    0.000 {posix.read}
            4    0.000    0.000    0.000    0.000 {time.time}