Making all of those sliced strings from long inputs is hugely time-consuming. It's at least O(N^2) (you create N strings of length N, and each one has to be copied into memory with its data taken from the original), which dwarfs the actual sorting and makes its performance irrelevant. Not to mention the memory requirement!
Instead of actually slicing the string, the next thought is to sort the i values used to create the cyclic strings, in the order that the resulting strings would compare - without actually creating them. This turns out to be a little tricky. (Removed/edited some incorrect material here; see @TimPeters' answer.)
The approach I take here is to bypass the standard library - which makes it hard (though not impossible) to compare those strings "on demand" - and do my own sorting. The natural algorithm choice is radix sort, since we need to consider the strings one character at a time anyway.
Let's get set up first. I am writing code for version 3.2, so season to taste. (In particular, in 3.3 and up we could make use of yield from.) I use the following imports:
```python
from random import choice
from timeit import timeit
from functools import partial
```
I wrote a general-purpose radix sort function like so:
```python
def radix_sort(values, key, step=0):
    if len(values) < 2:
        for value in values:
            yield value
        return
    bins = {}
    for value in values:
        bins.setdefault(key(value, step), []).append(value)
    for k in sorted(bins.keys()):
        for r in radix_sort(bins[k], key, step + 1):
            yield r
```
Of course, we don't need to be general-purpose here (our "bins" can only be labelled with single characters, and presumably you really mean to apply the algorithm to a sequence of bytes ;) ), but it doesn't hurt. Might as well have something reusable, right? Anyway, the idea is simple: we handle the base case, then we drop each element into a "bin" according to the result of the key function, and then we pull the values out of the bins in sorted bin order, recursively sorting each bin's contents.
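For instance, the generic version can sort a list of equal-length strings directly, using a plain per-character key (the definition is repeated here so the snippet runs standalone):

```python
def radix_sort(values, key, step=0):
    # Same generator as above: bucket by the current "digit",
    # then recurse into each bucket in sorted key order.
    if len(values) < 2:
        for value in values:
            yield value
        return
    bins = {}
    for value in values:
        bins.setdefault(key(value, step), []).append(value)
    for k in sorted(bins.keys()):
        for r in radix_sort(bins[k], key, step + 1):
            yield r

# Per-character key: the nth "digit" of a string is just its nth character.
words = ['bca', 'abc', 'bab', 'aab']
print(list(radix_sort(words, lambda v, n: v[n])))  # ['aab', 'abc', 'bab', 'bca']
```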
The interface requires that key(value, n) give us the nth "radix" of value. So for simple cases, like comparing strings directly, that could be as simple as lambda v, n: v[n]. Here, though, the idea is to compare indices into the string, according to the data in the string at that point (considered cyclically). So let's define a key:
```python
def bw_key(text, value, step):
    return text[(value + step) % len(text)]
```
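A quick worked example of the wraparound, using 'banana$' as a stand-in input: the rotation starting at index 5 is 'a$banan', and bw_key reads any character of that rotation without ever building it:

```python
def bw_key(text, value, step):
    return text[(value + step) % len(text)]

text = 'banana$'
# The rotation starting at index 5 is text[5:] + text[:5] == 'a$banan'.
rotation = text[5:] + text[:5]
# bw_key reads that rotation's character at position `step` directly
# from the original text, via modular indexing:
print(bw_key(text, 5, 3), rotation[3])  # both are 'a'
```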
Now the trick to getting the right results is to remember that we're conceptually joining up the last characters of the strings we aren't actually creating. If we consider the virtual string made with index n, its last character is at index n - 1, because of how the string wraps around - and a moment's thought confirms this still works when n == 0 ;). [When we wrap around forwards, though, we still need to keep the string index in bounds - hence the modulo operation in the key function.]
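To see that last-character fact concretely (again using 'banana$' as a stand-in input): the rotation starting at index i always ends with text[i - 1], including at i == 0 thanks to Python's negative indexing:

```python
text = 'banana$'
for i in range(len(text)):
    rotation = text[i:] + text[:i]      # the cyclic string starting at i
    assert rotation[-1] == text[i - 1]  # its last character is text[i - 1]
# This holds at i == 0 too, since text[-1] is the final character '$'.
```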
This is a generic key function that needs to be passed the text it will refer to when it converts a value for comparison. That's where functools.partial comes in - you could also just mess around with lambda, but this is arguably cleaner, and I've found it's usually faster, too.
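A sketch of the two equivalent ways to bind text to the key (with 'banana$' as a hypothetical input):

```python
from functools import partial

def bw_key(text, value, step):
    return text[(value + step) % len(text)]

text = 'banana$'
key_p = partial(bw_key, text)            # partial-bound key
key_l = lambda v, s: bw_key(text, v, s)  # equivalent lambda
assert key_p(5, 3) == key_l(5, 3)        # both give the same "radix"
```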
Anyway, now we can easily write the actual transform with the key:
```python
def burroughs_wheeler_custom(text):
    return ''.join(text[i - 1] for i in radix_sort(range(len(text)), partial(bw_key, text)))
```
Nice and tidy. Let's see how it does, shall we? We need a baseline to compare against:
```python
def burroughs_wheeler_standard(text):
    return ''.join([i[-1] for i in sorted([text[i:] + text[:i] for i in range(len(text))])])
```
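As a quick correctness check before timing anything, both versions should agree on the usual textbook example, 'banana$', whose transform is 'annb$aa' (the definitions are repeated so the snippet runs standalone):

```python
from functools import partial

def radix_sort(values, key, step=0):
    if len(values) < 2:
        for value in values:
            yield value
        return
    bins = {}
    for value in values:
        bins.setdefault(key(value, step), []).append(value)
    for k in sorted(bins.keys()):
        for r in radix_sort(bins[k], key, step + 1):
            yield r

def bw_key(text, value, step):
    return text[(value + step) % len(text)]

def burroughs_wheeler_custom(text):
    return ''.join(text[i - 1] for i in radix_sort(range(len(text)), partial(bw_key, text)))

def burroughs_wheeler_standard(text):
    return ''.join([i[-1] for i in sorted([text[i:] + text[:i] for i in range(len(text))])])

# 'banana$' is the classic worked example for the transform.
assert burroughs_wheeler_custom('banana$') == burroughs_wheeler_standard('banana$') == 'annb$aa'
```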
And a timing routine:
```python
def test(n):
    data = ''.join(choice('abcdefghijklmnopqrstuvwxyz') for i in range(n)) + '$'
    custom = partial(burroughs_wheeler_custom, data)
    standard = partial(burroughs_wheeler_standard, data)
    assert custom() == standard()
    # Scale the number of trials inversely with the string length,
    # so the total testing time stays roughly constant.
    trials = 1000000 // n
    print('custom:', timeit(custom, number=trials))
    print('standard:', timeit(standard, number=trials))
```
Notice the math I did to make the number of trials inversely related to the length of the test string. This should keep the total time spent on testing in a reasonably narrow range - right? ;) (Wrong, of course, since we established that the standard algorithm is at least O(N^2).)
Let's see how it does (*drumroll*):
```
>>> imp.reload(burroughs_wheeler)
<module 'burroughs_wheeler' from 'burroughs_wheeler.py'>
>>> burroughs_wheeler.test(100)
custom: 4.7095093091438684
standard: 0.9819262643716229
>>> burroughs_wheeler.test(1000)
custom: 5.532266880287807
standard: 2.1733253807396977
>>> burroughs_wheeler.test(10000)
custom: 5.954826800612864
standard: 42.50686064849015
```
Whoa, that's a scary-looking jump at the end. Anyway, as you can see, the new approach adds a hefty constant overhead on short strings, but it lets the actual sorting be the bottleneck instead of the string slicing. :)