Python quicksort - list comprehension vs recursion (partition routine)

I watched the talk "Three Beautiful Quicksorts" and was fiddling around with quicksort. My Python implementation was very similar to the C one (choose a pivot, partition around it, and recurse into the smaller and larger partitions), which I thought was not Pythonic.

So here is an implementation using a list comprehension in Python.

```python
def qsort(list):
    if list == []:
        return []
    pivot = list[0]
    l = qsort([x for x in list[1:] if x < pivot])
    u = qsort([x for x in list[1:] if x >= pivot])
    return l + [pivot] + u
```

Let's call the recursive-partition method qsortR. I have now noticed that qsortR is much slower than qsort for large(r) lists. In fact, it fails with "maximum recursion depth exceeded in cmp" even for 1000 elements, which I worked around by raising the limit with sys.setrecursionlimit.

Some numbers:

```
list-compr 1000 elems 0.491770029068
recursion  1000 elems 2.24620914459
list-compr 2000 elems 0.992327928543
recursion  2000 elems 7.72630095482
```
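For reference, numbers like these can be reproduced with the standard timeit module. A minimal sketch (the qsort definition is the list-comprehension version from above, with the parameter renamed to avoid shadowing the builtin; the input size and value range are arbitrary choices):

```python
import random
import timeit

def qsort(lst):
    # list-comprehension quicksort from the question above
    if lst == []:
        return []
    pivot = lst[0]
    l = qsort([x for x in lst[1:] if x < pivot])
    u = qsort([x for x in lst[1:] if x >= pivot])
    return l + [pivot] + u

# time 10 runs on 1000 random integers and report the average per run
data = [random.randint(0, 10**6) for _ in range(1000)]
t = timeit.timeit(lambda: qsort(data), number=10)
print("list-compr 1000 elems %.6f" % (t / 10))
```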

All the code is here.

I have a couple of questions:

  • Why is list comprehension so much faster?
  • Some enlightenment on the recursion limit in Python, please. I set it to 100000 at first; in which cases should I be careful?
    • (What exactly is tail recursion optimization, and how is it done?)
  • Trying to sort 1,000,000 items exhausts the memory of my laptop (with the recursive-partition method). What if I want to sort that many items? What optimizations are possible?
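(As an aside on the tail-recursion sub-question: CPython performs no tail-call optimization, so even a function whose recursive call is in tail position grows the call stack. The sketch below, with hypothetical function names of my own, shows what such an optimization would do mechanically: replace the tail call with a loop.)

```python
def sum_to_tail(n, acc=0):
    # Tail-recursive sum: the recursive call is the LAST operation.
    # CPython does NOT optimize this, so large n still hits the
    # recursion limit and raises RecursionError.
    if n == 0:
        return acc
    return sum_to_tail(n - 1, acc + n)

def sum_to_loop(n):
    # The same computation with the tail call rewritten as a loop;
    # this transformation is what tail-call optimization performs.
    acc = 0
    while n > 0:
        acc += n
        n -= 1
    return acc

print(sum_to_loop(10**6))  # 500000500000; the recursive version would
                           # overflow the stack long before n = 10**6
```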
3 answers
  • Why is list comprehension so much faster?

    Because a list comprehension runs its loop in C, which is much faster than the general-purpose Python `for` block.

  • Some enlightenment at the limit of recursion in python. I set it to 100000 first, in which cases should I be careful?

    If you run out of memory.

  • Trying to sort 1,000,000 items exhausts my laptop's memory (using the recursive method). What if I want to sort that many items? What optimizations are possible?

    Recursion in Python carries this overhead because each function call allocates a new stack frame.

    In general, iteration is the answer (it will give better performance in 99% of use cases).

    Speaking of memory, if you have simple data such as characters, integers, or floats, use the built-in array.array , which is much more memory-efficient than list .
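To illustrate the last two points, here is a sketch (not from the answer itself) of an iterative quicksort driven by an explicit stack instead of recursion, followed by a quick array.array-vs-list size comparison. The partition scheme (Lomuto) and the input sizes are my own choices:

```python
import sys
from array import array

def qsort_iter(data):
    """Quicksort with an explicit stack of (lo, hi) ranges; no recursion."""
    data = list(data)  # sort a copy in place
    stack = [(0, len(data) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = data[hi]
        i = lo
        for j in range(lo, hi):  # Lomuto partition around data[hi]
            if data[j] < pivot:
                data[i], data[j] = data[j], data[i]
                i += 1
        data[i], data[hi] = data[hi], data[i]
        stack.append((lo, i - 1))  # pushed ranges replace recursive calls
        stack.append((i + 1, hi))
    return data

print(qsort_iter([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]

# array.array stores raw machine values inline, while a list stores
# pointers to separate int objects, so the array's buffer is smaller.
lst = list(range(100000))
arr = array('i', range(100000))  # 'i': 4-byte signed ints
print(sys.getsizeof(arr) < sys.getsizeof(lst))  # True
```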


Have you tried writing a non-recursive implementation of partition? I suspect the performance difference is purely due to the partition implementation: you recurse for each element in your implementation.

Update

Here is a quick implementation. It is still not very fast or even efficient, but it is much better than the original recursive one.

```python
>>> def partition(data):
...     pivot = data[0]
...     less, equal, greater = [], [], []
...     for elm in data:
...         if elm < pivot:
...             less.append(elm)
...         elif elm > pivot:
...             greater.append(elm)
...         else:
...             equal.append(elm)
...     return less, equal, greater
...
>>> def qsort2(data):
...     if data:
...         less, equal, greater = partition(data)
...         return qsort2(less) + equal + qsort2(greater)
...     return data
...
```

I also think that in the "traditional" version there are more temporary lists.


Try comparing a list comprehension to an in-place algorithm when memory pressure gets high. The code below achieves a comparable runtime when sorting 100K integers, but when sorting 1M integers you will probably hit a wall with the list-comprehension solution. I ran the tests on a machine with 4 GB of RAM. Full code: http://snipt.org/Aaaje2

```python
class QSort:
    def __init__(self, lst):
        self.lst = lst

    def sorted(self):
        self.qsort_swap(0, len(self.lst))
        return self.lst

    def qsort_swap(self, begin, end):
        if (end - begin) > 1:
            pivot = self.lst[begin]
            l = begin + 1
            r = end
            while l < r:
                if self.lst[l] <= pivot:
                    l += 1
                else:
                    r -= 1
                    self.lst[l], self.lst[r] = self.lst[r], self.lst[l]
            l -= 1
            self.lst[begin], self.lst[l] = self.lst[l], self.lst[begin]
            # print begin, end, self.lst
            self.qsort_swap(begin, l)
            self.qsort_swap(r, end)
```
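A quick sanity check of this in-place, swap-based partition: the function below is a standalone sketch of the same qsort_swap logic, so it can be run without the class.

```python
def qsort_swap(lst, begin, end):
    """Sort lst[begin:end] in place using the swap-based partition above."""
    if (end - begin) > 1:
        pivot = lst[begin]
        l, r = begin + 1, end
        # invariant: lst[begin+1:l] <= pivot, lst[r:end] > pivot
        while l < r:
            if lst[l] <= pivot:
                l += 1
            else:
                r -= 1
                lst[l], lst[r] = lst[r], lst[l]
        l -= 1
        lst[begin], lst[l] = lst[l], lst[begin]  # move pivot into place
        qsort_swap(lst, begin, l)
        qsort_swap(lst, r, end)

data = [5, 3, 8, 1, 9, 2]
qsort_swap(data, 0, len(data))
print(data)  # [1, 2, 3, 5, 8, 9]
```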
