Running time using Big Θ notation

    def filter(f, lst):
        if lst == []:
            return []
        if f(lst[0]):
            return [lst[0]] + filter(f, lst[1:])
        return filter(f, lst[1:])

    def my_reverse(lst):
        # Reverse the list
        def reverse_helper(x, y):
            if x == []:
                return y
            return reverse_helper(x[1:], [x[0]] + y)
        return reverse_helper(lst, [])

    def revfilter_alpha(f, lst):
        # Reverse and filter ...
        return my_reverse(filter(f, lst))

    def revfilter_beta(f, lst):
        # Reverse and filter ...
        if lst == []:
            return []
        return revfilter_beta(f, lst[1:]) + ([lst[0]] if f(lst[0]) else [])

Can someone explain how to determine the running time of these functions in Big Θ notation? I have read a lot, but I still don't know where to start.

For filter , I think it is Θ(n^2) because it checks every element in a list of size n with the predicate function f over n recursive calls, so n * n.

revfilter_beta looks pretty similar, it just reverses while filtering, so wouldn't it also be Θ(n^2)?

revfilter_alpha filters and then reverses, so wouldn't that be n^2 * n^2 = Θ(n^4)?

Does anyone have any thoughts?

+6

2 answers

filter makes n recursive calls, but each call also does a copy operation (the slice and the concatenation) that takes n, so you get Θ(n^2). If you implemented it "correctly", it would be Θ(n).
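For contrast, here is a minimal sketch (not from the original post) of a filter that does O(1) work per element by appending to an accumulator instead of slicing and concatenating, giving Θ(n):

```python
def filter_linear(f, lst):
    """Θ(n) filter: one O(1) append per element, no slicing, no concatenation."""
    out = []
    for e in lst:
        if f(e):
            out.append(e)  # amortized O(1), unlike [e] + rest, which copies
    return out
```
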

The same goes for my_reverse .

The same goes for revfilter_beta .

revfilter_alpha just does a filter and then a reverse , so Θ(n^2 + n^2) = Θ(n^2).


EDIT: let's look at filter a bit more.

What you want to find out is how many operations are performed relative to the size of the input. O(n) means that in the worst case you will execute on the order of n operations. I say "on the order of" because you could, for example, perform n/2 or 4n operations, but n is the dominant factor. That is, as n grows, the constant factor matters less and less, so we look only at the non-constant factor ( n in this case).

So how many operations does filter perform on a list of size n ?

Build it up from the bottom. What if n is 0 - an empty list? Then it just returns an empty list. So let's call that 1 operation.

What if n is 1? It checks whether lst[0] should be included, which requires a call to f - and then it copies the rest of the list and makes a recursive call on that copy, which in this case is an empty list. So filter(1) takes f + copy(0) + filter(0) operations, where copy(n) is the time it takes to copy a list of length n , and f is the time it takes to check whether an item should be included (assume it takes the same amount of time for every item).

What about filter(2) ? It performs one check, then copies the remaining list and calls filter on it: f + copy(1) + filter(1) .

You can already see the pattern: filter(n) takes f + copy(n-1) + filter(n-1) .

Now copy(n) is just n - it takes n operations to copy a list of n elements. So we can simplify: filter(n) = f + n-1 + filter(n-1) .

Now you can try expanding filter(n-1) a few times to see what happens:

    filter(n) = f + n-1 + filter(n-1)
              = f + n-1 + (f + n-2 + filter(n-2))
              = 2f + 2n-3 + filter(n-2)
              = 2f + 2n-3 + (f + n-3 + filter(n-3))
              = 3f + 3n-6 + filter(n-3)
              = 3f + 3n-6 + (f + n-4 + filter(n-4))
              = 4f + 4n-10 + filter(n-4)
              = 5f + 5n-15 + filter(n-5)
              ...

Can we generalize to x expansions? The sequence 1, 3, 6, 10, 15, ... is the triangular numbers, i.e. 1 , 1+2 , 1+2+3 , 1+2+3+4 , and so on. The sum of all numbers from 1 to x is x*(x+1)/2 .
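A quick sanity check of that identity (a throwaway snippet, not part of the original answer):

```python
# The triangular numbers 1, 3, 6, 10, 15 are the sums 1, 1+2, 1+2+3, ...
triangles = [sum(range(1, x + 1)) for x in range(1, 6)]
assert triangles == [1, 3, 6, 10, 15]
# ... and each one equals x*(x+1)/2
assert triangles == [x * (x + 1) // 2 for x in range(1, 6)]
```
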

    = x*f + x*n - x*(x+1)/2 + filter(n-x)

Now what is x ? How many expansions will there be? You can see that when x = n there is no more recursion - filter(n-n) = filter(0) = 1 . So our formula is now:

    filter(n) = n*f + n*n - n*(n+1)/2 + 1

Which we can simplify further:

    filter(n) = n*f + n^2 - (n^2 + n)/2 + 1
              = n*f + n^2 - n^2/2 - n/2 + 1
              = (1/2)n^2 + (f - 1/2)n + 1
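You can check the closed form against the recurrence directly (an illustrative snippet, assuming each call to f costs 1):

```python
def t_filter(n, f=1):
    # The recurrence from above: T(n) = f + (n-1) + T(n-1), with T(0) = 1
    return 1 if n == 0 else f + (n - 1) + t_filter(n - 1, f)

def closed_form(n, f=1):
    # n*f + n^2 - n*(n+1)/2 + 1, i.e. (1/2)n^2 + (f - 1/2)n + 1
    return n * f + n * n - n * (n + 1) // 2 + 1

# The two agree for every n, confirming the derivation
assert all(t_filter(n) == closed_form(n) for n in range(200))
```
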

So there you have it - a fairly detailed analysis. That would be Θ((1/2)n^2 + (f - 1/2)n + 1) ... and assuming the cost of f is constant (say f = 1), it comes to Θ((1/2)n^2 + (1/2)n + 1) , which is Θ(n^2) .

Now you'll notice that if copy(n) took constant time instead of linear time (if copy(n) were 1 instead of n ), you would not get that n^2 term.

I admit that when I said Θ(n^2) initially, I did not do all of this in my head. Rather, I reasoned: you have n recursive steps, and each step takes on the order of n time due to copy . n*n = n^2 , thus Θ(n^2) . To be more precise, the list shrinks at every step, so you really have (n-1) + (n-2) + ... + 1 + 0 copy operations, which produces the same pattern as above: n*(n-1)/2 = (1/2)n^2 - (1/2)n , the same as if I had used 0 for f above. Similarly, if you had n steps but each step cost 1 instead of n (if you did not need to copy the list), you would have 1 + 1 + 1 + ... + 1 , n times, or just n .

But that requires a bit more intuition, so I figured I would also show you the brute-force method, which you can apply to anything.

+5

All your functions are O(N^2) , because each recursive step takes O(N) time and there are N steps for a list of length N.

There are two expensive (i.e., O(N) ) operations in your functions. The first is slicing (for example, lst[1:] ). The second is list concatenation (the + operator).

Both can be more expensive than you expect, mainly because Python lists are not like the list data types in other languages. Under the hood they are arrays, not linked lists. On linked lists the operations above can be done in O(1) time (although O(1) concatenation is destructive). In Lisp, for example, the algorithms you wrote would be O(N) , not O(N^2) .
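To see the difference, here is a hypothetical cons-cell encoding of linked lists in Python (each cell is a (head, tail) tuple; the names below are made up for illustration). Taking the tail is O(1) and nothing is copied, so filtering this way is O(N) overall - and it naturally produces the result in reverse, like revfilter:

```python
def from_list(lst):
    """Build a cons list (head, tail) from a Python list."""
    out = None
    for e in reversed(lst):
        out = (e, out)
    return out

def revfilter_cons(f, cons_lst):
    """O(N) filter over a cons list; the result comes out in reverse order."""
    out = None
    while cons_lst is not None:
        head, cons_lst = cons_lst  # O(1) head/tail access, no copying
        if f(head):
            out = (head, out)      # O(1) prepend
    return out

def to_list(cons_lst):
    """Convert a cons list back to a Python list."""
    out = []
    while cons_lst is not None:
        head, cons_lst = cons_lst
        out.append(head)
    return out
```

For example, to_list(revfilter_cons(lambda x: x % 2 == 0, from_list([1, 2, 3, 4, 5, 6]))) gives [6, 4, 2] : the even elements, reversed, in linear time.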

Recursion is also often suboptimal in Python, because there is no tail call elimination . The default recursion limit in recent versions of Python is 1000, so long lists will break purely recursive solutions unless you dig into the sys module ( sys.setrecursionlimit ) to raise the limit.

You can write O(N) versions of these algorithms in Python, but you have to avoid the expensive list operations. Instead of recursion, I suggest using generators, which are a much more "Pythonic" style of programming.

Filtering with a generator is very simple. The built-in filter function already does this, but you can write your own in just a few lines:

    def my_filter(f, iterable):
        for e in iterable:
            if f(e):
                yield e

Reversing is a little trickier, because you need either random access to the source or O(N) extra space (your algorithm uses the call stack for that space, even though lists are sequences and do support random access). The built-in reversed function works only on sequences, but here is a version that works on any iterable (for example, another generator):

    def my_reversed(iterable):
        storage = list(iterable)  # consumes all the input!
        for i in range(len(storage) - 1, -1, -1):
            yield storage[i]

Note that, unlike many generators, this one consumes all of its input before it begins to produce results. Do not run it on infinite input!

You can compose them in either order, and my_reversed(my_filter(f, lst)) should be equivalent to my_filter(f, my_reversed(lst)) (although for the latter, using the built-in reversed is probably better).

The running time of both generators above (and of their composition in either order) is O(N) .
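Repeating the two generators from this answer, here is a quick check that the two compositions agree:

```python
def my_filter(f, iterable):
    for e in iterable:
        if f(e):
            yield e

def my_reversed(iterable):
    storage = list(iterable)  # consumes all the input!
    for i in range(len(storage) - 1, -1, -1):
        yield storage[i]

nums = [1, 2, 3, 4, 5, 6]
a = list(my_reversed(my_filter(lambda x: x % 2 == 0, nums)))
b = list(my_filter(lambda x: x % 2 == 0, my_reversed(nums)))
assert a == b == [6, 4, 2]  # both orders give the same result
```
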

+2

Source: https://habr.com/ru/post/927412/

