Question about algorithmic analysis

NOTE: I am a complete newcomer to the analysis of algorithms, so please do not take any of my statements as absolute truth; anything (or everything) I claim here may be wrong.

Hi, I have been reading about the analysis of algorithms and "Big-O notation", and I was puzzled by something.

Suppose you are asked to print all two-character permutations of a char array; for [a, b, c] they would be ab, ac, ba, bc, ca and cb.


Well, one way to do this would be (in Java):

    for (int i = 0; i < arr.length; i++)
        for (int q = 0; q < arr.length; q++)
            if (i != q)
                System.out.println(arr[i] + " " + arr[q]);

This algorithm has complexity O(n²), if I'm right.


I was thinking of another way of doing this:

    for (int i = 0; i < arr.length; i++)
        for (int q = i + 1; q < arr.length; q++) {
            System.out.println(arr[i] + " " + arr[q]);
            System.out.println(arr[q] + " " + arr[i]);
        }

Now this algorithm is twice as fast as the original one, but unless I am mistaken, in Big-O notation it is also O(n²).


Is that right? Or perhaps it is not, so let me rephrase: where am I mistaken?

+4
11 answers

You're right. Big-O notation tells you how an algorithm scales, not its absolute speed. As the input grows, both solutions scale the same way, but one will always be twice as fast as the other.

O(n) operations can also be slower than O(n^2) operations for sufficiently small n. Imagine your O(n) calculation involves taking 5 square roots per element, while your O(n^2) solution is a single comparison per pair. The O(n^2) algorithm will be faster for small data sets. But when n = 1000 you are doing 5,000 square roots versus 1,000,000 comparisons, and the O(n) version starts to look a lot better.
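As a rough sketch of that idea (the routines and their costs below are made up purely for illustration; drop them into any class to try them):

    // Illustration only: a linear routine with a heavy constant cost per element
    // versus a quadratic routine with a very cheap operation per pair.
    static double linearButExpensive(double[] data) {
        double sum = 0;
        for (double d : data) {
            // five square roots per element: a big constant factor
            sum += Math.sqrt(d) + Math.sqrt(d + 1) + Math.sqrt(d + 2)
                 + Math.sqrt(d + 3) + Math.sqrt(d + 4);
        }
        return sum;                             // ~5n square roots in total
    }

    static long quadraticButCheap(double[] data) {
        long count = 0;
        for (int i = 0; i < data.length; i++)
            for (int j = 0; j < data.length; j++)
                if (data[i] < data[j]) count++; // a single comparison per pair
        return count;                           // ~n^2 comparisons in total
    }

For small arrays quadraticButCheap tends to win; by n = 1000 it is doing 1,000,000 comparisons against 5,000 square roots, and linearButExpensive pulls ahead.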

+13

I think most people agree that the first one is O(n^2). The outer loop runs n times, and the inner loop runs n times for each iteration of the outer loop. So the runtime is O(n * n), i.e. O(n^2).

The second is also O(n^2). The outer loop runs n times, and the inner loop runs anywhere from n-1 down to 0 times; on average it executes about n/2 times per outer iteration. Therefore the running time of this algorithm is O(n * n/2) => O(1/2 * n^2) => O(n^2).
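A quick way to convince yourself of that n/2 average (a small sketch of my own, not part of the original answer) is simply to count the iterations of both loop structures:

    // Counts how many times each loop body executes for a given n.
    static void countIterations(int n) {
        long full = 0, half = 0;
        for (int i = 0; i < n; i++)
            for (int q = 0; q < n; q++)
                if (i != q) full++;        // first version: n * (n - 1) iterations
        for (int i = 0; i < n; i++)
            for (int q = i + 1; q < n; q++)
                half++;                    // second version: n * (n - 1) / 2 iterations
        // For n = 1000 this prints 999000 vs 499500: a ratio of exactly 2.
        System.out.println(full + " vs " + half);
    }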

+3

Big-O notation says nothing about the absolute speed of an algorithm, only about how its running time changes as the input size changes.

An algorithm may be O(1) yet take a million years. Another algorithm may be O(n^2) yet be faster than an O(n) algorithm for small n.

Some answers to this question may help with this aspect of Big-O notation. The answers to this one may also be useful.

+2

Setting aside the details of your permutation program:

Big-O notation ignores constant factors, and 2 is a constant factor.

So there is nothing wrong with a program being twice as fast as the original while still having the same O().

+1

You're right. Two algorithms are equivalent in Big-O notation even if one of them takes a constant amount longer ("A takes 5 minutes more than B"), or a constant factor longer ("A takes 5 times as long as B"), or both ("A takes 2 times as long as B, plus an extra 30 milliseconds"), for all input sizes.

Here is an example that uses a FUNDAMENTALLY different algorithm to solve a similar problem. First, a slower version that looks a lot like your original example:

    boolean arraysHaveAMatch = false;
    for (int i = 0; i < arr1.length; i++) {
        for (int j = i; j < arr2.length; j++) {
            if (arr1[i] == arr2[j]) {
                arraysHaveAMatch = true;
            }
        }
    }

This has O(n^2) behavior, just like your original (it even uses the same shortcut you discovered of starting index j from index i instead of 0). Now here is a different approach:

    boolean arraysHaveAMatch = false;
    Set<Integer> set = new HashSet<Integer>();
    for (int i = 0; i < arr1.length; i++) {
        set.add(arr1[i]);
    }
    for (int j = 0; j < arr2.length; j++) {
        if (set.contains(arr2[j])) {
            arraysHaveAMatch = true;
        }
    }

Now, if you try to run these, you will probably find that the first version runs FASTER, at least if you try it on arrays of length 10, because the second version has to create the HashSet object and all its internal data structures, and has to compute a hash code for every integer. HOWEVER, if you try it with arrays of length 10,000,000, you will find a TOTALLY different story. The first version has to examine about 50,000,000,000,000 pairs of numbers (about (N * N) / 2); the second version has to perform hash computations on about 20,000,000 numbers (about 2 * N). In that case, you definitely want the second version!
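If you want to see that crossover for yourself, a rough benchmark along these lines should show it (the array size, contents and class name are arbitrary choices of mine; try a small n and then a much larger one):

    import java.util.HashSet;
    import java.util.Random;
    import java.util.Set;

    public class MatchTiming {
        public static void main(String[] args) {
            int n = 20_000;                    // small enough that the nested loops finish
            Random rnd = new Random(42);
            int[] arr1 = new int[n], arr2 = new int[n];
            for (int i = 0; i < n; i++) { arr1[i] = rnd.nextInt(); arr2[i] = rnd.nextInt(); }

            long t0 = System.nanoTime();
            boolean slowResult = false;
            for (int i = 0; i < arr1.length; i++)        // ~n^2 comparisons
                for (int j = 0; j < arr2.length; j++)
                    if (arr1[i] == arr2[j]) slowResult = true;
            long t1 = System.nanoTime();

            Set<Integer> set = new HashSet<Integer>();   // ~2n hash operations
            for (int i = 0; i < arr1.length; i++) set.add(arr1[i]);
            boolean fastResult = false;
            for (int j = 0; j < arr2.length; j++)
                if (set.contains(arr2[j])) fastResult = true;
            long t2 = System.nanoTime();

            System.out.println("nested loops: " + (t1 - t0) / 1_000_000 + " ms, match = " + slowResult);
            System.out.println("hash set:     " + (t2 - t1) / 1_000_000 + " ms, match = " + fastResult);
        }
    }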

The basic idea behind Big-O reasoning is that (1) it is easy to compute (you don't have to worry about details like how fast your CPU is or what kind of L2 cache it has), and (2) who cares about small problems... they are fast enough anyway: it is the BIG problems that will kill you! These assumptions are not always true (sometimes the cache does matter, and sometimes it matters how well things perform on small data sets), but they are close enough to the truth often enough that Big-O is useful.

+1

One way to think about Big-O is to consider how the different algorithms would fare even under really unfair circumstances: for example, if one of them were running on a really powerful supercomputer and the other on a wristwatch. If you can choose an N so large that, even though the worse algorithm is running on the supercomputer, the wristwatch still finishes first, then the two algorithms have different Big-O complexities. If, on the other hand, the supercomputer always wins, no matter which algorithm you give it or how big you make N, then by definition both algorithms have the same complexity.

In your algorithms, the faster one was only twice as fast as the first. That is not enough for the wristwatch to beat the supercomputer: even if N were enormous, 1 million, 1 trillion, or even Graham's number, the wristwatch could never beat the supercomputer with that algorithm. The same would be true if they swapped algorithms. Therefore, by the definition of Big-O, the two algorithms have the same complexity.

0

You are right that both of them are Big-O n squared, and you actually proved it yourself in your question when you said, "Now this algorithm is twice as fast as the original." Twice as fast means multiplied by 1/2, which is a constant, so by definition they belong to the same Big-O set.

0

Suppose I had an algorithm that did the same thing in O(n) time, and suppose I gave you an array of 10,000 characters. Your algorithms would take n^2 and (1/2)n^2 time, which is 100,000,000 and 50,000,000 steps. My algorithm would take 10,000. Obviously the factor of 1/2 makes no difference, since mine is so much faster. The n^2 term is said to dominate the lesser terms, such as n and the constant 1/2, which makes them negligible.
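Just to spell out those numbers (my own illustration):

    int n = 10_000;
    System.out.println((long) n * n);       // 100,000,000  -> the first algorithm (n^2)
    System.out.println((long) n * n / 2);   //  50,000,000  -> the second algorithm (n^2 / 2)
    System.out.println(n);                  //      10,000  -> the hypothetical O(n) algorithm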

0

Big-O notation denotes families of functions, so just saying "this thing is O(n²)" strictly means nothing on its own.

This is not pedantry; it is the only correct way to understand these things.

O(f) = { g | there exist x_0 and c such that for all x > x_0, g(x) <= f(x) * c }

Now consider the number of steps your algorithm takes in the worst case as a function of the input size: call this function f. If f ∈ O(n²), you can say your algorithm has a worst case of O(n²) (but it is also in O(n³) and O(2^n)). The fact that constants are ignored follows directly from the definition (see that c?).
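For a concrete instance (my own worked example): the questioner's second algorithm takes roughly g(n) = n²/2 steps, and g ∈ O(n²) because g(x) = x²/2 <= 1 * x² for every x > 0, so the definition is satisfied with c = 1 and x_0 = 0. The same choice of c absorbs any constant factor, which is exactly why the factor of 2 disappears.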

0

The best way to understand Big-O notation is to get a mathematical grasp of the idea behind it. Look up the dictionary definition of "asymptote":

 A line which approaches nearer to some curve than assignable distance, but, though infinitely extended, would never meet it. 

This describes a bound on the execution time ("imaginary", because the asymptote only meets the curve at infinity): whatever the algorithm does will stay within that bound. With this idea in mind, you may also want to look up Big-O, little-o and Omega notation.

0

Always remember that Big-O notation describes the "worst case" scenario. In your example, the first algorithm's average case is a full outer loop times a full inner loop, so it is n^2, of course. The second case, since it has one instance where it is almost a full inner loop times a full outer loop, gets lumped into the same n^2 pile, because that is its worst case. From there it only gets better, and its average compared to the first function is much lower. Regardless, as n grows, the running time of both functions grows quadratically, and that is really all Big-O tells you. The growth curves can vary widely, but at the end of the day they are all of the same type.

-1
