Big O in the absence of sufficient information

So let's say you have a function, X(N) , that is a complete black box. You don't know its growth rate, you can't work it out, and (for the moment) you can't view its source.

Next, consider it in the context of calling code:

 for(int i = 0; i < N; ++i) X(N); 

The code you wrote is linear, but obviously X affects the overall growth rate. If, for example, X(N) expands to for(int i = 0; i < N; ++i) , your code is quadratic.
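To make this concrete, here is a sketch of my own (the real X is a black box; this x is a hypothetical implementation containing a linear loop), with a counter showing that the combined code performs N*N innermost operations:

```java
// Hypothetical X(N): its body happens to be a linear loop.
// A counter on the innermost operation reveals the combined growth rate.
public class BlackBoxDemo {
    static long ops = 0;                 // counts innermost operations

    static void x(int n) {               // stand-in for the black box X(N)
        for (int i = 0; i < n; ++i) ops++;
    }

    // The caller's "linear" loop from the question.
    static long run(int n) {
        ops = 0;
        for (int i = 0; i < n; ++i) x(n);
        return ops;                      // N calls * N ops each = N * N
    }

    public static void main(String[] args) {
        System.out.println(run(10));     // prints 100: quadratic overall
    }
}
```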

My question is: if someone asks what the Big O of your code is, what is the best way to describe its growth rate?

I said I would call it linear, and my defense is this.

If you knew the actual growth rate of X , you could give an accurate estimate for your code; and while you might (one way or another) eventually get at the source, most functions do not come with performance statistics.

So, if you gain access to the code of X , you can factor it into your assessment, but where do you draw the line? X probably calls other functions, which in turn call still others. I feel that outside production scenarios where you are dealing with your code in complete isolation, if you cannot know the growth rates of the black-box functions involved, you have to settle for evaluating the code you can see.

+5
3 answers

Since you were talking about drawing a line, I would simply put it as:

Code complexity: O(N * f(N)), where f(N) is the (as yet unknown) complexity of X(N)

Once the complexity of X(N) can be judged, it can simply be substituted into the formula.

Until then, it remains a compact but useful notation that still captures the complexity of the loop itself.

+5

You simply can't say. There is not enough information. The claim that your function is linear is false unless X(N) is constant.

However, you can measure the time X(N) takes for different input sizes. Often this will give you a rough estimate of its asymptotic behavior.
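A minimal sketch of this empirical approach (my own illustration; the x below is a stand-in quadratic function, since the real X is opaque): time X(N) at doubling input sizes and look at the ratios. A ratio T(2N)/T(N) near 2 suggests linear behavior, near 4 suggests quadratic, and so on.

```java
// Probe an opaque function's growth rate by timing it at N, 2N, 4N, ...
public class GrowthProbe {
    static volatile long sink;           // prevents dead-code elimination

    static void x(int n) {               // opaque function under test
        long s = 0;
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j) s += i ^ j;
        sink = s;
    }

    static long[] measure(int start, int rounds) {
        long[] times = new long[rounds];
        for (int r = 0; r < rounds; ++r) {
            int n = start << r;          // N, 2N, 4N, ...
            long t0 = System.nanoTime();
            x(n);
            times[r] = System.nanoTime() - t0;
        }
        return times;
    }

    public static void main(String[] args) {
        long[] t = measure(500, 4);
        for (int r = 1; r < t.length; ++r)
            System.out.printf("T(%d)/T(%d) = %.1f%n",
                500 << r, 500 << (r - 1), (double) t[r] / t[r - 1]);
    }
}
```

Measurements like this are noisy (JIT warm-up, caches), so in practice you would repeat each size many times and take the best or median run.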

+3

First of all, this is a very good question. People usually forget an important step: identifying the basic operation .

Theoretical analysis of time efficiency

Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size.

The basic operation : the operation that contributes the most to the algorithm's running time.

We try to pick as the basic operation something that can be performed in O(1) time, or close to it; in practice we often pick sublinear operations. For instance:

  • Assigning a value to a variable
  • Looking up a specific element in an array
  • Comparing two values
  • Incrementing a value
  • Basic arithmetic operations such as addition and multiplication

Basic arithmetic operations, for example, are not strictly O(1). See this question on Computer Science Stack Exchange: https://cs.stackexchange.com/questions/1643/how-can-we-assume-that-basic-operations-on-numbers-take-constant-time

So we treat them as approximately O(1) when choosing the basic operation.

Let C(n) be the number of times the basic operation is executed, and let execTime be the execution time of the basic operation on a particular computer.

Then we can estimate the running time T(n) of the algorithm:

 T(n) ≈ execTime * C(n) 

As you can see, the efficiency class (e.g. Big O) depends only on C(n), since execTime is a constant (and constants are dropped in asymptotic analysis). So the challenge is to find C(n).
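As a quick numeric sketch of the estimate (the 10 ns cost per basic operation is an assumed figure of my own, not from the answer):

```java
// T(n) ≈ execTime * C(n), with an assumed basic-operation cost of 10 ns
// and C(n) = n for a linear algorithm.
public class RuntimeEstimate {
    static double estimateSeconds(double execTimeNs, long c) {
        return execTimeNs * c / 1e9;     // T(n) ≈ execTime * C(n)
    }

    public static void main(String[] args) {
        long n = 1_000_000L;                       // input size
        double t = estimateSeconds(10.0, n);       // C(n) = n
        System.out.println(t);                     // about 0.01 s (10 ms)
    }
}
```

Note how doubling execTime only scales the estimate by a constant factor, which is exactly why it drops out of the efficiency class.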

Analysis of your code

First, we identify the input size parameter, which is obviously N.

Then we choose the basic operation, and this is the part that makes the difference.

If we choose X(N); as the basic operation, then:

 T(N) ≈ execTime * C(N)
 C(N) = N
 O(execTime * N) = O(N)

If we knew what was going on inside X(N), we could choose a basic operation that is executed N times within each call to X(N), for example:

 T(N) ≈ execTime * C(N)
 C(N) = N * N
 O(execTime * N * N) = O(N * N) = O(N^2)

We could also select the whole loop as the basic operation:

 T(N) ≈ execTime * C(N)
 C(N) = 1
 O(execTime * 1) = O(1)
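The three choices can be checked mechanically with counters. This is my own sketch; the inner N-operation body of X is a hypothetical assumption, as above:

```java
// C(N) for the loop "for(int i = 0; i < N; ++i) X(N);" under three
// different choices of basic operation.
public class BasicOpChoices {
    static long callsToX(int n) {        // basic op = the call X(N)
        long c = 0;
        for (int i = 0; i < n; ++i) c++; // one count per call
        return c;                        // C(N) = N
    }

    static long opsInsideX(int n) {      // basic op = a hypothetical
        long c = 0;                      // operation done N times inside X
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j) c++;
        return c;                        // C(N) = N * N
    }

    static long wholeLoop(int n) {       // basic op = the entire loop
        return 1;                        // C(N) = 1
    }

    public static void main(String[] args) {
        System.out.println(callsToX(8));   // 8
        System.out.println(opsInsideX(8)); // 64
        System.out.println(wholeLoop(8));  // 1
    }
}
```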

So which one is correct? All of them. What is the point, then? It all depends on what kind of comparison you want to make.

For example, when comparing the performance of sorting algorithms, the basic operation is the key comparison: the most significant work in a sorting algorithm is comparing keys, and every sorting algorithm performs key comparisons, so we can compare them with each other.
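As an illustration of counting key comparisons (my own sketch, using a plain selection sort, which is not mentioned in the answer):

```java
// Selection sort instrumented to count its basic operation: the key
// comparison. For n elements it always performs n*(n-1)/2 comparisons.
public class ComparisonCount {
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int i = 0; i < a.length - 1; ++i) {
            int min = i;
            for (int j = i + 1; j < a.length; ++j) {
                comparisons++;                   // the basic operation
                if (a[j] < a[min]) min = j;
            }
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // 5 elements -> 4 + 3 + 2 + 1 = 10 comparisons
        System.out.println(sortAndCount(new int[]{5, 2, 9, 1, 7})); // 10
    }
}
```

Counting comparisons this way gives C(n) = n(n-1)/2, hence O(n^2), regardless of the input order.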

In conclusion, you can choose X(N) as the basic operation and say that the complexity is O(N). X(N) does not strictly have to be O(1); as explained above, we already neglect basic operations whose complexity exceeds O(1). What really matters here is what kind of comparison you want to make.

Suppose you want to compare these two algorithms:

 for(int i = 0; i < N; ++i) X(N); 

and

 for(int j = 0; j < N; j++) for(int i = 0; i < N; ++i) X(N); 

Here, if you select X(N) as the basic operation for both, you are comparing apples with apples: you rate O(N) against O(N^2), and that is perfectly fine.

If you think this is wrong because X (N) is not O (1), consider this:

 for(int i = 0; i < N; ++i) System.out.print(N); 

A simple way to print in Java. We pick it as the basic operation without a second thought, since it is just a print statement, and assume it is O(1). But it is not. Its complexity is sublinear, but definitely not constant: the time spent printing grows with the size of the printed value (N in this case).
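You can see the sublinear growth directly in the number of characters a print has to write, which grows roughly like log10(N) (a small sketch of my own):

```java
// The work done by printing N grows with its digit count, i.e. roughly
// log10(N): sublinear in N, but not constant.
public class PrintCost {
    static int digitsPrinted(long n) {
        return String.valueOf(n).length();   // characters written for n
    }

    public static void main(String[] args) {
        System.out.println(digitsPrinted(5));          // 1
        System.out.println(digitsPrinted(5_000));      // 4
        System.out.println(digitsPrinted(5_000_000));  // 7
    }
}
```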

So, X(N) or System.out.print(N): does it really matter?

P.S. I did not invent the approximation method above; I prepared this answer using this book, mainly its second chapter. I recommend taking a look.

0
