Which loop configuration will take longer?

Code I:

for(i=0; i<100; i++){ for(j=0; j<1000; j++){ x = y; } } 

Code II:

 for(i=0; i<1000; i++){ for(j=0; j<100; j++){ x = y; } } 

Can you explain why one of these loop configurations will take longer to start than the other?

+6
c
source share
5 answers

It really depends on many factors from your direct control.

As user David V says in the comments, both will simply be fixed by a good compiler. Then, if they don’t, they will switch to some machine code with branch instructions. When the processor runs the code with branching, it uses the so-called prediction of the speculative branch, which behaves differently depending on what exact instructions the code converts. Other factors may be helpful, such as skipping the code cache. You cannot say until you carefully measure and carefully analyze the results.

+3
source share

I can point out that any good compiler, but not as good as David mentioned above, will compile this with various CPU instructions and will not need branching or any of this branch prediction logic to help avoid the kiosk pipeline.

In fact, there is a trivial CPU level design (loop instruction) that will do this using minimal software emulation. Thus, multiplication is commutative, therefore 100x1000 or 1000x100 are the same.

+1
source share

Although all the answers are generally correct, in my opinion. Namely, it would be optimized, and it would depend on machine code, etc. I think that in the simplest case, without assuming any optimization and non-speculative branching (which may be unrealistic), Code 1 will be faster, because this is a certain amount of overhead when setting up loops. Namely, you must declare the variables I and J. Since the overhead of the outer loop always happens only once, the inner loop is a real factor here. Since in code 1 the inner loop is configured only 100 times, and in code 2 the inner loop is set 1000 times, code 1 should be faster. Again, this is in the simplest case, which is probably due to an interview question or a quiz question.

+1
source share

A good answer, probably: both of them are inefficient ways of finding something in a two-dimensional array, and you should try some sort of indexing to remove it.

That was an interview question, right? Well, expect an answer to the interview :)

0
source share

In general, the inner loop has a good chance of completely entering the cache, so 100 out-1000 in should be faster. But compilers are so smart ...

0
source share

All Articles