What do I need to do to get a function called in compiler-optimized code?

I am working on an iOS project with Apple LLVM 4.0 and optimizations enabled. I implemented two different versions of a function: one in C and one in NEON. I wanted to test their performance against each other. My idea was to call each of them the same number of times, and then look them up in Time Profiler to see the relative time spent in each. My code initially looked like

    used_value = score_squareNEON(patch, image, current_pos);
    used_value = score_squareC(patch, image, current_pos);

When I profiled this, the NEON code did not show up at all. Then I tried

    for (int i = 0; i < successively_bigger_numbers; i++) {
        used_value = score_squareNEON(patch, image, current_pos);
    }
    used_value = score_squareC(patch, image, current_pos);

There is still no contribution from the NEON code. Next was

    used_value = score_squareNEON(patch, image, current_pos);
    test = score_squareC(patch, image, current_pos);

where test was never read. Nothing. Then

    test = score_squareNEON(patch, image, current_pos);
    test = 0;
    other_used_variable += test;
    used_value = score_squareC(patch, image, current_pos);

Still nothing. What finally made it execute both functions was:

    value = score_squareNEON(patch, image, current_pos);
    test = score_squareC(patch, image, current_pos);
    ...
    min = (value + test) / 2; // before, it was min = value;

Also very important: these functions were defined in the same file in which I called them. When I tried moving the function declarations to another file, both of them were called in every example.

First off, I really respect compilers. Secondly, what exactly do I need to do to make sure that a function is called? This made me start to doubt everything I had timed before. What if, in the usual pattern

    timerStart();
    functionCall();
    timerEnd();

the function in the middle is optimized away entirely? Do I need to start checking for this every time, or is there a trick I can use? What are the rules that determine when a compiler can optimize away an entire function call?

+4
2 answers

Also very important: these functions were defined in the same file in which I called them. When I tried moving the function declarations to another file, both of them were called in every example.

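When the compiler can prove that a function call has no side effects and that its result is not used, it can remove the call. If it cannot prove that, the call cannot be removed, because, as far as the compiler can tell, the function may have side effects, and those must not be eliminated. That is why moving the functions to another file kept both calls: without link-time optimization, the compiler cannot see the body of a function defined in another translation unit.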

Assigning the result of the function call to a volatile variable¹ forces the compiler to leave the function call in the program (6.7.3, clause 7 in N1570):

An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously. What constitutes an access to an object that has volatile-qualified type is implementation-defined.

For C++, the guarantees are a little less explicit, as far as I can tell, but I think 1.9 should apply:

Program execution, 1.9 (6) and (7):

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.

Accessing an object designated by a volatile lvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression might produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

And in 7.1.5.1:

[Note: volatile is a hint to the implementation to avoid aggressive optimization involving the object because the value of the object might be changed by means undetectable by an implementation. See 1.9 for detailed semantics. In general, the semantics of volatile are intended to be the same in C++ as they are in C.]

¹ This does not work with void fun(), of course.

+5

The compiler can do whatever it wants with your code, as long as the "observable" results are indistinguishable from running the code exactly as you wrote it on the idealized "virtual machine" defined by the language.

"Observable" does not include such things as run time, profiling results, variables inspected through a debugger, etc. Observable behavior is considered to be accesses to volatile objects, data written to files, and interaction with input and output devices.

So, to make sure your code is actually run, you need to make sure that it has to run in order to produce the correct observable behavior. Typically, you can just save the output and then print it or write it to a file (outside the code that you are timing). Another option is to write the output to a volatile variable.

Another thing that can make a difference is that the compiler may evaluate your code statically, so even if you print the output, the function call can be reduced to simply loading the result that the compiler computed at compile time. To avoid this, you may have to give the function input that cannot be known statically, for example data read from input or a file, or from a volatile variable.


Of course, using input and output in ways that a real program would not can affect the timing. Thus, the most reliable way to measure performance is to do it in a real program with exactly the configuration you want to test. Write your program so that you can easily switch between configurations, and then test both.

+4