Will many parameters in a recursive function affect performance?

My code will pass in a binary tree in a recursive manner. I have some options that I need to control. So my function looks like this:

FindPoints(int leftchild, int rightchild, int ly_index, int uy_index, int bit, int nodepos, int amount, int level); 

This is called many times. Will the performance of my program depend on the number of parameters?

+5
source share
4 answers

Process during recursion:

  • Allocate space for parameters on the stack. Usually adding a value to the stack pointer register.
  • Copy the values โ€‹โ€‹of the variables onto the stack. Depends on objects or value.
  • Call function. This may cause the processor to crash the instruction cache.
  • At the end of the function, the stack pointer is restored by subtracting the value.
  • Return from function call; may crash the command cache.

The common problem is not performance, but recursion depth and stack size. A recursion that goes beyond the limitations of the stack is called a stack overflow defect.

An iterative solution can be faster because the compiler can optimize the loop. Optimizing recursive calls is more difficult for the compiler optimizer.

By the way, on modern processors, the worst recursive call time is less than 1 millisecond, usually around nanosecond resolution. Thus, you are trying to squeeze nanoseconds from the program. Not very good return on investment (ROI).

+7
source

Performance depends on many factors; ideally, you would try one way, try another and compare. However, here are some general considerations that may help you understand what is going on:

  • If your function works a lot, the time spent on function calls will not be significant.
  • If your function basically passes parameters unchanged, you should definitely consider refactoring your code. If it calls itself with all (or almost all) parameters having different values, you cannot improve your code - it is too complicated.
  • The performance of a function call depends on the calling convention. Compilers usually pass the first few parameters to registers (very fast), and the rest on the stack (slower). You might want to make the number of parameters small (2 for fastcall ; 4 for ARM - only two examples that I know) so that they all fit into registers.

To expand the second point - if your function does not change most of its parameters, each call will copy these parameters around the stack - this is absolutely useless work for the computer. In addition to wasting time, it also spends space in the data cache (which leads to a slower slowdown, which is especially unpleasant because it cannot even be attributed to any specific code) and can lead to a stack overflow (or maybe not , depending on your OS).

One way to improve your code in this situation is to use a struct that contains all the immutable parameters and pass a pointer / link to it:

 struct DataForFindPoints { int ly_index; int uy_index; int bit; int nodepos; int amount; int level; }; FindPoints(int leftchild, int rightchild, const DataForFindPoints& data); 

Or (object-oriented path): create a class that has FindPoints as a member function, and all immutable parameters as fields.

+5
source

Short answer

On Windows, compiling with Visual Studio 2010 in release mode, oriented to the x64 platform, passing expanded arguments is much slower than passing a single structure by reference or even by value.

Results:

 Multi result = 0; multi iterations = 10000 Ref result = 0; ref iterations = 10000 Value result = 0; value iterations = 10000 --------------------------------------------------- Timer "multi args": Total time = 0.387886 ------------------------------------------ Timer "struct by reference": Total time = 0.0679177 ------------------------------------------ Timer "struct by value": Total time = 0.143382 

Observation

The more your function performs the calculations in its body, the less the overhead of copying harms the work. In fact, I compared a function that only performs a few add-ons and one unit.

Some details now

I defined a structure containing all your parameters

 struct Args{ int leftchild; int rightchild; int ly_index; int uy_index; int bit; int nodepos; int amount; int level; Args(int l, int r, int ly, int uy, int b, int n, int a, int lev) : leftchild(l) , rightchild(r) , ly_index(ly) , uy_index(uy) , bit(b) , nodepos(n) , amount(a) , level(lev) {} }; 

and 3 functions.

 static size_t counter1 = 0; static size_t counter2 = 0; static size_t counter3 = 0; int FindPoints(int leftchild, int rightchild, int ly_index, int uy_index, int bit, int nodepos, int amount, int level) { ++counter1; leftchild = leftchild + (rightchild + ly_index + uy_index + bit + nodepos + amount + level) / 100 - 1; return leftchild ? FindPoints( leftchild, rightchild, ly_index, uy_index, bit, nodepos, amount, level) : 0; } int FindPointsRef( Args& a ) { ++counter2; a.leftchild = a.leftchild + (a.rightchild + a.ly_index + a.uy_index + a.bit + a.nodepos + a.amount + a.level) / 100 - 1; return a.leftchild ? FindPointsRef( a ) : 0; } int FindPointsValue( Args a ) { ++counter3; a.leftchild = a.leftchild + (a.rightchild + a.ly_index + a.uy_index + a.bit + a.nodepos + a.amount + a.level) / 100 - 1; return a.leftchild ? FindPointsValue( a ) : 0; } 

They all perform the same task, but first take the arguments , as in your question , the second takes the structure of the arguments by reference , and the third takes a struct by value .

I built the program using Visual Studio 2010, released the x64 configuration, and I measured it using the home class, which simply terminates the Windows QueryPerformanceCounter function and provides a convenient output statement.

The main function looks like this:

 int main() { // define my timers PersistentTimer timer_multi("multi args"); PersistentTimer timer_ref("struct by reference"); PersistentTimer timer_value("struct by value"); int leftchild = 10000; // number of iterations; 10000 to prevent stack overflow int rightchild = 1; // sum of other values is < 100 (look to FindPoints* implementations) int ly_index = 2; int uy_index = 3; int bit = 4; int nodepos = 5; int amount = 6; int level = 7; // define structs of arguments for second and third function Args args_ref( leftchild, rightchild, ly_index, uy_index, bit, nodepos, amount, level ); Args args_copy( leftchild, rightchild, ly_index, uy_index, bit, nodepos, amount, level ); // return values initialized to a non zero value just to be sure that functions have done thir job int a1 = 5; timer_multi.measure([&]{ a1 = FindPoints( leftchild, rightchild, ly_index, uy_index, bit, nodepos, amount, level ); }); std::cout << "Multi result = " << a1 << "; multi iterations = " << counter1 << '\n'; int a2 = 5; timer_ref.measure([&]{ a2 = FindPointsRef( args_ref ); }); std::cout << "Ref result = " << a2 << "; ref iterations = " << counter2 << '\n'; int a3 = 5; timer_value.measure([&]{ a3 = FindPointsValue( args_copy ); }); std::cout << "Value result = " << a3 << "; value iterations = " << counter3 << '\n'; // print timer results std::cout << timer_multi << timer_ref << timer_value; getchar(); } 
+4
source

It does not significantly affect performance. It does not matter. If you need maximum performance, you should do iteratively.

But this is dirty code. You should try to encapsulate arguments in structs or class. It is safer and easier to maintain.

+3
source

Source: https://habr.com/ru/post/1214034/


All Articles