Stackoverflow: too many recursive calls? in C

I am trying to traverse a huge graph (about 875,000 nodes and 5,200,000 edges), but I get a stack overflow. I use a recursive function to walk it. It only visits nodes that have not been explored yet, so it cannot recurse infinitely (or at least I think so). The recursive function works for small inputs (5,000 nodes).

What should I do? Is there a maximum number of nested recursive calls?

I really do not know.

EDIT: I also posted the iterative equivalent at the end.

Here is the recursion code:

    int main()
    {
        int *sizeGraph,i,**reverseGraph;
        // some code to initialize the arrays
        getGgraph(1,reverseGraph,sizeGraph); // populate the arrays with the input from a file
        getMagicalPath(magicalPath,reverseGraph,sizeGraph);
        return 0;
    }

    void getMagicalPath(int *magicalPath,int **graph,int *sizeGraph)
    {
        int i;
        int *exploredNode;
        /* ------------- creation of the list of the explored nodes ------------------ */
        if ((exploredNode =(int*) malloc((ARRAY_SIZE + 1) * sizeof(exploredNode[0]))) == NULL) {
            printf("malloc of exploredNode error\n");
            return;
        }
        memset(exploredNode, 0, (ARRAY_SIZE + 1) * sizeof(exploredNode[0]));

        // start by the "last" node
        for (i = ARRAY_SIZE; i > 0; i--) {
            if (exploredNode[i] == 0)
                runThroughGraph1stLoop(i,graph,exploredNode,magicalPath,sizeGraph);
        }
        free(exploredNode);
    }

    /*
     * run through from the node to each adjacent node which will run to each adjacent node etc...
     */
    void runThroughGraph1stLoop(int node,int **graph,int *exploredNode,int *magicalPath,int *sizeGraph)
    {
        //printf("node = %d\n",node);
        int i = 0;
        exploredNode[node] = 1;
        for (i = 0; i < sizeGraph[node]; i++) {
            if (exploredNode[graph[node][i]] == 0) {
                runThroughGraph1stLoop(graph[node][i],graph,exploredNode,magicalPath,sizeGraph);
            }
        }
        magicalPath[0]++; // as index 0 is not used, we use it to remember the size of the array; quite dirty i know
        magicalPath[magicalPath[0]] = node;
    }

Iterative equivalent of the above:

    struct stack_t {
        int node;
        int curChildIndex;
    };

    void getMagicalPathIterative(int *magicalPath,int **graph,int *sizeGraph)
    {
        int i,k,m,child,unexploredNodeChild,curStackPos = 0,*exploredNode;
        bool foundNode;
        stack_t* myStack;

        if ((myStack = (stack_t*) malloc((ARRAY_SIZE + 1) * sizeof(myStack[0]))) == NULL) {
            printf("malloc of myStack error\n");
            return;
        }
        if ((exploredNode =(int*) malloc((ARRAY_SIZE + 1) * sizeof(exploredNode[0]))) == NULL) {
            printf("malloc of exploredNode error\n");
            return;
        }
        memset(exploredNode, 0, (ARRAY_SIZE + 1) * sizeof(exploredNode[0]));

        for (i = ARRAY_SIZE; i > 0; i--) {
            if (exploredNode[i] == 0) {
                curStackPos = 0;
                myStack[curStackPos].node = i;
                myStack[curStackPos].curChildIndex = (sizeGraph[myStack[curStackPos].node] > 0) ? 0 : -1;

                while(curStackPos > -1 && myStack[curStackPos].node > 0) {
                    exploredNode[myStack[curStackPos].node] = 1;
                    if (myStack[curStackPos].curChildIndex == -1) {
                        magicalPath[0]++;
                        magicalPath[magicalPath[0]] = myStack[curStackPos].node; // as index 0 is not used, we use it to remember the size of the array
                        myStack[curStackPos].node = 0;
                        myStack[curStackPos].curChildIndex = 0;
                        curStackPos--;
                    } else {
                        foundNode = false;
                        for(k = 0;k < sizeGraph[myStack[curStackPos].node] && !foundNode;k++) {
                            if (exploredNode[graph[myStack[curStackPos].node][k]] == 0) {
                                myStack[curStackPos].curChildIndex = k;
                                foundNode = true;
                            }
                        }
                        if (!foundNode)
                            myStack[curStackPos].curChildIndex = -1;

                        if (myStack[curStackPos].curChildIndex > -1) {
                            foundNode = false;
                            child = graph[myStack[curStackPos].node][myStack[curStackPos].curChildIndex];
                            unexploredNodeChild = -1;
                            if (sizeGraph[child] > 0) {
                                // get number of adjacent nodes of the current child
                                for(k = 0;k < sizeGraph[child] && !foundNode;k++) {
                                    if (exploredNode[graph[child][k]] == 0) {
                                        unexploredNodeChild = k;
                                        foundNode = true;
                                    }
                                }
                            }
                            // push into the stack the child if not explored
                            myStack[curStackPos + 1].node = graph[myStack[curStackPos].node][myStack[curStackPos].curChildIndex];
                            myStack[curStackPos + 1].curChildIndex = unexploredNodeChild;
                            curStackPos++;
                        }
                    }
                }
            }
        }
    }
+4
4 answers

As a rule, you should not rely on very deep recursion. Platforms handle it differently, but roughly the following holds:

    maximum recursion depth = stack memory / size of one function call's state

The amount of stack memory varies widely from system to system. Some operating systems reserve a fixed amount of main memory for the stack, others grow the stack on demand, and some can back it with page files and swap so that there is effectively no fixed limit. As a C programmer working against the abstract C standard, you cannot rely on any of this.
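To make the rule of thumb above concrete, here is a minimal sketch that divides a stack size by a per-call frame size. It assumes a POSIX system, where `getrlimit` with `RLIMIT_STACK` reports the soft stack limit; the helper names `estimate_max_depth` and `current_stack_limit` are hypothetical, and the frame size you pass in is only a guess, since real frame sizes depend on the compiler, the ABI and the optimization level.

```c
#include <stddef.h>
#include <sys/resource.h>   /* POSIX only: getrlimit, RLIMIT_STACK */

/* Rough upper bound on recursion depth: stack bytes / bytes per frame. */
size_t estimate_max_depth(size_t stack_bytes, size_t frame_bytes)
{
    return frame_bytes == 0 ? 0 : stack_bytes / frame_bytes;
}

/* Soft stack limit of the current process in bytes.
 * Returns 0 if the limit is unknown or reported as unlimited. */
size_t current_stack_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (size_t)rl.rlim_cur;
}
```

For instance, `estimate_max_depth(current_stack_limit(), 64)` gives a ballpark figure for a function whose frame is assumed to take 64 bytes. Treat the result as an order of magnitude, not as a guarantee.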

So you can first shrink the function's state (drop unneeded variables, use smaller integer types, etc.). But this is not a real solution.

  • Some compilers recognize tail recursion and convert the recursion to iteration. But then again, you cannot rely on it (the C standard does not guarantee it; a language whose standard does require it is Scheme). See also "Does C++ limit recursion depth?" as a related question.

  • Compilers may offer options for setting the recursion limit (for example, the stack size). But once again, you cannot rely on this if your depth is not bounded by design.
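As an illustration of the tail-recursion point in the list above, here is a small sketch of what a tail-recursive function looks like and the loop an optimizer may (but need not) turn it into. The names `fac_tail` and `fac_loop` are hypothetical. Note that the recursive function in the question is not of this form, because it still writes to `magicalPath` after its recursive calls return, so this optimization would not apply to it as written.

```c
/* Tail-recursive factorial: the recursive call is the very last action,
 * so nothing from the current frame is needed once it has been made. */
unsigned long fac_tail(unsigned int x, unsigned long acc)
{
    if (x <= 1)
        return acc;
    return fac_tail(x - 1, acc * x);
}

/* The equivalent loop: the parameters simply become loop variables. */
unsigned long fac_loop(unsigned int x)
{
    unsigned long acc = 1;

    while (x > 1) {
        acc *= x;
        x--;
    }
    return acc;
}
```

A call such as `fac_tail(5, 1)` starts the accumulator at 1. Whether the compiler actually reuses the frame depends on the compiler and its optimization settings.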

But the real solution is to convert the recursion to iteration by hand. The easiest way is to keep all of the function's internal data on an explicit stack and emulate the recursion manually. For example, this function:

    int fac(int x)
    {
        if (x<=1) return 1;
        return x*fac(x-1);
    }

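becomes, with an explicit heap-allocated stack standing in for the call stack:

    #include <stdlib.h>

    int fac(int x_)
    {
        /* all parameters and local variables of one "call" go here */
        struct state_t { int x; int ret; };
        /* explicit stack: at most x_ states are ever pushed */
        struct state_t *stack = malloc((x_ > 1 ? x_ : 1) * sizeof *stack);
        int top = 0, result;

        if (stack == NULL)
            return -1;          /* allocation failed */
        stack[0].x = x_;
        stack[0].ret = 1;
        while (stack[top].x > 1) {
            /* emulate the recursive call by pushing the next state */
            stack[top + 1].x = stack[top].x - 1;
            stack[top + 1].ret = stack[top].x * stack[top].ret;
            top++;
        }
        result = stack[top].ret;
        free(stack);
        return result;
    }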

Although this usually works better than recursion, it may not be the smartest solution, and you should next work out which state really needs to be saved.

In our example, we only ever need the top of the stack, so we can drop the stack entirely:

    int fac(int x)
    {
        int ret = 1;
        while (1) {
            if (x<=1) return ret;
            ret = x * ret;
            x = x-1;
        }
    }

And, tidied up further:

    int fac(int x)
    {
        int ret = 1;
        while (x>1)
            ret *= x--;
        return ret;
    }

This is one of the classic non-recursive factorial implementations.

So, the general recipe: start by pushing your function's state onto an explicit stack, and then keep refactoring.

+7

If the function is called once per node, you will need 875,000 stack frames of at least 7*sizeof(int*) bytes each. On a 32-bit system that requires a stack of roughly 23 MB, which is not much, but may well exceed the configured limit.
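The estimate above is plain multiplication, spelled out in the sketch below. The function name `min_stack_bytes` is hypothetical, and the figure of 7 words per frame is the lower bound used in this answer; a real compiler may use more.

```c
#include <stddef.h>

/* Lower bound on call-stack bytes: frames * words per frame * word size. */
size_t min_stack_bytes(size_t frames, size_t words_per_frame, size_t word_size)
{
    return frames * words_per_frame * word_size;
}
```

With 4-byte pointers, `min_stack_bytes(875000, 7, 4)` is 24,500,000 bytes, about 23.4 MiB. With 8-byte pointers the same call with a word size of 8 gives 49,000,000 bytes, about 46.7 MiB.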

You will need an iterative approach to traverse your graph. Basically, you allocate a large array (size == number of nodes) of structures, where each structure holds one "stack frame". In your case the frame consists of node and i, because everything else is only passed along and never changed.

Whenever you would recurse, you store the current values of node and i in a new structure and append it to the array. When the recursion would return, you restore the values.
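The approach described above could be sketched roughly as follows. This is not the asker's code: `struct frame`, `dfs_iterative` and the `nodeCount` parameter are hypothetical names, while the layout of `graph`, `sizeGraph` and the output array (element 0 holds the count, as in the question's `magicalPath`) is assumed to match the question.

```c
#include <stdlib.h>

/* One saved "call": the node being visited and the next neighbour index. */
struct frame {
    int node;
    int i;
};

/*
 * Iterative post-order depth-first search from `start`.
 * graph[n] lists the neighbours of n, sizeGraph[n] is their count,
 * explored[] must be 0 for unvisited nodes, and order[0] holds the
 * number of entries written so far.
 * Returns 0 on success, -1 if the explicit stack cannot be allocated.
 */
int dfs_iterative(int start, int **graph, const int *sizeGraph,
                  int *explored, int *order, int nodeCount)
{
    /* Every node is pushed at most once, so nodeCount frames suffice. */
    struct frame *stack = malloc((size_t)nodeCount * sizeof *stack);
    int top = 0;

    if (stack == NULL)
        return -1;

    explored[start] = 1;
    stack[0].node = start;
    stack[0].i = 0;

    while (top >= 0) {
        struct frame *f = &stack[top];

        if (f->i < sizeGraph[f->node]) {
            int next = graph[f->node][f->i++];
            if (!explored[next]) {
                /* "recursive call": push a frame for the neighbour */
                explored[next] = 1;
                top++;
                stack[top].node = next;
                stack[top].i = 0;
            }
        } else {
            /* "return": all neighbours done, record the node and pop */
            order[++order[0]] = f->node;
            top--;
        }
    }
    free(stack);
    return 0;
}
```

The outer loop from the question would stay as it is: for each `i` from `ARRAY_SIZE` down to 1 with `explored[i] == 0`, call `dfs_iterative(i, ...)` with `nodeCount` set to `ARRAY_SIZE + 1`. Allocating the stack once outside that loop would avoid repeated `malloc` calls; it is kept inside here only to keep the sketch short.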

+5

You can convert your recursive code to use a stack data structure that you allocate on the heap yourself. This is more complex and less clean than direct recursion, but it is more reliable, since it is not limited by the size of the call stack.

+1

There is a maximum number of recursive calls, because your computer has only a finite amount of memory (whether RAM or disk). Each call needs some memory to store the function's data, so only a certain number of calls fit in memory. In addition, operating systems and programming tools limit the size of the stack. You may be able to raise that limit to use more of the available memory, but it remains finite.

As a rule, graph algorithms for large graphs are implemented by augmenting the graph with additional information. For example, you can add a field to each node or edge that records whether it has been visited in the current traversal, or a field that records the length of the shortest path found so far to that node. You then rewrite the algorithm as an iterative one that uses this additional data, instead of a recursive one that uses the stack. If you cannot augment the original graph directly, you can build a separate graph that holds the additional data.
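As one possible illustration of the paragraph above, the sketch below keeps a per-node distance field in a separate array and walks the graph iteratively with a queue. It is a breadth-first search, so it visits nodes in a different order than the question's depth-first code and is not a drop-in replacement for it. The names `bfs_distances` and `nodeCount` are hypothetical, and the adjacency-list layout is assumed to match the question's `graph` and `sizeGraph`.

```c
#include <stdlib.h>

/*
 * Breadth-first search from `start` over an adjacency-list graph.
 * dist[] is the per-node "extra field": -1 means not visited yet,
 * otherwise the number of edges on a shortest path from `start`.
 * Returns 0 on success, -1 if the queue cannot be allocated.
 */
int bfs_distances(int start, int **graph, const int *sizeGraph,
                  int *dist, int nodeCount)
{
    /* Every node is enqueued at most once, so nodeCount slots suffice. */
    int *queue = malloc((size_t)nodeCount * sizeof *queue);
    int head = 0, tail = 0, n, k;

    if (queue == NULL)
        return -1;

    for (n = 0; n < nodeCount; n++)
        dist[n] = -1;

    dist[start] = 0;
    queue[tail++] = start;

    while (head < tail) {
        int node = queue[head++];

        for (k = 0; k < sizeGraph[node]; k++) {
            int next = graph[node][k];
            if (dist[next] == -1) {
                dist[next] = dist[node] + 1;
                queue[tail++] = next;
            }
        }
    }
    free(queue);
    return 0;
}
```

The extra memory is one `int` per node for `dist` plus one per node for the queue, which is the "space proportional to the size of your graph" trade-off, paid on the heap instead of the call stack.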

Of course, the additional data usually takes space proportional to the size of your graph. If your algorithm recurses less deeply than that, for example only to a depth of log(graph size), you may want a solution that uses less space. As others have said, one way to do this is to allocate memory yourself and implement your own counterpart of recursion, storing data in that memory as your algorithm moves through the different traversal depths. This is theoretically equivalent to recursion, but in practice it has two advantages: allocating memory can be simpler than raising the allowed stack size, and it gives you direct control over how much data is stored per level, whereas with ordinary recursion the compiler decides what is stored (and it may inefficiently keep various temporary values from your function's state that do not really need to be saved).

+1
