What happens under the hood when one method calls another?

Question


This is similar to the question of what happens when a program starts, but it is not a duplicate.

Let's say I have a simple console program with two methods A and B.

    public static void RunSnippet()
    {
        TestClass t = new TestClass();
        t.A(1, 2);
        t.B(3, 4);
    }

    public class TestClass
    {
        public void A(int param1, int param2)
        {
            // do something
            C();
        }

        private void C()
        {
            // do
        }

        public bool B(int param1, int param2)
        {
            // do something
            bool result = true;
            return result;
        }
    }

Can someone explain in detail (but please keep it in plain English) what really happens when RunSnippet calls method A and method B (and they internally call some other methods)? I want to understand what is really happening under the hood: how the parameters are passed, where they are stored, what happens with local variables, how the return values are passed, what happens if another thread starts working when A has called C, and what happens if an exception is thrown.

+4

language-agnostic computer-science

Sandbox Aug 16 '09 at 20:12


7 answers

A function call is essentially a goto statement, except that at the end it must return to where it was called.

There is a function call stack that essentially contains information about where to “return”.

To call a function, you need:

Save (push) the address of the current instruction onto the stack; the called function uses it to return when it is done.
Push all parameters onto the stack.
Jump to the first instruction of the called function.

When the called function needs to read the parameters, it reads them from the stack.

When the called function finishes or hits a "return" statement, it finds the address it needs to return to and jumps to it.
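The steps above can be played out on a toy machine. This is only a rough sketch in Python: the instruction names (PUSH, CALL, ADD, POP, RET, HALT) and the program layout are invented for illustration, and it pushes the parameters before the return address, which is the order the x86 answers in this thread use.

```python
# Toy machine: a function call is a jump that remembers where to come back to.
# The instruction set here is hypothetical and exists only for this sketch.

def run(program):
    stack = []      # holds parameters and return addresses
    pc = 0          # index of the instruction being executed
    result = None   # stands in for a return-value register
    while True:
        op, *args = program[pc]
        if op == "PUSH":        # put a parameter on the stack
            stack.append(args[0])
            pc += 1
        elif op == "CALL":      # save where to return to, then jump
            stack.append(pc + 1)
            pc = args[0]
        elif op == "ADD":       # callee reads its two parameters from the stack
            # stack layout here: [..., param2, param1, return_address]
            result = stack[-2] + stack[-3]
            pc += 1
        elif op == "RET":       # pop the return address and jump back to it
            pc = stack.pop()
        elif op == "POP":       # caller discards a parameter it pushed
            stack.pop()
            pc += 1
        elif op == "HALT":
            return result, stack

program = [
    ("PUSH", 4),    # 0: caller pushes the parameters
    ("PUSH", 3),    # 1
    ("CALL", 6),    # 2: jump to the function that starts at index 6
    ("POP",),       # 3: back in the caller, clean up the parameters
    ("POP",),       # 4
    ("HALT",),      # 5
    ("ADD",),       # 6: body of the called function
    ("RET",),       # 7
]

result, stack = run(program)
print(result, stack)
```

After the run the stack is empty again: everything pushed for the call has been popped, which is what lets the caller carry on as if nothing had happened.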

+4

hasen Aug 16 '09 at 20:35


(Assuming x86.) First you need to understand the stack. Functions use a region of memory called the "stack". You can think of it as a stack of plates, where each plate holds a DWORD (32 bits) of data. The CPU has a register that keeps track of the location on the stack that we are currently dealing with (it is just a virtual memory address). It is called the stack pointer, and on x86 it is the esp register.

When functions interact with the stack, they usually do one of two things: push or pop. A "push" puts something on top of the stack: the stack pointer moves to the next position up the pile, and then the value is copied to that new location (the new top). A push "grows the stack" because more data is now stored there (more plates).

A "pop" removes the top item from the stack: whatever is currently on top (pointed to by the esp register) is copied into a processor register (for example eax), and then the stack pointer moves one position back down the pile.
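A minimal sketch of push and pop in Python, assuming memory is modelled as a dictionary from address to value. The Stack class and the starting address 0x1000 are made up for this example. It uses the real x86 direction, where the "top plate" sits at the lowest address, so a push decreases esp.

```python
# Toy model of the x86 stack: memory maps an address to a 32-bit value,
# and esp holds the address of the current top of the stack.
# The class name and the starting address are assumptions for this sketch.

class Stack:
    def __init__(self, top=0x1000):
        self.memory = {}
        self.esp = top

    def push(self, value):
        self.esp -= 4                   # the stack grows toward lower addresses
        self.memory[self.esp] = value   # copy the value to the new top

    def pop(self):
        value = self.memory[self.esp]   # read the current top
        self.esp += 4                   # the stack shrinks back
        return value

s = Stack()
s.push(4)
s.push(3)
esp_after_pushes = s.esp
first = s.pop()
second = s.pop()
print(hex(esp_after_pushes), first, second, hex(s.esp))
```

The last value pushed is the first one popped, and once both values are popped esp is back where it started.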

So now we can talk about setting up a function call.

Code:

 t.B(3, 4);

Assembly:

 // Here is the push we described above. The function we are currently in is
 // pushing the value 4 onto the stack. This is one of the arguments to the
 // B function we are calling. Note that we push the last argument first.
 push 4
 // Here is another push. This time we are pushing the next argument to the
 // B function.
 push 3
 call B // this call sets up the context for the next function to run

When a call occurs, we transfer the context from the current function to the called function. The call instruction also pushes the return address, so that ret can later get back to the caller. The extra information the called function needs in order to run is the arguments that we pushed onto the stack.

Now the new function works with the stack to make room for its local variables, and it saves the caller's stack context in a register so that it can be restored when the function returns. If this did not happen, the calling function would be completely disoriented when it regained control, with no idea how to reach the things it previously put on the stack, such as its own local variables or the stack context of the function that called it.

Here is how this looks in assembly (stolen from Havenard).

 // Here is the B function making sure that the calling function can get back to
 // its stack context when B returns.
 push ebp
 mov ebp, esp
 // Remember when I said that a push grows the stack? Well, you can also grow
 // it just by moving the stack pointer higher, as if there were already more plates there.
 // You may wonder why we are subtracting (sub) from the stack pointer (esp) to grow it.
 // The reason is that the stack "grows down" in memory. In other words, as the stack grows,
 // the memory addresses of the stack get smaller.
 // We subtract 4 because we only need to grow the stack by one plate
 // so that we can store the local variable 'result' there. If we had 2 local variables
 // we would have subtracted 8.
 sub esp, 4
 // The instruction below simply moves the constant value 1 into the local variable
 // 'result'. Local variables are always referenced relative to the bottom of the stack
 // context for the current function. This value is stored in the ebp register, which we
 // saw earlier in the function setup above.
 // So now we think of the location where the 'result' variable is stored as "ebp-4".
 // We know that because we put it there.
 mov dword ptr [ebp-4], 1 // result = 1 (true)
 // eax is a special register that contains the return value of the function. That is why
 // you see the value of 'result' (which we know as [ebp-4]) moved into the eax register.
 mov eax, dword ptr [ebp-4]
 // We adjust the stack pointer back to its previous location,
 // before we subtracted to make room for our local variable.
 add esp, 4
 // Our work is done now. Time to clean stuff up for our calling function and
 // leave things as we found them. Our trusty ebp register stores the old stack pointer
 // that our calling function needs to resume its stack context.
 mov esp, ebp
 pop ebp
 ret
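To check that the caller really does get its stack back, the same instruction sequence can be replayed in Python. This is a toy replay, not real machine code: the starting values of esp and ebp and the return-address label are made up, and two things are assumptions not stated in this answer, namely that B would find its arguments at [ebp+8] and [ebp+12] and that the caller removes the arguments with add esp, 8 after the call.

```python
# Toy replay of the frame setup and teardown for B(3, 4).
# Register names follow the answer; the addresses are arbitrary.

memory = {}
reg = {"esp": 0x1000, "ebp": 0x2000, "eax": 0}

def push(value):
    reg["esp"] -= 4
    memory[reg["esp"]] = value

def pop():
    value = memory[reg["esp"]]
    reg["esp"] += 4
    return value

caller_esp, caller_ebp = reg["esp"], reg["ebp"]

# Caller: push arguments right to left, then "call" pushes the return address.
push(4)
push(3)
push("return address in RunSnippet")    # hypothetical label, not a real address

# B prologue: save the caller's ebp and start a new frame.
push(reg["ebp"])                        # push ebp
reg["ebp"] = reg["esp"]                 # mov ebp, esp
reg["esp"] -= 4                         # sub esp, 4 (room for 'result')

# B body.
memory[reg["ebp"] - 4] = 1              # mov dword ptr [ebp-4], 1
reg["eax"] = memory[reg["ebp"] - 4]     # mov eax, dword ptr [ebp-4]
# Assumed layout: saved ebp at [ebp], return address at [ebp+4], then arguments.
args_seen_by_b = (memory[reg["ebp"] + 8], memory[reg["ebp"] + 12])

# B epilogue: drop the locals, restore the caller's ebp, return.
reg["esp"] = reg["ebp"]                 # mov esp, ebp
reg["ebp"] = pop()                      # pop ebp
return_address = pop()                  # ret

# Caller removes the two arguments it pushed (assumed caller cleanup).
reg["esp"] += 8                         # add esp, 8

print(reg, args_seen_by_b)
```

At the end esp and ebp hold the values they had before the call, and eax carries the return value 1 (true).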

I am sure there are some details that I forgot, especially when returning from function B, but this is a pretty good overview, I think.

+2

Christopher scott Aug 16 '09 at 21:35


Do you mean at the assembly language level or at the OS level?

At the assembly level, what happens when a method is called is that all arguments are pushed onto the stack and, finally, the address of the method is called (if the method is virtual, there is an additional lookup in the virtual method table). The code then continues from the method's address until the ret instruction is hit, and execution resumes from where the call was made. You should learn assembly and how C is compiled to get a good grasp of this process.

At the OS level there is no particular involvement in a method call. All the OS does is allocate processor time to the process, and the process is responsible for doing whatever it wants during that time, whether that is method calls or something else. Switching between threads, however, is performed by the OS (as opposed to user-level threads that a language runtime schedules itself).

+1

Asik Aug 16 '09 at 20:28


If you are interested in an assembly-level explanation of what is happening, I recommend watching this lecture from CS107 at Stanford University. I found that it explains the cost of function calls very well, in very plain English.

http://www.youtube.com/watch?v=FvpxXmEG1F8&feature=PlayList&p=9D558D49CA734A02&index=9

+1

Ian bishop Aug 16 '09 at 20:50


 TestClass t = new A();

I think you mean new TestClass() here.

As for what happens under the hood, the compiler will convert this code to Java bytecode. Here is an excerpt from an article on how the Java virtual machine handles calling and returning a method.

When the Java virtual machine calls a class method, it selects a method to call based on the type of the object reference, which is always known at compile time. On the other hand, when the virtual machine calls the instance method, it selects the method to call based on the actual class of the object, which can only be known at run time.
The JVM uses two different instructions, shown in the table below, to call these two kinds of methods: invokevirtual for instance methods and invokestatic for class methods.
Method invocation with invokevirtual and invokestatic
Opcode         Operand(s)              Description
invokevirtual  indexbyte1, indexbyte2  pop objectref and args, invoke method at constant pool index
invokestatic   indexbyte1, indexbyte2  pop args, invoke static method at constant pool index
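The difference between the two kinds of call can be seen in miniature in Python. This is only a loose analogy, not how the JVM works internally: Python has no invokevirtual or invokestatic, and the Base and Derived classes and their method names are invented for the example.

```python
class Base:
    def describe(self):
        # instance method: looked up on the actual class of the object
        return "Base.describe"

    @staticmethod
    def create_label():
        # static method: looked up on the class named in the call
        return "Base.create_label"

class Derived(Base):
    def describe(self):
        return "Derived.describe"

    @staticmethod
    def create_label():
        return "Derived.create_label"

ref = Derived()                         # imagine 'ref' declared as type Base in Java
virtual_result = ref.describe()         # chosen by the object's real class at run time
static_result = Base.create_label()     # chosen by the class written in the code
print(virtual_result, static_result)
```

The instance call lands in Derived even though nothing at the call site names Derived, while the static call goes wherever the class name in the source says.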

0

Pierre-Antoine LaFayette Aug 16 '09 at 20:29


What t.B(3, 4) does:

 push 4
 push 3
 call B
 add esp, 8 // release memory used

call pushes the address of the instruction immediately after the call onto the stack, then transfers control to the address of B():

 push ebp // save EBP state, the caller will need it later
 mov ebp, esp // save ESP state
 // push registers I would use other than EAX; I'm not using any
 sub esp, 4 // allocate 4 bytes on the stack to store "result"
 mov dword ptr [ebp-4], 1 // result = 1 (true)
 mov eax, dword ptr [ebp-4] // prepares the return value to be "result"
 add esp, 4 // frees the allocated space
 // pop registers
 mov esp, ebp
 pop ebp
 ret

About objects: when you create a new object, all that gets stored is the object's variables (its fields). In this case there are none.

About multiple threads: memory is shared between threads. Nothing happens to a thread's stack when the kernel switches the processor to another thread. The kernel simply freezes the thread and resumes it later.

0

Havenard Aug 16 '09 at 20:41


Eric J. · Accepted Answer · 2009-08-16T20:25:18+0000

I'm not quite sure what level of detail you're looking for, but here is my shot at explaining what is going on:

A new process is created for your executable file. This process has a stack segment containing each thread's stack, a data segment for static variables, a block of memory called the heap for dynamically allocated memory, and a code segment containing the compiled code.
Your code is loaded into the code segment, the instruction pointer is set to the first instruction in your main() method, and the code starts to run.
Object t is allocated from the heap. The address of t is stored on the stack (each thread has its own stack).
t.A() is called by placing the return address in main() on the stack and changing the instruction pointer to the beginning of the t.A() code. The return address is pushed onto the stack along with the values 1 and 2.
t.A() calls t.C() by placing the return address in t.A() on the stack and changing the instruction pointer to the starting address of the t.C() code.
t.C() returns by popping the return address in t.A() off the stack and setting the instruction pointer to that value.
t.A() returns in the same way as t.C().
Calling t.B() is very similar to calling t.A(), except that it returns a value. The exact mechanism for returning this value depends on the language and platform. Often the value will be returned in a CPU register.
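The nesting in the steps above can be observed from inside a running program. Here is a rough Python translation of the question's class (Python rather than C#; run_snippet and seen are names invented for this sketch). Note that Python's interpreter keeps frame objects of its own rather than exposing the raw machine stack, so this shows the chain of callers, not return addresses.

```python
import inspect

seen = []   # call chains recorded from inside C

class TestClass:
    def A(self, param1, param2):
        self.C()

    def C(self):
        # Walk from the current frame back through its callers.
        frame = inspect.currentframe()
        chain = []
        while frame is not None and len(chain) < 3:
            chain.append(frame.f_code.co_name)
            frame = frame.f_back
        seen.append(chain)

    def B(self, param1, param2):
        result = True
        return result

def run_snippet():
    t = TestClass()
    t.A(1, 2)
    return t.B(3, 4)

returned = run_snippet()
print(seen, returned)
```

While C is running, the frames for A and run_snippet are still waiting underneath it, which is exactly the information the return addresses on the stack encode.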

Note. Since your methods are very small, modern compilers will often "inline" them instead of making a classic call. Inlining means taking the code from the methods and injecting it directly into the main() method, rather than incurring the (slight) overhead of a function call.

Given your example, I don't see how threading could directly come into the picture. If you run the executable a second time, it will be launched in a new process. This means that it will get its own code segment, data segment and stack segment, completely isolating it from the first process.

If your code were run inside a larger program that called main() on multiple threads, it would behave almost the same as described above. The code is thread safe because it does not access any potentially shared resources, such as static variables. There is no way for Thread 1 to "see" Thread 2, because all key data (values and pointers to objects) is stored on the thread's own stack.
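A small Python sketch of that last point, assuming two threads run the same function. The function name work, the thread names and the results dictionary are invented for the example; the dictionary is shared on purpose, only to collect the answers, and each thread writes to its own key.

```python
import threading

results = {}    # shared, used only to collect each thread's final value

def work(name, start):
    # 'total' is a local variable: each thread running work() gets its own copy
    total = start
    for i in range(1000):
        total += i
    results[name] = total   # distinct keys, so the threads do not collide

threads = [
    threading.Thread(target=work, args=("thread-1", 0)),
    threading.Thread(target=work, args=("thread-2", 1_000_000)),
]
for th in threads:
    th.start()
for th in threads:
    th.join()
print(results)
```

Each thread ends with its own total no matter how the two are interleaved, because neither can reach the other's local variable.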
