This will work, except that from the caller's perspective, your function modifies sp. In most 32-bit calling conventions, functions are only allowed to modify eax/ecx/edx, and must save / restore any other registers they want to use. I assume 16-bit is similar. (Although, of course, in asm you can write functions with whatever custom calling conventions you like.)
Some calling conventions expect the callee to pop the arguments pushed by the caller, so this would actually work in that case. The ret 4 in Matteo's answer does this. (See the x86 tag wiki for information on calling conventions and many other good links.)
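As a rough illustration of a callee-pops convention, here is a minimal 16-bit NASM sketch. The function name add_two, its two word-sized args, and the choice of a near call are assumptions made for the example, not something taken from the question.

```nasm
; 16-bit NASM sketch. add_two is a hypothetical near function
; taking two word args; it returns their sum in ax.

caller:
    mov   ax, 3
    push  ax            ; second arg
    mov   ax, 5
    push  ax            ; first arg (pushed last, so it ends up lowest)
    call  add_two       ; pushes a 2-byte return address
    ; no "add sp, 4" needed here: the callee already removed the args
    ret

add_two:
    push  bp
    mov   bp, sp        ; [bp] = saved bp, [bp+2] = return address
    mov   ax, [bp+4]    ; first arg
    add   ax, [bp+6]    ; second arg
    pop   bp
    ret   4             ; pop the return address, then add 4 to sp
```

With a caller-pops convention, the function would end in a plain ret and the caller would follow the call with add sp, 4.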
This is super-weird and not the best way to do things, which is why it isn't normally used. The biggest problem is that it only gives you access to the parameters in order, not random access. You can only access the first 6 or so args, because you run out of registers to pop them into.
It also ties up a register holding the return address. x86 (before x86-64) has very few registers, so this is really bad. You could push the return address back after popping the other function args into registers, I guess, to free that register up for use.
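A minimal sketch of that idea, assuming a hypothetical near function sub_args with two word args, where the caller pushes the first arg last and expects the callee to remove both:

```nasm
; 16-bit NASM sketch. sub_args is a hypothetical name.
; Clobbers ax, cx and dx; returns first - second in ax.
sub_args:
    pop   cx            ; return address (ties up cx for now)
    pop   ax            ; first arg
    pop   dx            ; second arg
    push  cx            ; put the return address back; cx is free again
    sub   ax, dx        ; ax = first - second
    ret                 ; returns with both args already off the stack
```

The args can only be popped in the order they sit on the stack, and each one needs a register of its own, which is the limitation described above.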
jmp ax would technically work instead of push / ret, but it defeats the return-address predictor, slowing down future ret instructions.
But anyway, making a stack frame with push bp / mov bp, sp is universally used in 16-bit code, because it is cheap and gives you random access to the stack. ([sp +/- constant] isn't a valid addressing mode in 16-bit code, although it is in 32 and 64-bit code; [bp +/- constant] is valid.) Then you can reload the args from the stack whenever you need to.
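For comparison, a sketch of the usual bp-based frame. The function name weighted, its three word args, and the caller-pops convention are all assumptions for illustration:

```nasm
; 16-bit NASM sketch. weighted is a hypothetical near function
; taking three word args; the caller removes them afterwards.
weighted:
    push  bp
    mov   bp, sp        ; [bp] = saved bp, [bp+2] = return address
    mov   ax, [bp+8]    ; third arg, read first: any order works
    add   ax, [bp+4]    ; first arg
    sub   ax, [bp+6]    ; second arg
    add   ax, [bp+4]    ; first arg again, simply reloaded from the stack
    pop   bp
    ret                 ; plain ret: the caller does "add sp, 6"
```

Only ax is used for data here; the args stay on the stack and no register is tied up holding the return address.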
In 32 and 64-bit code, it's common for compilers to use addressing modes like [esp + 8] or whatever, instead of wasting instructions and tying up ebp. (With gcc, -fomit-frame-pointer is on by default when optimizing.) This means you have to keep track of changes to esp to work out the correct offset for the same data in different instructions, so it's not popular in hand-written asm, especially in tutorials / teaching material. In real code, you obviously do whatever is most efficient, because if you were willing to sacrifice efficiency, you'd just use a C compiler.
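A small sketch of what that offset tracking looks like, assuming a hypothetical 32-bit function add_twice with two dword args on the stack (cdecl-style, caller-pops):

```nasm
; 32-bit NASM sketch. add_twice is a hypothetical name.
; Returns first + 2 * second in eax, without using ebp.
add_twice:
    mov   eax, [esp+4]      ; first arg, just above the return address
    push  ebx               ; esp drops by 4 ...
    mov   ebx, [esp+12]     ; ... so the second arg moved from [esp+8] to [esp+12]
    add   eax, ebx
    add   eax, ebx
    pop   ebx               ; restore the call-preserved register
    ret
```

The same arg has a different offset before and after the push, which is the bookkeeping a compiler handles automatically and a human has to do by hand.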