What do the zeros mean in Python function bytecode?

I'm trying to learn how Python bytecode works so that I can do things with a function's code (just for fun, not for real use). I started with some simple examples, such as:

def f(x): return x + 3/x 

Bytecode - *:

 (124, 0, 0, 100, 1, 0, 124, 0, 0, 20, 23, 83) 

So it makes sense to me that 124 is LOAD_FAST, and that the name of the object being loaded is f.__code__.co_varnames[0], where 0 is the number after 124. And 100 indicates LOAD_CONST, loading f.__code__.co_consts[1], where 1 is the number after 100. But then there are a bunch of extra zeros, such as the second, third, and fifth zeros, which seem to serve no purpose, at least to me. What do they indicate?

Disassembled bytecode:

    >>> dis.dis(f)
      2           0 LOAD_FAST                0 (x)
                  3 LOAD_CONST               1 (3)
                  6 LOAD_FAST                0 (x)
                  9 BINARY_DIVIDE
                 10 BINARY_ADD
                 11 RETURN_VALUE

* Note: In Python 3 (where the bytecode may differ from the above), the bytecode can be found via:

    >>> list(f.__code__.co_code)
    [124, 0, 100, 1, 124, 0, 27, 0, 23, 0, 83, 0]
2 answers

A large number of bytecodes take arguments (any bytecode whose opcode number is at or above dis.HAVE_ARGUMENT). Those that do have a 2-byte argument, in little-endian order.

You can find the definitions of the bytecodes Python currently uses, and what they mean, in the dis documentation.

With 2 bytes you can give any bytecode an argument value between 0 and 65535. For bytecodes that need more than that, you can prefix the bytecode with the EXTENDED_ARG bytecode, adding another 2 bytes for a value between 0 and 4294967295. In theory you could use EXTENDED_ARG several times, but the CPython interpreter uses an int for the oparg variable and so, for practical purposes, is limited to 4-byte values.
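As a rough illustration of the layout described above, here is a minimal sketch of a decoder for the old format, applied to the tuple from the question. The function name decode_legacy is made up for this example, and the two opcode numbers are hard-coded on the assumption that they match Python 2.7 (on a real interpreter you would read them from the dis module instead).

```python
# Assumed Python 2.7 values, hard-coded so the sketch runs on any interpreter.
HAVE_ARGUMENT = 90   # opcodes at or above this are followed by a 2-byte argument
EXTENDED_ARG = 145   # prefix opcode that supplies the upper 16 bits of the next argument


def decode_legacy(code):
    """Yield (offset, opcode, arg) tuples from pre-3.6 style bytecode.

    Hypothetical helper for illustration; arg is None for opcodes
    that take no argument.
    """
    i = 0
    extended = 0
    while i < len(code):
        offset = i
        op = code[i]
        i += 1
        arg = None
        if op >= HAVE_ARGUMENT:
            # Low byte first, then high byte: little-endian.
            arg = code[i] | (code[i + 1] << 8) | extended
            i += 2
            extended = 0
            if op == EXTENDED_ARG:
                # Carry these 16 bits over as the high half of the next argument.
                extended = arg << 16
        yield offset, op, arg


question_bytecode = (124, 0, 0, 100, 1, 0, 124, 0, 0, 20, 23, 83)
decoded = list(decode_legacy(question_bytecode))
for offset, op, arg in decoded:
    print(offset, op, arg)
# 0 124 0
# 3 100 1
# 6 124 0
# 9 20 None
# 10 23 None
# 11 83 None
```

The offsets 0, 3, 6, 9, 10, 11 line up with the ones in the dis output in the question: the three opcodes that take an argument occupy 3 bytes each, the others 1 byte.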

As of Python 3.4, the dis module provides Instruction instances that make it easier to introspect each bytecode and its arguments. Using these, we can walk through the bytecodes you found for your function f:

    >>> def f(x):
    ...     return x + 3/x
    ...
    >>> f.__code__.co_varnames
    ('x',)
    >>> f.__code__.co_consts
    (None, 3)
    >>> import dis
    >>> instructions = dis.get_instructions(f)
    >>> instructions
    <generator object _get_instructions_bytes at 0x10be77048>
    >>> instruction = next(instructions)
    >>> instruction
    Instruction(opname='LOAD_FAST', opcode=124, arg=0, argval='x', argrepr='x', offset=0, starts_line=2, is_jump_target=False)

So the first opcode, 124 or LOAD_FAST, pushes the value of the first local name onto the stack. Its argument is the bytes 0 0, which read as little-endian give the integer 0, an index into the code object's array of locals. dis has populated the argval attribute, showing that the first local name is x. The session above also shows how you can inspect the code object to see the list of names.

    >>> instruction = next(instructions)
    >>> instruction
    Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval=3, argrepr='3', offset=3, starts_line=None, is_jump_target=False)

The next instruction pushes a constant onto the stack. The argument is now 1 0, which is little-endian for the integer 1: the second constant associated with the code object. The tuple f.__code__.co_consts shows that this is 3, and the Instruction object also gives it as its argval attribute.

    >>> next(instructions)
    Instruction(opname='LOAD_FAST', opcode=124, arg=0, argval='x', argrepr='x', offset=6, starts_line=None, is_jump_target=False)

Then we have another LOAD_FAST, pushing another reference to the local name x onto the stack.

    >>> next(instructions)
    Instruction(opname='BINARY_TRUE_DIVIDE', opcode=27, arg=None, argval=None, argrepr='', offset=9, starts_line=None, is_jump_target=False)

This is a bytecode with no argument; opcode 27 is below dis.HAVE_ARGUMENT. No argument is needed because this opcode takes the top two values on the stack, divides them, and pushes the floating-point result back onto the stack. So the constant 3 and the last-pushed x are taken and divided, and the result is pushed back.

    >>> next(instructions)
    Instruction(opname='BINARY_ADD', opcode=23, arg=None, argval=None, argrepr='', offset=10, starts_line=None, is_jump_target=False)

Another bytecode with no argument; this one adds the top two values on the stack, replacing them with the result. The result of BINARY_TRUE_DIVIDE is taken, along with the value of x that was pushed first, and their sum is pushed back onto the stack.

    >>> next(instructions)
    Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=11, starts_line=None, is_jump_target=False)

The last instruction, and another one that takes no argument. RETURN_VALUE ends the current frame, returning the top value from the stack as the result to the caller.
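To make the stack behaviour in this walkthrough concrete, here is a toy stack machine that replays the six instructions for x = 2. It is only a sketch: run_toy and its (opname, arg) tuples are invented for this example and are not how CPython's evaluation loop is written.

```python
import operator


def run_toy(instructions, local_values, consts):
    """Replay a list of (opname, arg) pairs on a plain Python list used as a stack.

    Hypothetical helper: it only understands the handful of opcodes
    that appear in the walkthrough.
    """
    binary_ops = {
        "BINARY_TRUE_DIVIDE": operator.truediv,
        "BINARY_ADD": operator.add,
    }
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_FAST":
            stack.append(local_values[arg])   # arg indexes the locals
        elif opname == "LOAD_CONST":
            stack.append(consts[arg])         # arg indexes co_consts
        elif opname in binary_ops:
            right = stack.pop()               # top of stack is the right operand
            left = stack.pop()
            stack.append(binary_ops[opname](left, right))
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unsupported opcode: " + opname)
    raise ValueError("instruction stream ended without RETURN_VALUE")


program = [
    ("LOAD_FAST", 0),
    ("LOAD_CONST", 1),
    ("LOAD_FAST", 0),
    ("BINARY_TRUE_DIVIDE", None),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
result = run_toy(program, local_values=[2], consts=(None, 3))
print(result)  # 3.5, i.e. 2 + 3/2
```

Just before BINARY_TRUE_DIVIDE runs, the stack holds x, 3, x from bottom to top, so the division sees 3 as the left operand and x as the right one.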


Prior to CPython 3.6, CPython bytecode arguments take up 2 bytes. The extra zeros are the high bytes of the arguments.
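For contrast, the Python 3 listing in the question has a different shape. The sketch below assumes it was produced by CPython 3.6 or later, where every instruction is two bytes: one opcode byte followed by one argument byte, which is present (and zero here) even for opcodes that take no argument. The variable names are made up for the example.

```python
# The Python 3 listing from the question, assumed to come from CPython 3.6+.
wordcode = [124, 0, 100, 1, 124, 0, 27, 0, 23, 0, 83, 0]

# Split into (opcode, argument byte) pairs, two bytes per instruction.
pairs = [(wordcode[i], wordcode[i + 1]) for i in range(0, len(wordcode), 2)]
print(pairs)
# [(124, 0), (100, 1), (124, 0), (27, 0), (23, 0), (83, 0)]
```

So the zeros after 27, 23 and 83 in that listing are unused argument bytes, not high bytes of a 2-byte argument.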
