Why is an empty function call about 15% slower for dynamically compiled Python code?

This is pretty bad micro-optimization territory, but I'm just curious. It usually doesn't matter in the real world.

So I compile a function (which does nothing) using compile(), then exec that code and get a reference to the compiled function. Then I call it a couple of million times and time it. Then I repeat with a locally defined function. Why is the dynamically compiled function about 15% slower (on Python 2.7.2) just to call?

    import datetime

    def getCompiledFunc():
        cc = compile("def aa():pass", '<string>', 'exec')
        dd = {}
        exec cc in dd
        return dd.get('aa')

    compiledFunc = getCompiledFunc()

    def localFunc():pass

    def testCall(f):
        st = datetime.datetime.now()
        for x in xrange(10000000):
            f()
        et = datetime.datetime.now()
        return (et-st).total_seconds()

    for x in xrange(10):
        lt = testCall(localFunc)
        ct = testCall(compiledFunc)
        print "%s %s %s%% slower" % (lt, ct, int(100.0*(ct-lt)/lt))

The output I get looks something like this:

    1.139 1.319 15% slower
1 answer

The dis.dis() function shows that the code objects for the two versions are identical:

    aa
      1           0 LOAD_CONST               0 (None)
                  3 RETURN_VALUE

    localFunc
     10           0 LOAD_CONST               0 (None)
                  3 RETURN_VALUE

So the difference is in the function object. I compared each of its fields (func_doc, func_closure, etc.), and the one that differs is func_globals. In other words, localFunc.func_globals != compiledFunc.func_globals.
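To see this concretely (shown here with the Python 3 attribute names, where func_globals became __globals__), a minimal check, assuming the same compile/exec setup as in the question:

```python
def make_compiled():
    # Build the function in a fresh dict, as in the question.
    cc = compile("def aa(): pass", "<string>", "exec")
    dd = {}
    exec(cc, dd)
    return dd["aa"]

compiled = make_compiled()

def local(): pass

# The bytecode is identical...
assert compiled.__code__.co_code == local.__code__.co_code
# ...but the globals mapping is not: local() captured this module's
# globals, while compiled() captured the fresh dict passed to exec().
assert local.__globals__ is globals()
assert compiled.__globals__ is not globals()
```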

There is a cost to supplying your own dictionary instead of the real module globals: the custom dict has to be looked up when a stack frame is created on each call, whereas the real globals can be referenced directly by C code that already knows about the standard module globals dictionary.

This is easy to verify by changing the exec line in your code to:

    exec cc in globals(), dd

With that change, the time difference goes away.

The mystery is solved!
