How to load compiled python modules from memory?

I need to read all modules (precompiled) from a zipfile (built by py2exe, compressed) into memory and then load them all. I know that this can be done by loading directly from a zip file, but I need to load them from memory. Any ideas? (I am using Python 2.5.2 on Windows.) TIA, Steve


It depends on what exactly you have as a "module (precompiled)". Suppose it is exactly the contents of a .pyc file, such as ciao.pyc, built like this:

    $ cat>'ciao.py'
    def ciao(): return 'Ciao!'
    $ python -c'import ciao; print ciao.ciao()'
    Ciao!

In other words, having built ciao.pyc this way, suppose you now do:

    $ python
    Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
    [GCC 4.0.1 (Apple Inc. build 5465)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> b = open('ciao.pyc', 'rb').read()
    >>> len(b)
    200

and your goal is to get from that byte string b to an importable module ciao. Here's how:

    >>> import marshal
    >>> c = marshal.loads(b[8:])
    >>> c
    <code object <module> at 0x65188, file "ciao.py", line 1>

This is how you get the code object from the binary contents of the .pyc. Edit: if you're curious, the first 8 bytes are a "magic number" and a timestamp; they are not needed here (unless you want to sanity-check them and raise exceptions if warranted, but that seems beyond the scope of the question; marshal.loads will raise anyway if it detects a corrupt string).
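For readers on Python 3, the same idea still applies, but the header is longer: as far as I know it is 16 bytes on CPython 3.7 and later (12 bytes on 3.3 to 3.6), not the 8 bytes of Python 2.5. Below is a minimal sketch under that assumption. The names code_from_pyc_bytes and HEADER_SIZE are my own, not part of any library, and the temporary .pyc is written only to have real bytes to demonstrate with.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile

# Assumption: CPython 3.7+, where a .pyc starts with a 16-byte header.
HEADER_SIZE = 16

def code_from_pyc_bytes(data):
    """Return the code object stored in the raw bytes of a .pyc file."""
    if data[:4] != importlib.util.MAGIC_NUMBER:
        raise ImportError("bad magic number: compiled by a different Python")
    return marshal.loads(data[HEADER_SIZE:])

# Produce some genuine .pyc bytes to work with, then forget the files.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "ciao.py")
    with open(src, "w") as f:
        f.write("def ciao(): return 'Ciao!'\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "ciao.pyc"), doraise=True)
    with open(pyc, "rb") as f:
        b = f.read()

c = code_from_pyc_bytes(b)  # a code object for the module body
```

Checking the magic number up front gives a clearer error than letting marshal.loads fail on bytes written by another interpreter version.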

Then:

    >>> import types
    >>> m = types.ModuleType('ciao')
    >>> import sys
    >>> sys.modules['ciao'] = m
    >>> exec c in m.__dict__

That is: create a new module object, install it in sys.modules, and populate it by executing the code object in its __dict__. Edit: the order in which you do the sys.modules insertion and the exec matters if and only if you may have circular imports, but this is the order Python's own import normally uses, so it is better to mimic it (doing so has no specific downsides).

You can "make a new module object" in several ways (for example, with functions in standard library modules such as new and imp), but "call the type to get an instance" is the normal Python way these days, and the normal place to get the type from (unless it has a built-in name or you already have it handy) is the standard library module types, so that is what I recommend.

Now, finally:

    >>> import ciao
    >>> ciao.ciao()
    'Ciao!'
    >>>

...and you can import the module and use its functions, classes, and so on. Other import (and from) statements will then find the module as sys.modules['ciao'], so you won't need to repeat this sequence of operations (indeed, you don't need this last import statement here if all you want is to make the module available for import from elsewhere; I'm adding it only to show that it works ;-).
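On Python 3, exec is a function rather than a statement, so the session above needs a small change. Here is a sketch of the same three steps (create, register, execute) under that assumption. install_module is a hypothetical helper name, and the code object is compiled from source only to keep the example self-contained; in practice it would come from marshal.loads.

```python
import sys
import types

def install_module(name, code):
    """Create a module, register it in sys.modules, then run `code` in it."""
    module = types.ModuleType(name)
    sys.modules[name] = module          # register first, as the import system does
    try:
        exec(code, module.__dict__)     # Python 3 spelling of `exec code in d`
    except BaseException:
        sys.modules.pop(name, None)     # do not leave a half-initialised module behind
        raise
    return module

# Stand-in for a code object recovered with marshal.loads(...).
code = compile("def ciao(): return 'Ciao!'\n", "ciao.py", "exec")
install_module("ciao", code)

import ciao                             # found in sys.modules; no file on disk is needed
print(ciao.ciao())                      # prints Ciao!
```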

Edit: If you absolutely must import packages and modules from them in this way, rather than "plain modules" as I just showed, that is doable too, but a bit more complicated. As this answer is already pretty long, and I hope you can simplify your life by sticking to plain modules for this purpose, I'm going to shirk that part of the answer ;-).

Also keep in mind that this may or may not do what you want in cases of "loading the same module from memory several times" (this rebuilds the module each time; you might want to check sys.modules and skip everything if the module is already there), and in particular when such repeated "loading from memory" happens from several threads (which needs locks; a better architecture is to have a single dedicated thread perform the task, with other modules communicating with it through a Queue).
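The "check sys.modules and skip" idea, combined with a lock, could look roughly like the sketch below. It is written for Python 3 and shows the simpler lock-based variant, not the dedicated-thread design. load_once and _load_lock are hypothetical names of my own; a re-entrant lock is used on the assumption that a module being loaded might itself trigger another load on the same thread.

```python
import sys
import threading
import types

_load_lock = threading.RLock()   # re-entrant, so nested loads on one thread do not deadlock

def load_once(name, code):
    """Return sys.modules[name] if present; otherwise build it from `code`."""
    with _load_lock:
        existing = sys.modules.get(name)
        if existing is not None:
            return existing               # already loaded: skip everything
        module = types.ModuleType(name)
        sys.modules[name] = module
        try:
            exec(code, module.__dict__)
        except BaseException:
            sys.modules.pop(name, None)   # undo registration on failure
            raise
        return module

# Stand-in code object; the module name mem_demo is made up for illustration.
code = compile("value = 42\n", "mem_demo.py", "exec")
first = load_once("mem_demo", code)
second = load_once("mem_demo", code)      # returns the same object, no rebuild
```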

Finally, there is no discussion here of how to install this functionality as a transparent "import hook" that automatically takes part in the internal machinery of the import statement itself. That is feasible too, but it is not exactly what you are asking about, so here as well I hope you can simplify your life by doing things the simple way, as this answer outlines.


A compiled Python file consists of

  • magic number (4 bytes) to determine the type and version of Python,
  • timestamp (4 bytes) to check if we have a newer source,
  • marshaled code object.
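The layout listed above can be taken apart with struct. This is only a sketch for the 8-byte header described here: split_legacy_pyc is a hypothetical helper, the bytes are synthetic, and the magic value 62131 is used purely as an illustrative number. It assumes the magic field is a 2-byte little-endian number followed by "\r\n" and that the timestamp is a little-endian 32-bit integer. Newer Python versions use a longer header, so this does not apply to them unchanged.

```python
import struct

def split_legacy_pyc(data):
    """Split 8-byte-header .pyc bytes into (magic, mtime, marshaled payload)."""
    if len(data) < 8:
        raise ValueError("too short to be a .pyc file")
    # "<4sI": 4 raw magic bytes, then a little-endian unsigned 32-bit timestamp.
    magic, mtime = struct.unpack("<4sI", data[:8])
    return magic, mtime, data[8:]

# Synthetic input: illustrative magic number, made-up timestamp, dummy payload.
header = struct.pack("<H", 62131) + b"\r\n" + struct.pack("<I", 1234567890)
magic, mtime, payload = split_legacy_pyc(header + b"payload")
```

The payload returned here is what would be handed to marshal.loads.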

To load a module, you create a module object using imp.new_module(), execute the unmarshaled code in the new module's namespace, and put the module in sys.modules. Here is a sample implementation:

    import sys, imp, marshal

    def load_compiled_from_memory(name, filename, data, ispackage=False):
        if data[:4] != imp.get_magic():
            raise ImportError('Bad magic number in %s' % filename)
        # Ignore timestamp in data[4:8]
        code = marshal.loads(data[8:])
        imp.acquire_lock()  # Required in threaded applications
        try:
            mod = imp.new_module(name)
            # To handle circular and submodule imports,
            # this must come before exec.
            sys.modules[name] = mod
            try:
                mod.__file__ = filename  # Is not so important.
                # For a package you have to set mod.__path__ here.
                # Here I handle simple cases only.
                if ispackage:
                    mod.__path__ = [name.replace('.', '/')]
                exec code in mod.__dict__
            except:
                del sys.modules[name]
                raise
        finally:
            imp.release_lock()
        return mod

Update: the code has been updated to handle packages correctly.

Note that you have to install an import hook to handle imports made inside the loaded modules. One way to do this is to add your finder to sys.meta_path. See PEP 302 for details.
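As a rough illustration of the sys.meta_path approach, here is a Python 3 sketch using the importlib interfaces that succeeded the original PEP 302 protocol. MemoryFinder and the module names mem_greet and mem_app are hypothetical, the code objects are compiled from source only for the demo, and packages are not handled.

```python
import importlib.abc
import importlib.util
import sys

class MemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve plain modules from a dict mapping module names to code objects."""

    def __init__(self, code_by_name):
        self._code = dict(code_by_name)

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self._code:
            return None                     # let the other finders try
        return importlib.util.spec_from_loader(fullname, self, is_package=False)

    def exec_module(self, module):
        # Run the stored code object in the freshly created module's namespace.
        exec(self._code[module.__name__], module.__dict__)

# Two in-memory modules; the second one imports the first.
sources = {
    "mem_greet": "def hello(): return 'Ciao!'\n",
    "mem_app": "import mem_greet\nmessage = mem_greet.hello()\n",
}
code_by_name = {name: compile(src, "<memory:%s>" % name, "exec")
                for name, src in sources.items()}
sys.meta_path.insert(0, MemoryFinder(code_by_name))

import mem_app                              # the nested import of mem_greet is resolved by the finder
```

With the finder installed, imports made inside in-memory modules go through the same lookup, which is the case the manual loader cannot handle on its own.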
