Python is a highly dynamic language and supports many levels of introspection. Because of this, obfuscating Python bytecode is a formidable task.
In addition, your embedded Python interpreter still has to be able to execute the bytecode you ship with your product. And if the interpreter can access the bytecode, then so can everyone else. Encryption will not help, because you still have to decrypt the bytecode yourself, and then anyone can read it from memory. Obfuscation only makes the default tools harder to use, not impossible to use.
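To illustrate how much a stock interpreter hands over, here is a minimal sketch using only standard-library introspection. The function `license_check` and its constant are hypothetical, made up for this example.

```python
import dis
import marshal


def license_check(key):
    secret = "hunter2"  # hypothetical constant, for illustration only
    return key == secret


code = license_check.__code__

# Everything the interpreter needs to run the function is readable from Python.
raw_bytes = code.co_code        # the raw bytecode, as a bytes object
constants = code.co_consts      # includes the "secret" string
local_names = code.co_varnames  # names of the arguments and local variables

# marshal is the serialization used inside .pyc files; a round trip
# through it yields an equivalent code object.
restored = marshal.loads(marshal.dumps(code))

if __name__ == "__main__":
    dis.dis(code)  # human-readable disassembly
```

Anything loaded into the process is reachable the same way, whether or not it was encrypted on disk.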
With that said, here is what you would need to do to make your application's Python bytecode really hard to read:
Reassign every Python opcode to a new value. Rewire the entire interpreter to use different byte values for different opcodes.
Remove as many introspection features as you can get away with. Your functions still need closures, and code objects still need constants, but to hell with the list of local names in the code object, for example. Neuter the sys._getframe() function, and slash traceback information.
Both of these steps require in-depth knowledge of how the Python interpreter works and how the Python object model fits together. You will almost certainly introduce bugs that are hard to track down.
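The opcode reassignment can be pictured as a byte-translation table. The sketch below only shows the table idea on a `co_code` byte string. It assumes the two-byte "wordcode" layout used since CPython 3.6 (opcode at even offsets, argument at odd offsets), and all names in it are hypothetical. A real implementation would have to change the interpreter's C source and compiler as well, which this does not attempt.

```python
import random


def make_opcode_table(seed):
    """Return (encode, decode) translation tables for a shuffled opcode set."""
    values = list(range(256))
    random.Random(seed).shuffle(values)
    encode = bytes(values)
    decode = bytearray(256)
    for original, remapped in enumerate(values):
        decode[remapped] = original
    return encode, bytes(decode)


def remap_opcodes(co_code, table):
    """Translate the bytes at even offsets, leaving the odd offsets untouched.

    Assumption: wordcode layout, so even offsets hold the opcode bytes.
    """
    out = bytearray(co_code)
    out[0::2] = bytes(out[0::2]).translate(table)
    return bytes(out)


def sample(a, b):
    return a * b + 1


encode, decode = make_opcode_table(seed=1234)
original = sample.__code__.co_code
scrambled = remap_opcodes(original, encode)   # what would be shipped
recovered = remap_opcodes(scrambled, decode)  # what an attacker with the table gets
```

The scrambled bytes will not run on a stock interpreter, but anyone who recovers the 256-entry table can undo the mapping in one pass.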
In the end, you should ask yourself whether it is worth it. A determined hacker can still analyze your bytecode, perform some frequency analysis to reconstruct the opcode table, and/or feed your program different opcodes to see what happens, and so undo all the obfuscation. Once a translation table exists, decompiling your bytecode is quick, and recovering your code is not far off.
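As a rough sketch of the first step of such a frequency analysis, the code below builds an opcode histogram from ordinary bytecode on a stock interpreter. An attacker could compare a histogram like this with the byte frequencies in the remapped files to guess which value stands for which opcode. The sample source and the helper name `opcode_histogram` are assumptions made for the example, and the actual matching step is left out.

```python
import dis
from collections import Counter


def opcode_histogram(code):
    """Count opcode names in a code object and in all nested code objects."""
    counts = Counter(ins.opname for ins in dis.get_instructions(code))
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested function or comprehension body
            counts += opcode_histogram(const)
    return counts


source = """
def area(w, h):
    return w * h

total = sum(area(n, n + 1) for n in range(10))
"""

module_code = compile(source, "<example>", "exec")
histogram = opcode_histogram(module_code)

if __name__ == "__main__":
    for name, count in histogram.most_common(5):
        print(f"{name:20} {count}")
```

Opcode frequencies in real programs are heavily skewed, with loads and calls dominating, which is what makes this kind of matching practical on a large enough sample.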
If all you want is to prevent the bytecode files from being modified, embed checksums for your .pyc files and check them at startup. Refuse to load if they do not match. Someone will patch your binary to remove the check or replace the checksums, but you do not have to put in nearly as much effort to provide at least token protection against tampering.
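A minimal sketch of such a startup check, assuming SHA-256 digests and a manifest that maps relative .pyc paths to digests. The function names and the manifest format are hypothetical choices for this example. How the manifest is embedded in your binary is up to you.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(root):
    """Map each .pyc file under root (as a relative path) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*.pyc"))
    }


def verify_manifest(root, manifest):
    """Return the files that are missing or whose digest differs."""
    root = Path(root)
    bad = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel_path)
    return bad


def check_or_exit(root, manifest):
    """Refuse to continue when any shipped .pyc file fails verification."""
    bad = verify_manifest(root, manifest)
    if bad:
        raise SystemExit("refusing to load, modified files: " + ", ".join(bad))
```

You would run `build_manifest` at build time, ship the result inside the executable, and call `check_or_exit` before importing any application module.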
Martijn Pieters