Resources on decompilation theory

There must be a million books and articles on the theory and techniques of building compilers. Are there any resources on doing the opposite? I am not interested in any particular hardware platform. I am looking for good books or research papers that explore the subject and its difficulties in depth.

+4
3 answers

I have worked on AS3 and Java decompilers, and I can assure you that everything I learned about decompilation came directly from compiler theory. Intermediate representations, data-flow analysis, term rewriting, and other related concepts are all covered in the Dragon Book.
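To make the link to compiler theory concrete, here is a minimal sketch of the core idea behind decompiling stack-based bytecode: run the instructions symbolically, pushing expression strings instead of values. The instruction set (`PUSH`, `LOAD`, `ADD`, `SUB`, `MUL`, `STORE`) and the `decompile` function are hypothetical and invented for illustration; a real decompiler would also have to recover control flow and types.

```python
# Toy decompiler for a hypothetical stack machine (not a real instruction set).
def decompile(instructions):
    """Rebuild source-like statements by symbolically executing the stack."""
    stack = []       # holds expression strings instead of runtime values
    statements = []  # recovered assignment statements, in program order
    binary_ops = {"ADD": "+", "SUB": "-", "MUL": "*"}
    for op, *args in instructions:
        if op == "PUSH":        # literal constant
            stack.append(str(args[0]))
        elif op == "LOAD":      # read a named variable
            stack.append(args[0])
        elif op in binary_ops:  # pop two operands, push the combined expression
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {binary_ops[op]} {right})")
        elif op == "STORE":     # an assignment ends the current expression
            statements.append(f"{args[0]} = {stack.pop()}")
        else:
            raise ValueError(f"unknown opcode: {op}")
    return statements


program = [
    ("LOAD", "a"), ("LOAD", "b"), ("ADD",),
    ("PUSH", 2), ("MUL",),
    ("STORE", "c"),
]
print(decompile(program))  # ['c = ((a + b) * 2)']
```

The stack of strings plays the role of a very crude intermediate representation; the same walk over the instructions is where data-flow analysis would hook in.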

+2

"Decompilation" is actually a misnomer. Decompilers compile object code into a source representation. In many ways they are easier to write than traditional compilers: their "source" code has already been syntax-checked and is usually very regularly formatted.

They build a symbol table (of addresses) and construct a representation of the application in the target language. A common difficulty is that the original compiler optimized the application to a greater or lesser extent by eliminating common subexpressions, hoisting loop-invariant code out of loops, and applying many other similar techniques. The result often cannot be expressed directly in the target language.
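As a rough illustration of why optimization gets in the way, the sketch below writes out by hand what a compiler might do to a small function. Both functions are hypothetical examples, and no real compiler is invoked. A decompiler only ever sees something shaped like the second one, and nothing in it says that the temporary `t` did not exist in the original source.

```python
def original(values, scale, offset):
    # What the programmer wrote: (scale + offset) appears twice per iteration.
    out = []
    for v in values:
        out.append(v * (scale + offset) + (scale + offset))
    return out


def as_optimized(values, scale, offset):
    # Hand-written equivalent of the optimized code: the common subexpression
    # is computed once and, being loop-invariant, hoisted out of the loop.
    t = scale + offset
    out = []
    for v in values:
        out.append(v * t + t)
    return out


print(original([1, 2, 3], 2, 3))      # [10, 15, 20]
print(as_optimized([1, 2, 3], 2, 3))  # [10, 15, 20]
```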

When the object code targets a well-defined virtual machine, most of this optimization is often left to the JIT compiler, so the decompiled code is very readable; in many cases it is almost identical to the original. Compilers of this type often leave some or all of the symbols in the object code, allowing them to be recovered. Others include line numbers to help with debugging and troubleshooting. All of this helps in restoring the source code.
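CPython is used here only as a convenient stand-in for a virtual machine that keeps symbols and line numbers in its compiled code; this is an assumption for illustration, not the platform the answer discusses. The sketch reads the names and the line table straight out of a compiled function object, where `area` is a made-up example function.

```python
import dis


def area(width, height):
    margin = 2
    return (width + margin) * (height + margin)


code = area.__code__

# Function name, parameter names and local names survive compilation.
print(code.co_name)      # area
print(code.co_varnames)  # ('width', 'height', 'margin')

# The line table maps bytecode offsets back to source line numbers.
line_numbers = sorted(
    {line for _, line in dis.findlinestarts(code) if line is not None}
)
print(line_numbers)

# The bytecode itself, annotated with the recovered names.
dis.dis(area)
```

With this much metadata available, a decompiler can reuse the original identifiers and even lay out statements on their original lines.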

As a countermeasure, there are code obfuscators that deliberately transform the code to prevent simple recovery of the original source: they scramble names, change the order in which code is generated (without changing its result), and introduce constructs that have no equivalent in the source language.
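Name scrambling is the easiest of these transformations to sketch. The example below is a toy, not how any particular obfuscator works: it uses Python's `ast` module (Python 3.9+ for `ast.unparse`) to replace every parameter and variable name with a meaningless one. The class `RenameLocals` and the sample function are hypothetical, and the renamer is deliberately naive: it would also rename builtins or globals if the sample used any.

```python
import ast


class RenameLocals(ast.NodeTransformer):
    """Replace every parameter and variable name with v0, v1, ..."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        # Assign the next meaningless name the first time a name is seen.
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):   # function parameters
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):  # variable reads and writes
        node.id = self._alias(node.id)
        return node


source = (
    "def total_price(unit_price, quantity):\n"
    "    subtotal = unit_price * quantity\n"
    "    return subtotal\n"
)

tree = RenameLocals().visit(ast.parse(source))
obfuscated = ast.unparse(tree)
print(obfuscated)
```

The transformed function computes the same result, but a decompiler can no longer recover `unit_price`, `quantity`, or `subtotal`, because that information is simply gone.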

+1

I wrote about decompilers for dynamic languages here.

Please note that this applies to dynamic languages with custom (high-level) virtual machines.

0