What you need is an algebraic descriptor of what the operation codes do, and a set of algebraic laws that allow you to define equivalent operations.
Then for each instruction you will find its algebraic description (for example,
XOR eax,mem[ecx]
whose algebraic equivalent is <
eax exclusive_or mem[ecx]
list algebraic equivalences using these algebra equivalents, such as:
a exclusive_or b ==> (a and not b) or (b and not a)
to generate an equivalent algebraic expression for your XOR instruction
eax exclusive_or mem[ecx] ==> (eax and not mem[ecx]) or (mem[ecx] and not eax)
You can apply additional algebraic laws to this, for example, Morgan's theorem:
a or b ==> not (not a and not b)
To obtain
(not (not (eax and not mem[ecx])) and (not (mem[ecx] and not eax)))
At this point, you have a specification for algebraic computing that will do the same as the original. There is your brute force.
Now you need to βcompileβ this for machine instructions, matching which instructions will do what it says. Like any compiler, you probably want to optimize the generated code (it makes no sense to extract mem [ecx] twice). (All this is complicated ... its code generator!) The resulting code sequence will look something like this:
mov ebx, mem[ecx] mov edx, ebx not edx and edx, eax not eax and eax, ebx not eax or eax, edx
These are many mechanisms for manual assembly.
Another way to do this is to use a program conversion system that allows you to apply source transformations to a code source. Then you can code "equivalence" as you rewrite it directly in the code.
One of these tools is our DMS Software Reengineering Toolkit .
DMS accepts the definition of langauge (essentially, like EBNF), automatically implements the analyzer, AST constructor, and prettyprinter (anti-parser, turning the AST back into actual source text). [DMS does not currently have EBNF for ASM86, but dozens of EBNFs for various DMSs have created complex langauges, including several for different non-x86 assemblers. Therefore, you will need to define ASM86 EBNF for DMS. It is pretty simple; DMS has a really strong parser].
Using this, DMS will allow you to write source transformations directly in code. You can write the following transformations that directly implement the XOR equivalent and DeMorgan law:
domain ASM86; rule obfuscate_XOR(r: register, m: memory_access):instruction:instruction = " XOR \r, \m " rewrites to " MOV \free_register\(\),\m NOT \free_register\(\) AND \free_register\(\),\r NOT \r AND \r,\m OR \r,\free_register\(\)"; rule obfuscate_OR(r1: register, r2: register):instruction:instruction = " OR \r1, \r2" rewrites to " MOV \free_register\(\),\r1 NOT \free_register\(\) AND \free_register\(\),\r2 NOT \r2 AND \r1,\r2 NOT \r1";
with some extra magic in a meta-procedure called "free_register" that determines which registers are free at that point (AST matches) in the code. (If you do not want to do this, use the top of the stack as temporary everywhere, as in your example).
You will need a bunch of rewrites to cover all the cases you want to confuse, with their combinatorics with registers and memory operands.
Then, the transformation engine may be asked to apply these transformations at random one or more times at each point in the code to scramble it.
You can see a fully processed example of some algebraic transformations applied to DMS.