(This assumes that Adam Rosenfield's solution is not applicable. It, or a similar approach, is probably the better way to solve this.)
You did not say how you emulate the %gs register, but patching every use is probably going to be hard in general unless you have special knowledge of the program, because in the worst, general case you only have 2 bytes that the patch can overwrite. Of course, if you use something like %es = %gs, it should be relatively straightforward.
Assuming this can somehow be made to work in your case, the strategy is to scan the executable sections of the ELF file and patch any instruction that uses or changes the GS register. That means at least the following instructions:
- Any instruction with a GS segment override prefix ( 65 ), except branch instructions, where the prefix means something else
- push gs ( 0F A8 )
- pop gs ( 0F A9 )
- mov r/m16, gs ( 8C /r )
- mov gs, r/m16 ( 8E /r )
- mov gs, r/m64 ( REX.W 8E /r ) (if you support 64-bit mode)
And any other instructions that accept segment registers (I don't think there are many more, but I'm not 100% sure).
All of this comes from the Intel® 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes 2A and 2B: Instruction Set Reference, A-Z. Keep in mind that instructions sometimes carry other prefixes and sometimes do not, so you should probably use a library to decode the instructions rather than blindly searching for byte sequences.
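As a rough illustration of the check itself, here is a minimal Python sketch that classifies a single instruction once a real decoder has told you where it starts and ends. The function name `touches_gs` and the `long_mode` flag are made up for this example. It deliberately does not find instruction boundaries, and it does not special-case branch instructions that carry a 65 prefix, so treat it as a starting point rather than a complete filter.

```python
# Legacy prefixes that may precede the opcode:
# segment overrides, operand/address size, LOCK, REPNE/REP.
LEGACY_PREFIXES = {0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65,
                   0x66, 0x67, 0xF0, 0xF2, 0xF3}
GS_OVERRIDE = 0x65
GS_SREG = 5  # ModRM.reg numbering: ES=0, CS=1, SS=2, DS=3, FS=4, GS=5


def touches_gs(insn: bytes, long_mode: bool = False) -> bool:
    """Return True if one already-delimited instruction uses or changes GS.

    `insn` must hold exactly one instruction, as reported by a decoder.
    Assumption: a 65 prefix is always reported, even where the CPU
    would ignore it or give it another meaning (e.g. on branches).
    """
    i = 0
    has_gs_prefix = False
    while i < len(insn) and insn[i] in LEGACY_PREFIXES:
        has_gs_prefix = has_gs_prefix or insn[i] == GS_OVERRIDE
        i += 1
    # 40..4F are REX prefixes only in 64-bit mode (inc/dec otherwise).
    if long_mode and i < len(insn) and 0x40 <= insn[i] <= 0x4F:
        i += 1
    if has_gs_prefix:
        return True
    rest = insn[i:]
    if rest[:2] in (b"\x0f\xa8", b"\x0f\xa9"):  # push gs / pop gs
        return True
    if len(rest) >= 2 and rest[0] in (0x8C, 0x8E):  # mov r/m, Sreg / mov Sreg, r/m
        return (rest[1] >> 3) & 7 == GS_SREG
    return False
```

For example, `touches_gs(bytes.fromhex("8ee8"))` (mov gs, ax) is true, while `touches_gs(bytes.fromhex("8ed8"))` (mov ds, ax) is false because the ModRM reg field selects DS.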
Some of the instructions above should be relatively straightforward to turn into a call my_patch or similar, but you will probably have trouble finding something that fits in two bytes and works in general. int XX ( CD XX ) may be a good candidate if you can set up an interrupt vector, but I'm not sure it will be any faster than the method you are currently using. Of course, you will need to record which instruction was patched and have the interrupt handler (or whatever you use) react differently depending on the return address (which your handler receives).
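The bookkeeping for the int XX variant could look roughly like the sketch below. The names `patch_with_int` and `lookup_patched`, the use of NOP padding, and keying the table by offset within the code buffer (rather than by the final virtual address) are all assumptions made for illustration; the vector number in the usage example is arbitrary.

```python
INT_OPCODE = 0xCD  # int imm8
NOP = 0x90


def patch_with_int(code: bytearray, offset: int, length: int,
                   vector: int, table: dict) -> None:
    """Overwrite the instruction at `offset` with `int vector`.

    Any remaining bytes of the original instruction are filled with
    NOPs, and the original bytes are recorded under the return address
    the handler will see (the byte right after `int XX`).
    """
    if length < 2:
        raise ValueError("int imm8 needs at least two bytes")
    if not 0 <= vector <= 0xFF:
        raise ValueError("interrupt vector must fit in one byte")
    original = bytes(code[offset:offset + length])
    code[offset:offset + 2] = bytes([INT_OPCODE, vector])
    code[offset + 2:offset + length] = bytes([NOP]) * (length - 2)
    table[offset + 2] = original


def lookup_patched(table: dict, return_address: int):
    """Return the original instruction bytes for a handler invocation,
    or None if the interrupt did not come from a patched site."""
    return table.get(return_address)


# Usage: patch `mov eax, gs:[0x14]` sitting between a nop and a ret.
code = bytearray.fromhex("90" "65a114000000" "c3")
table = {}
patch_with_int(code, offset=1, length=6, vector=0x42, table=table)
```

The handler would then call something like `lookup_patched(table, return_address)` and emulate the recorded instruction before returning.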
You might be able to set up trampolines if you can find room within -128..127 bytes and use JMP rel8 ( EB cb ) to jump to the trampoline (typically another JMP, but this time with more room for the target address), which then handles the instruction emulation and jumps back to the instruction after the patched %gs use.
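To make the range restriction concrete, here is a small Python sketch that encodes the two jumps involved. The helper names are hypothetical, and the addresses in the usage note are invented. Note that the displacement is measured from the end of the jump instruction, so the -128..127 window is relative to the patch site plus 2.

```python
import struct


def encode_jmp_rel8(site: int, target: int) -> bytes:
    """Encode `jmp rel8` (EB cb) placed at `site` and landing on `target`."""
    rel = target - (site + 2)  # relative to the next instruction
    if not -128 <= rel <= 127:
        raise ValueError("trampoline is out of rel8 range")
    return struct.pack("<Bb", 0xEB, rel)


def encode_jmp_rel32(site: int, target: int) -> bytes:
    """Encode `jmp rel32` (E9 cd) placed at `site` and landing on `target`."""
    rel = target - (site + 5)  # relative to the next instruction
    return struct.pack("<Bi", 0xE9, rel)
```

With a patch site at 0x1000 and a trampoline slot at 0x1040, `encode_jmp_rel8(0x1000, 0x1040)` yields `EB 3E`, and the trampoline's onward jump to an emulation routine at 0x2000 would be `encode_jmp_rel32(0x1040, 0x2000)`, that is `E9 BB 0F 00 00`.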
Finally, I would recommend keeping the trap-and-emulate code in place to catch any cases you might not have thought of (for example, self-modifying or injected code). That way, you can also log any unhandled cases and add them to your solution.