Writing MIPS machine instructions and executing them with C

I am trying to write some self-modifying code in C and MIPS.

Since I want to modify the code later on, I am trying to write actual machine instructions (as opposed to inline assembly) and execute them. Someone told me that it would be possible to just malloc some memory, write the instructions there, point a C function pointer at it, and then jump to it. (I include an example below.)

I tried this with my cross-compiler (the Sourcery CodeBench toolchain) and it doesn’t work (yes, in hindsight, I suppose it does seem rather naive). How could I do this properly?

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    void inc(){
        int i = 41;
        uint32_t *addone = malloc(sizeof(*addone) * 2); //we malloc space for our asm function
        *(addone) = 0x20820001; // this is addi $v0 $a0 1, which adds one to our arg (gcc calling con)
        *(addone + 1) = 0x23e00000; //this is jr $ra
        int (*f)(int x) = addone; //our function pointer
        i = (*f)(i);
        printf("%d",i);
    }

    int main(){
        inc();
        exit(0);
    }

I follow the gcc calling convention here, where arguments are passed in $a0 and function results are expected in $v0. I don’t actually know whether the return address will be put into $ra (I can’t check yet, because I can’t compile it). I use int for my instructions because I am compiling for MIPS32, so a 32-bit int should be enough.

+2
4 answers

The OP's code as written compiles without errors with the CodeSourcery mips-linux-gnu-gcc.

As other answers mention, self-modifying code on MIPS requires the instruction cache to be synchronized with the data cache after the code is written. Release 2 of the MIPS32 architecture (MIPS32R2) added the SYNCI instruction, a user-mode instruction that does what you need here. All modern MIPS CPUs implement MIPS32R2, including SYNCI.

Memory protection is an optional feature on MIPS, and most MIPS CPUs are not built with it, so the mprotect system call is probably not needed on most real MIPS hardware.

Note that if you use any optimization level other than -O0, the compiler can and does optimize away the stores to *addone and the function call, which breaks your code. Using the volatile keyword prevents the compiler from doing this.

The following code generates correct MIPS assembly, but I do not have MIPS hardware to test it on:

    #include <stdio.h>
    #include <stdlib.h>

    int inc() {
        volatile int i = 41;
        // malloc 8 x sizeof(int) to allocate 32 bytes, i.e. one cache line.
        // Note: malloc does not by itself guarantee cache-line alignment, so
        // an aligned allocation is needed for a single SYNCI to cover the code.
        volatile int *addone = malloc(sizeof(*addone) * 8);
        *(addone)     = 0x20820001; // addi $v0, $a0, 1
        *(addone + 1) = 0x03e00008; // jr $ra (0x23e00000 decodes as addi $zero, $ra, 0)
        *(addone + 2) = 0x00000000; // nop, fills the branch delay slot of jr
        // use a SYNCI instruction to flush the data written above from
        // the D cache and to flush any stale data from the I cache
        asm volatile("synci 0(%0)": : "r" (addone));
        int (*f)(int x) = (int (*)(int))addone; // our function pointer
        int j = (*f)(i);
        return j;
    }

    int main(){
        int k = 0;
        k = inc();
        printf("%d",k);
        exit(0);
    }
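As an alternative to issuing SYNCI by hand, GCC and Clang provide the builtin __builtin___clear_cache, which asks the compiler to do whatever cache maintenance the target needs for a range of freshly written code. The sketch below is only an illustration: install_code is a hypothetical helper name, and I have not checked what the builtin expands to on any particular MIPS toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy `count` instruction words into `dst`, then ask
 * the compiler to make them visible to instruction fetch. */
void install_code(uint32_t *dst, const uint32_t *words, size_t count)
{
    assert(dst != NULL && words != NULL);
    memcpy(dst, words, count * sizeof *words);
    /* GCC/Clang builtin covering the byte range [begin, end).
     * On targets with coherent caches it is effectively a no-op. */
    __builtin___clear_cache((char *)dst, (char *)(dst + count));
}
```

This only deals with cache coherence. Whether the memory is executable at all is a separate question.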
+2

You are using pointers incorrectly. Or, to be more precise, you are not using pointers where you should be.

Try this for size:

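    uint32_t *addone = malloc(sizeof(*addone) * 3);
    addone[0] = 0x20820001; // addi $v0, $a0, 1
    addone[1] = 0x03e00008; // jr $ra (0x23e00000 decodes as addi $zero, $ra, 0)
    addone[2] = 0x00000000; // nop (branch delay slot)
    int (*f)(int x) = (int (*)(int))addone; //our function pointer
    i = (*f)(i);
    printf("%d\n",i);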

You may also need to set the memory as executable after writing it, but before calling it:

 mprotect(addone, sizeof(int) * 2, PROT_READ | PROT_EXEC); 

To do this, you may need to allocate a considerably larger block of memory (4 KB or so) so that the address is page-aligned.
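A minimal sketch of the page-aligned allocation, assuming a POSIX system. The names alloc_code_page and seal_code_page are hypothetical helpers introduced here for illustration, and posix_memalign is just one way to get a page-aligned block.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: allocate exactly one page, aligned to the page size,
 * so that a later mprotect affects only this block. Returns NULL on failure. */
uint32_t *alloc_code_page(size_t *page_size_out)
{
    long page = sysconf(_SC_PAGESIZE);
    void *mem = NULL;
    if (page <= 0 || posix_memalign(&mem, (size_t)page, (size_t)page) != 0)
        return NULL;
    if (page_size_out != NULL)
        *page_size_out = (size_t)page;
    return mem;
}

/* Hypothetical helper: once the instructions are written, switch the page
 * from read/write to read/execute. Returns mprotect's result (0 on success). */
int seal_code_page(void *mem, size_t page_size)
{
    assert(((uintptr_t)mem % page_size) == 0);
    return mprotect(mem, page_size, PROT_READ | PROT_EXEC);
}
```

If the block is ever passed to free, it would probably have to be made writable again first, since the allocator may write into freed memory.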

+2

You also need to make sure that the memory in question is executable, and that it is properly flushed from the dcache after you write it and loaded into the icache before you execute it. How to do this depends on the OS running on your MIPS machine.

On Linux, you would use the mprotect system call to make the memory executable, and the cacheflush system call to flush the caches.

Edit:

Example:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/cachectl.h>

    // round P down to the start of the page that contains it
    #define PALIGN(P) ((char *)((uintptr_t)(P) & ~(pagesize-1)))

    uintptr_t pagesize;

    void inc(){
        int i = 41;
        uint32_t *addone = malloc(sizeof(*addone) * 3); //we malloc space for our asm function
        *(addone) = 0x20820001;     // addi $v0, $a0, 1: adds one to our arg (gcc calling con)
        *(addone + 1) = 0x03e00008; // jr $ra (0x23e00000 decodes as addi $zero, $ra, 0)
        *(addone + 2) = 0x00000000; // nop in the branch delay slot
        pagesize = sysconf(_SC_PAGESIZE); // only needs to be done once
        mprotect(PALIGN(addone), PALIGN(addone+2)-PALIGN(addone)+pagesize, PROT_READ | PROT_WRITE | PROT_EXEC);
        cacheflush(addone, 3*sizeof(*addone), ICACHE|DCACHE);
        int (*f)(int x) = (int (*)(int))addone; //our function pointer
        i = (*f)(i);
        printf("%d",i);
    }

Note that we make the entire page(s) containing the code both writable and executable. This is because memory protection works per page, and we want malloc to be able to keep using the rest of the page(s) for other things. You could instead use valloc or memalign to allocate entire pages, in which case you could safely make the code read-only and executable.
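Because protection is applied per page, the range handed to mprotect has to start on a page boundary and cover every page the code touches. The arithmetic can be sketched as below. The names page_floor and page_span are hypothetical, and the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address down to the start of the page containing it.
 * page_size must be a nonzero power of two. */
uintptr_t page_floor(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Total size in bytes of the whole pages that cover [addr, addr + len). */
size_t page_span(uintptr_t addr, size_t len, uintptr_t page_size)
{
    assert(len > 0);
    uintptr_t first = page_floor(addr, page_size);
    uintptr_t last  = page_floor(addr + len - 1, page_size);
    return (size_t)(last - first + page_size);
}
```

For a block at address p of n bytes, the call would then look like mprotect((void *)page_floor(p, ps), page_span(p, n, ps), ...), with ps taken from sysconf(_SC_PAGESIZE).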

+2

Calling a function is much more complicated than just jumping to an instruction.

  • How are arguments passed? Are they stored in registers or placed on the call stack?

  • How is the value returned?

  • Where is the return address for the return jump stored? If you have a recursive function, $ra alone doesn't cut it.

  • Is the caller or the callee responsible for popping the stack frame when the called function returns?

Different calling conventions have different answers to these questions. Although I have never tried anything like what you are doing, I would assume that you have to write your machine code to match a convention, and then tell the compiler that your function pointer uses that convention (different compilers have different ways of doing this; gcc does it with function attributes).
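For the MIPS O32 convention, the answers are that the first integer argument arrives in $a0 (register 4), the result goes in $v0 (register 2), and jal/jalr leave the return address in $ra (register 31). A sketch of encoding instructions against those register numbers follows. The helper names are made up for illustration, and only the I-type and R-type formats are covered.

```c
#include <assert.h>
#include <stdint.h>

/* O32 register numbers used in this sketch. */
enum { REG_V0 = 2, REG_A0 = 4, REG_RA = 31 };

/* I-type layout: opcode(6) | rs(5) | rt(5) | immediate(16) */
uint32_t mips_itype(unsigned op, unsigned rs, unsigned rt, uint16_t imm)
{
    assert(op < 64 && rs < 32 && rt < 32);
    return ((uint32_t)op << 26) | ((uint32_t)rs << 21) |
           ((uint32_t)rt << 16) | imm;
}

/* R-type layout: opcode 0 | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6) */
uint32_t mips_rtype(unsigned rs, unsigned rt, unsigned rd,
                    unsigned shamt, unsigned funct)
{
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return ((uint32_t)rs << 21) | ((uint32_t)rt << 16) |
           ((uint32_t)rd << 11) | ((uint32_t)shamt << 6) | funct;
}

/* addi rt, rs, imm uses opcode 0x08. */
uint32_t mips_addi(unsigned rt, unsigned rs, int16_t imm)
{
    return mips_itype(0x08, rs, rt, (uint16_t)imm);
}

/* jr rs is R-type with funct 0x08. */
uint32_t mips_jr(unsigned rs)
{
    return mips_rtype(rs, 0, 0, 0, 0x08);
}
```

Here mips_addi(REG_V0, REG_A0, 1) gives 0x20820001, matching the question, while mips_jr(REG_RA) gives 0x03e00008. The 0x23e00000 used in the question decodes as addi $zero, $ra, 0 rather than jr $ra. A jr also has a branch delay slot, so the word after it is executed too and would normally be a nop (0x00000000).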

0