Many if not most instruction sets have pc-relative addressing, meaning the processor takes the address in the program counter, which is related to the address of the instruction being executed, adds an offset to it, and uses the result to access memory or branch or something like that. That is what you would call relocatable, because no matter where that instruction is in the address space, the thing it wants to reach is a relative distance away. Move the whole block of code and data to some other address and they are still the same distance from each other, so the relative addressing still works. A skip-if-equal instruction, for example, works wherever those three instructions are (the skip, the one being skipped, and the one after the skipped one).
Absolute addressing uses absolute addresses: jump to this exact address, read from this exact address. If equal then branch to 0x1000, for example.
The assembler does not do this for you; the compiler and/or the programmer does. Generally, compiled code ends up with some absolute addressing eventually, in particular if your code consists of separate objects that are linked together. At compile time the compiler cannot know where the object will end up, nor where the external references are or how far away they are, so it cannot assume they will be close enough for pc-relative addressing (which usually has a range limit). So compilers often generate a placeholder for the linker to fill in with the absolute address. How this external address problem gets solved depends on the instruction set and some other factors. Eventually though, depending on the size of the project, the linked result will have some absolute addressing. So position independence is usually not the default; there is typically a command line option for generating position-independent code, -fPIC for example may be the one your compiler uses. Both the compiler and the linker then have to do extra work to make the code position independent. An assembly language programmer has to do all of this themselves; the assembler generally does not get involved, it just creates machine code for the instructions you tell it to.
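To make the placeholder idea concrete, here is a minimal sketch in C of what filling one in amounts to. The `struct reloc` record and the `apply_relocs` function are made up for illustration; real object formats and linkers carry far more information, and this is not how any particular toolchain implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation record: "the word at word_index must hold the
   final absolute address of something that sits symbol_offset bytes
   from the start of the image". */
struct reloc {
    size_t   word_index;
    uint32_t symbol_offset;
};

/* What a linker conceptually does once it has picked a base address:
   fill every placeholder word with base + offset. */
void apply_relocs(uint32_t *image, const struct reloc *relocs,
                  size_t count, uint32_t base)
{
    assert(image != NULL);
    for (size_t i = 0; i < count; i++)
        image[relocs[i].word_index] = base + relocs[i].symbol_offset;
}
```

With a single record saying that word 7 holds the address of something 0x20 bytes into the image, a base of 0xD6000000 puts 0xD6000020 in that word and a base of 0x1000 puts 0x00001020 there. The word has to be rewritten for every new base, which is what makes it position dependent.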
novectors.s:
.globl _start
_start:
    b reset
reset:
    mov sp,#0xD8000000
    bl notmain
    ldr r0,=notmain
    blx r0
hang: b hang

.globl dummy
dummy:
    bx lr
hello.c
extern void dummy ( unsigned int );
int notmain ( void )
{
    unsigned int ra;
    for(ra=0;ra<1000;ra++) dummy(ra);
    return(0);
}
memmap (linker script)
MEMORY
{
    ram : ORIGIN = 0xD6000000, LENGTH = 0x4000
}
SECTIONS
{
    .text : { *(.text*) } > ram
}
Makefile
ARMGNU = arm-none-eabi
COPS = -Wall -O2 -nostdlib -nostartfiles -ffreestanding

all : hello_world.bin

clean :
	rm -f *.o
	rm -f *.bin
	rm -f *.elf
	rm -f *.list

novectors.o : novectors.s
	$(ARMGNU)-as novectors.s -o novectors.o

hello.o : hello.c
	$(ARMGNU)-gcc $(COPS) -c hello.c -o hello.o

hello_world.bin : memmap novectors.o hello.o
	$(ARMGNU)-ld novectors.o hello.o -T memmap -o hello_world.elf
	$(ARMGNU)-objdump -D hello_world.elf > hello_world.list
	$(ARMGNU)-objcopy hello_world.elf -O binary hello_world.bin
hello_world.list (the parts we care about)
Disassembly of section .text:

d6000000 <_start>:
d6000000:   eaffffff    b   d6000004 <reset>

d6000004 <reset>:
d6000004:   e3a0d336    mov sp, #-671088640 ; 0xd8000000
d6000008:   eb000004    bl  d6000020 <notmain>
d600000c:   e59f0008    ldr r0, [pc, #8]    ; d600001c <dummy+0x4>
d6000010:   e12fff30    blx r0

d6000014 <hang>:
d6000014:   eafffffe    b   d6000014 <hang>

d6000018 <dummy>:
d6000018:   e12fff1e    bx  lr
d600001c:   d6000020    strle   r0, [r0], -r0, lsr #32

d6000020 <notmain>:
d6000020:   e92d4010    push    {r4, lr}
d6000024:   e3a04000    mov r4, #0
d6000028:   e1a00004    mov r0, r4
d600002c:   e2844001    add r4, r4, #1
d6000030:   ebfffff8    bl  d6000018 <dummy>
d6000034:   e3540ffa    cmp r4, #1000   ; 0x3e8
d6000038:   1afffffa    bne d6000028 <notmain+0x8>
d600003c:   e3a00000    mov r0, #0
d6000040:   e8bd4010    pop {r4, lr}
d6000044:   e12fff1e    bx  lr
What I am showing here is a mixture of position-independent instructions and position-dependent instructions.
These two instructions, for example, use a shortcut (a pseudo-instruction) that makes the assembler add a .word-style memory location, which the linker then has to fill in for us.
ldr r0,=notmain
blx r0
0xD600001c is the location.
d600000c:   e59f0008    ldr r0, [pc, #8]    ; d600001c <dummy+0x4>
d6000010:   e12fff30    blx r0
...
d600001c:   d6000020    strle   r0, [r0], -r0, lsr #32
and it is filled in with the address 0xD6000020, which is an absolute address, so for this code to work the notmain function has to be at address 0xD6000020; it is not relocatable. But this part of the example also demonstrates some position-independent code,
ldr r0, [pc, #8]
is the pc-relative addressing I was talking about. The way this instruction set works, at execution time the pc is two instructions ahead, so in this case, with the instruction at 0xD600000C in memory, the pc will be 0xD6000014 when it executes; add 8 to that, as the instruction says, and you get 0xD600001C. But suppose we moved that exact same machine code instruction to address 0x1000, and moved all of the surrounding binary along with it, including the word it reads (0xD6000020). Basically this:
1000:   e59f0008    ldr r0, [pc, #8]
1004:   e12fff30    blx r0
...
1010:   d6000020
Those instructions, that machine code, will still work; it does not need to be re-assembled or re-linked. The code that 0xD6000020 points at does have to be at that fixed address; the pc-relative ldr and the blx do not.
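The arithmetic for that load can be sketched in C. `ldr_literal_address` is a hypothetical helper name, and the sketch assumes an ARM-state (not Thumb) ldr with an immediate offset whose base register is the pc, as in e59f0008.

```c
#include <assert.h>
#include <stdint.h>

/* Address read by an ARM-state "ldr rX, [pc, #imm]" located at insn_addr. */
uint32_t ldr_literal_address(uint32_t insn_addr, uint32_t insn)
{
    /* Bits 19:16 are the base register; 15 means the pc. */
    assert(((insn >> 16) & 0xFu) == 15u);

    uint32_t pc    = insn_addr + 8u;   /* pc reads two instructions ahead */
    uint32_t imm12 = insn & 0xFFFu;    /* bits 11:0 are the byte offset */

    /* Bit 23 (U) selects add or subtract. */
    return (insn & (1u << 23)) ? pc + imm12 : pc - imm12;
}
```

Calling `ldr_literal_address(0xD600000C, 0xe59f0008)` gives 0xD600001C, and the same machine code at 0x1000 gives 0x1010. The instruction word is identical in both cases; only the address it runs from changes.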
Although the disassembler shows them with 0xd6... addresses, the bl and bne are also pc-relative, which you can find out by looking at the instruction set documentation.
d6000030:   ebfffff8    bl  d6000018 <dummy>
d6000034:   e3540ffa    cmp r4, #1000   ; 0x3e8
d6000038:   1afffffa    bne d6000028 <notmain+0x8>
The instruction at 0xD6000030 will have a pc of 0xD6000038 when executed, and 0xD6000038-0xD6000018 = 0x20, which is 8 instructions. Negative 8 in twos complement is 0xFFF..FFFF8, and you can see that the bulk of the machine code ebfffff8 is fffff8; that is what gets sign-extended and added to the program counter to basically say branch backward 8 instructions. The same goes for the fffffa in 1afffffa: if not equal then branch backward 6 instructions. Remember that this instruction set (arm) assumes the pc is two instructions ahead, so back 6 means forward two and then back 6, or effectively back 4.
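Here is the same decoding as a short C sketch. `branch_target` is a hypothetical helper name, and it assumes an ARM-state b/bl (or conditional branch such as bne) encoding, where the low 24 bits are a signed offset counted in 4-byte instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Destination of an ARM-state branch located at insn_addr. */
uint32_t branch_target(uint32_t insn_addr, uint32_t insn)
{
    assert((insn_addr & 3u) == 0u);    /* ARM instructions are word aligned */

    uint32_t imm24 = insn & 0x00FFFFFFu;

    /* Sign extend from 24 to 32 bits. */
    if (imm24 & 0x00800000u)
        imm24 |= 0xFF000000u;

    /* The offset counts instructions, so multiply by 4; the pc is two
       instructions (8 bytes) ahead. Unsigned wraparound handles the
       negative offsets. */
    return insn_addr + 8u + (imm24 << 2);
}
```

`branch_target(0xD6000030, 0xebfffff8)` comes out as 0xD6000018 and `branch_target(0xD6000038, 0x1afffffa)` as 0xD6000028, matching the disassembly. Feed it the same two words at 0x1030 and 0x1038 and you get 0x1018 and 0x1028, without the machine code changing.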
If you remove
d600000c:   e59f0008    ldr r0, [pc, #8]    ; d600001c <dummy+0x4>
d6000010:   e12fff30    blx r0
then this whole program ends up being position independent, by accident if you will (I happened to know this would happen), not because I told the tools to do that but simply because I kept everything close together and did not use any absolute addressing.
Lastly, when you say “wherever the linker finds a place for them”: if you look at my linker script, I tell the linker to start at 0xD6000000 and I do not name any files or functions, so unless told otherwise this linker places the items in the order they are given on the command line. The hello.c code is second, so after the linker has placed the novectors.s code, wherever it has space left, right after that, the hello.c code starts, at 0xD6000020.
And an easy way to see what is position independent and what is not, without having to research every instruction, is to change the linker script to put the code at some other address.
MEMORY
{
    ram : ORIGIN = 0x1000, LENGTH = 0x4000
}
SECTIONS
{
    .text : { *(.text*) } > ram
}
and see what machine code changes, if any, and what not.
00001000 <_start>:
    1000:   eaffffff    b   1004 <reset>

00001004 <reset>:
    1004:   e3a0d336    mov sp, #-671088640 ; 0xd8000000
    1008:   eb000004    bl  1020 <notmain>
    100c:   e59f0008    ldr r0, [pc, #8]    ; 101c <dummy+0x4>
    1010:   e12fff30    blx r0

00001014 <hang>:
    1014:   eafffffe    b   1014 <hang>

00001018 <dummy>:
    1018:   e12fff1e    bx  lr
    101c:   00001020    andeq   r1, r0, r0, lsr #32

00001020 <notmain>:
    1020:   e92d4010    push    {r4, lr}
    1024:   e3a04000    mov r4, #0
    1028:   e1a00004    mov r0, r4
    102c:   e2844001    add r4, r4, #1
    1030:   ebfffff8    bl  1018 <dummy>
    1034:   e3540ffa    cmp r4, #1000   ; 0x3e8
    1038:   1afffffa    bne 1028 <notmain+0x8>
    103c:   e3a00000    mov r0, #0
    1040:   e8bd4010    pop {r4, lr}
    1044:   e12fff1e    bx  lr
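Comparing the two listings by eye works for a program this small. A rough sketch in C of the same comparison follows; the array and function names are made up for illustration, and the words are copied by hand from the two listings rather than read from the .bin files.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Machine code words when linked at 0xD6000000. */
static const uint32_t image_at_d6000000[] = {
    0xeaffffff, 0xe3a0d336, 0xeb000004, 0xe59f0008, 0xe12fff30, 0xeafffffe,
    0xe12fff1e, 0xd6000020, 0xe92d4010, 0xe3a04000, 0xe1a00004, 0xe2844001,
    0xebfffff8, 0xe3540ffa, 0x1afffffa, 0xe3a00000, 0xe8bd4010, 0xe12fff1e,
};

/* The same program linked at 0x1000. */
static const uint32_t image_at_1000[] = {
    0xeaffffff, 0xe3a0d336, 0xeb000004, 0xe59f0008, 0xe12fff30, 0xeafffffe,
    0xe12fff1e, 0x00001020, 0xe92d4010, 0xe3a04000, 0xe1a00004, 0xe2844001,
    0xebfffff8, 0xe3540ffa, 0x1afffffa, 0xe3a00000, 0xe8bd4010, 0xe12fff1e,
};

#define IMAGE_WORDS (sizeof image_at_1000 / sizeof image_at_1000[0])

/* Print every word that differs between the two images and return how
   many there were. Words that differ are the position-dependent ones. */
size_t print_differences(const uint32_t *a, const uint32_t *b, size_t words)
{
    assert(a != NULL && b != NULL);
    size_t count = 0;
    for (size_t i = 0; i < words; i++) {
        if (a[i] != b[i]) {
            printf("offset 0x%02zx: %08" PRIx32 " -> %08" PRIx32 "\n",
                   i * 4, a[i], b[i]);
            count++;
        }
    }
    return count;
}
```

Calling `print_differences(image_at_d6000000, image_at_1000, IMAGE_WORDS)` prints a single line, `offset 0x1c: d6000020 -> 00001020`, and returns 1. The only word that changed is the one the linker filled in for `ldr r0,=notmain`; every branch and the pc-relative load are bit for bit the same.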