Where is the string data stored?

I wrote a small C program:

    #include <stdio.h>

    int main()
    {
        char s[] = "Hello, world!";
        printf("%s\n", s);
        return 0;
    }

which compiles (on my Linux machine) to:

        .file   "hello.c"
        .text
        .globl  main
        .type   main, @function
    main:
    .LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $32, %rsp
        movq    %fs:40, %rax
        movq    %rax, -8(%rbp)
        xorl    %eax, %eax
        movl    $1819043144, -32(%rbp)
        movl    $1998597231, -28(%rbp)
        movl    $1684828783, -24(%rbp)
        movw    $33, -20(%rbp)
        leaq    -32(%rbp), %rax
        movq    %rax, %rdi
        call    puts
        movl    $0, %eax
        movq    -8(%rbp), %rdx
        xorq    %fs:40, %rdx
        je      .L3
        call    __stack_chk_fail
    .L3:
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    .LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2"
        .section        .note.GNU-stack,"",@progbits

I don't understand the assembly code, but I don't see the string message anywhere in it. So how does the executable know what to print?

+6
5 answers

Here:

    movl    $1819043144, -32(%rbp)    ; 1819043144 = 0x6C6C6548 = "lleH"
    movl    $1998597231, -28(%rbp)    ; 1998597231 = 0x77202C6F = "w ,o"
    movl    $1684828783, -24(%rbp)    ; 1684828783 = 0x646C726F = "dlro"
    movw    $33, -20(%rbp)            ; 33 = 0x0021 = "\0!"

In this particular case, the compiler emits inline instructions that build the string on the stack before the call to printf (which it has compiled into a call to puts). In other situations it may not do this, and may instead store the string constant in another section of memory. Bottom line: you cannot make any assumptions about how and where the compiler will generate and store string literals.

+12

The string is here:

    movl    $1819043144, -32(%rbp)
    movl    $1998597231, -28(%rbp)
    movl    $1684828783, -24(%rbp)

This copies a bunch of values onto the stack. These values are your string.

+3
String constants are stored in the binary of your application. Exactly where is up to your compiler.

+1

In assembly there is no concept of a "string". A "string" is really just a piece of memory. The string is stored somewhere in memory (where is up to the compiler), and you then manipulate that piece of data through its memory address (a pointer).

If your string is a constant, the compiler may choose to encode it as immediate operands in the instructions rather than storing it in a separate data section, which can be faster. This is your case, as Paul R noted:

    movl    $1819043144, -32(%rbp)
    movl    $1998597231, -28(%rbp)
    movl    $1684828783, -24(%rbp)

You cannot make assumptions about how the compiler will process your string.

+1

In addition to the above, the compiler can see that your string literal cannot be referenced directly (i.e. there cannot be any valid pointers to it), so it can simply copy it inline. If, however, you assign it to a character pointer instead, i.e.

char *s = "Hello, world!";

the compiler places the string literal somewhere in memory, since of course you can now point to it. With this modification, my machine produces:

    .LC0:
        .string "Hello, world!"
        .text
        .globl  main
        .type   main, @function

You can make one assumption about string literals: if a pointer is initialized with a literal, it points to a static char array stored somewhere in memory. As a result, the pointer is valid in any part of the program. For example, you can return a pointer to a string literal from a function, and it will still be valid.

0
