Character constant: \000 \xhh

Can someone explain the use of the character constants \000 and \xhh, i.e. octal and hexadecimal numbers in character constants?

3 answers

In C, strings end with a character with a null value (0). This can be written like this:

char zero = 0; 

but this does not work inside strings. String literals have a special syntax in which a backslash introduces an escape sequence, and there are several kinds of escape sequence.

One such sequence is a backslash followed by a zero (\0), which simply means a character with the value zero. So you can write things like this:

 char hard[] = "this\0has embedded\0zero\0characters"; 
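To make the effect of the embedded zeros concrete, here is a minimal sketch using the same literal. The helper name count_pieces is made up for illustration; it walks the array and counts the zero-terminated pieces. Note that strlen stops at the first \0, while sizeof reports the whole array, including the extra terminator the compiler appends.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same literal as in the answer; the compiler appends one more
 * terminating zero after "characters". */
static const char hard[] = "this\0has embedded\0zero\0characters";

/* Hypothetical helper (not from the original answer): counts the
 * zero-terminated pieces stored in buf[0..size). It assumes the last
 * byte of the buffer is a zero, which holds for a string literal. */
static size_t count_pieces(const char *buf, size_t size)
{
    size_t pieces = 0;
    size_t pos = 0;

    while (pos < size) {
        pos += strlen(buf + pos) + 1; /* skip this piece and its terminator */
        pieces++;
    }
    return pieces;
}
```

With this literal, strlen(hard) sees only "this", while count_pieces(hard, sizeof hard) visits all four pieces.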

Another sequence is a backslash followed by the letter 'x' and one or more hexadecimal digits, representing the character with the specified code. Using this syntax, you can write a null byte as '\x0', for example.

EDIT: re-reading the question, such constants are also supported in base eight, that is, octal. They use a backslash followed by one to three octal digits; unlike octal integer constants, no leading zero is required. In fact, '\0' is itself an octal escape, so '\00' is synonymous with '\0'.
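As a small sanity check of these spellings, the following sketch asserts that the octal and hexadecimal forms denote the same values. The function name check_escape_spellings is an assumption made up for this example, and the remark about 'A' assumes an ASCII-compatible character set.

```c
#include <assert.h>

/* Hypothetical helper (illustration only): returns 1 if every spelling
 * below has the expected value; a failing assert aborts the program. */
static int check_escape_spellings(void)
{
    /* Four spellings of the null character. */
    assert('\0' == 0);
    assert('\00' == '\0');
    assert('\000' == '\0');
    assert('\x0' == '\0');

    /* Octal 101 and hexadecimal 41 are both decimal 65,
     * which is 'A' in ASCII. */
    assert('\101' == 65);
    assert('\x41' == 65);
    assert('\101' == '\x41');

    return 1;
}
```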

This is sometimes useful when you need to build a string containing non-printable characters or special control characters.

There is also a set of one-character "named" special characters, such as '\n' for a new line, '\t' for TAB, etc.


These escapes are used to write characters that cannot be typed or displayed in the editor. For ordinary char, these are mostly control characters; for wchar_t, they may be characters that the editor's font cannot represent.

For example, this compiles in Visual Studio 2005:

  const wchar_t bom = L'\xfeff';      /* Unicode byte-order mark (U+FEFF) */
  const wchar_t hamza = L'\x0621';    /* Arabic Letter Hamza */
  const char start_of_text = '\002';  /* Start-of-text */
  const char end_of_text = '\003';    /* End-of-text */

Edit: octal escapes have an interesting caveat. An octal escape can have at most three digits, which artificially limits the characters that we can enter.

For example:

  /* Letter schwa: capital is Unicode code point 0x018f (octal 0617),
   * small is Unicode code point 0x0259 (octal 1131) */
  const wchar_t Schwa2 = L'\x18f';  /* capital letter Schwa, correct */
  const wchar_t Schwa1 = L'\617';   /* capital letter Schwa, correct */
  const wchar_t schwa1 = L'\x259';  /* small letter schwa, correct */
  const wchar_t schwa2 = L'\1131';  /* letter K (octal 113), incorrect */
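The three-digit limit is easier to see in a narrow string literal, where the leftover digit simply becomes a second character. This is a minimal sketch; the names two_chars and check_octal_limit are made up for illustration, and the 'K' remark assumes ASCII.

```c
#include <assert.h>
#include <string.h>

/* "\1131" is not one character with octal value 1131: the escape stops
 * after three octal digits, so this is '\113' followed by the digit '1'. */
static const char two_chars[] = "\1131";

/* Hypothetical helper (illustration only): returns 1 if the literal was
 * split as described; a failing assert aborts the program. */
static int check_octal_limit(void)
{
    assert(strlen(two_chars) == 2);
    assert(two_chars[0] == '\113'); /* decimal 75, 'K' in ASCII */
    assert(two_chars[1] == '1');
    return 1;
}
```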

Octal is base 8 (using digits 0-7), so each digit represents 3 bits:

\354 = 11 101 100

Hexadecimal is base 16 (using the digits 0-9 and A-F), so each digit represents 4 bits:

\x23 = 0010 0011
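The digit-to-bits grouping can be checked by rebuilding each value from its digits with shifts. This is only a sketch; check_bit_patterns is a name invented for the example, and the cast to unsigned char is there because plain char may be signed.

```c
#include <assert.h>

/* Hypothetical helper (illustration only): returns 1 if both escapes
 * match the bit patterns built digit by digit. */
static int check_bit_patterns(void)
{
    /* \354: octal digits 3, 5, 4 -> 11 101 100 -> 0xEC (decimal 236). */
    const unsigned char oct = (unsigned char)'\354';
    assert(oct == 0xEC);
    assert(oct == ((3u << 6) | (5u << 3) | 4u)); /* 3 bits per octal digit */

    /* \x23: hex digits 2, 3 -> 0010 0011 (decimal 35). */
    const unsigned char hex = (unsigned char)'\x23';
    assert(hex == 0x23);
    assert(hex == ((2u << 4) | 3u));             /* 4 bits per hex digit */

    return 1;
}
```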

Inside C strings (char arrays / pointers), they are usually used to encode bytes that cannot be easily represented.

So, if you need a string that uses ASCII codes such as STX and ETX, you can do:

 const char *msg = "\x02Here my message\x03"; 
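One caveat with this pattern: a hexadecimal escape consumes every hex digit that follows it, so a payload beginning with a letter from A to F (or a digit) would be absorbed into the escape, and compilers typically diagnose the resulting out-of-range value. The message above is safe only because 'H' is not a hex digit. A common workaround, sketched here with made-up names, is to split the literal and let the compiler concatenate the adjacent pieces.

```c
#include <assert.h>
#include <string.h>

/* Splitting the literal ends each \x escape explicitly; adjacent string
 * literals are concatenated after the escapes have been processed. */
static const char framed[] = "\x02" "ABC" "\x03";

/* Hypothetical helper (illustration only): returns 1 if the framed
 * message has the layout STX, 'A', 'B', 'C', ETX. */
static int check_framed_message(void)
{
    assert(strlen(framed) == 5);
    assert(framed[0] == 2);   /* STX */
    assert(framed[1] == 'A');
    assert(framed[3] == 'C');
    assert(framed[4] == 3);   /* ETX */
    return 1;
}
```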