How does "for (; *p; ++p) *p = tolower(*p);" work in C?

I'm new to programming and I'm just wondering why this code:

for ( ; *p; ++p) *p = tolower(*p); 

works to lowercase a string in C when p points to a string?

2 answers

To unpack this, suppose p is a pointer to char that, just before the for loop, points to the first character of the string.

In C, strings are typically modeled as a sequence of contiguous char values with a 0 appended at the end, which acts as the null terminator.

*p evaluates to 0 once p reaches the null terminator of the string, and the for loop then exits. (The second expression in a for loop is the condition tested before each iteration; the loop ends when it is zero.)

++p advances p to the next character in the string.

*p = tolower(*p) converts the current character to lowercase.
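As a rough sketch of the steps above, the same loop can be written out as a while loop inside a function. The name lower_in_place is hypothetical and only for illustration; the cast to unsigned char is a precaution so that tolower never receives a negative value.

```c
#include <ctype.h>

/* Hypothetical helper: the one-line for loop, spelled out step by step.
   Assumes p points to a modifiable, null-terminated string. */
void lower_in_place(char *p)
{
    while (*p != '\0') {                        /* condition: stop at the null terminator */
        /* body: replace the current character with its lowercase form;
           the cast keeps the argument passed to tolower non-negative */
        *p = (char)tolower((unsigned char)*p);
        ++p;                                    /* step: move on to the next character */
    }
}
```

Calling lower_in_place on a char array initialized with "Hello, World!" should leave "hello, world!" in the array. The argument must be a writable array, not a string literal.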


In general, this code:

 for ( ; *p; ++p) *p = tolower(*p); 

does not "work to lowercase a string in C when p points to a string".

It works for pure ASCII, but since char is usually a signed type, and since tolower requires a non-negative argument (except for the special value EOF), in general the code has Undefined Behavior.
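The range requirement can be sketched as a simple check. valid_tolower_arg is a hypothetical name used only for illustration, not a library function; it just restates the rule that the argument must be EOF or representable as an unsigned char.

```c
#include <limits.h>
#include <stdio.h>   /* for EOF */

/* Hypothetical helper: nonzero if v is a value for which tolower() is defined,
   i.e. EOF or something in the range 0..UCHAR_MAX. */
int valid_tolower_arg(int v)
{
    return v == EOF || (v >= 0 && v <= UCHAR_MAX);
}
```

On a typical platform where char is a signed 8-bit type, the Latin-1 byte 0xE9 stored in a char has the value -23, which this check rejects.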

To avoid this, cast the argument to unsigned char, for example:

 for ( ; *p; ++p) *p = tolower( (unsigned char)*p ); 

Now it can work for single-byte encodings such as Latin-1, provided that you set the correct locale via setlocale, e.g. setlocale( LC_ALL, "" );. However, note that the very common UTF-8 encoding is not one byte per character. To work with UTF-8 text, you can convert it to a wide string and lowercase that.
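A minimal sketch of the wide-string route, under some assumptions: lower_multibyte is a hypothetical helper name, the caller has already selected a suitable locale with setlocale, and the result depends on what that locale's towlower knows about each character.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: lowercase the multibyte string src into dst by going
   through a temporary wide-character buffer.
   Returns 0 on success, -1 if conversion fails or dst is too small. */
int lower_multibyte(const char *src, char *dst, size_t dst_size)
{
    size_t len = strlen(src);
    /* A string of len bytes never needs more than len wide characters. */
    wchar_t *w = malloc((len + 1) * sizeof *w);
    if (w == NULL)
        return -1;

    if (mbstowcs(w, src, len + 1) == (size_t)-1) {   /* invalid byte sequence */
        free(w);
        return -1;
    }

    for (wchar_t *q = w; *q != L'\0'; ++q)           /* same loop shape, wide version */
        *q = (wchar_t)towlower((wint_t)*q);

    size_t m = wcstombs(dst, w, dst_size);           /* back to multibyte */
    free(w);
    if (m == (size_t)-1 || m >= dst_size)            /* unconvertible, or no room for '\0' */
        return -1;
    return 0;
}
```

With a UTF-8 locale in effect, passing a UTF-8 string and a sufficiently large output buffer should yield the lowercased text in the same encoding; in the default "C" locale only ASCII input is reliably handled.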


More details:

  • *p is an expression that denotes the object pointed to by p, presumably a char.

  • As the continuation condition of the for loop, any non-zero char value denoted by *p counts as logical True, and the zero char value at the end of the string counts as logical False, which ends the loop.

  • ++p advances the pointer so that it points to the next char.
