Does `printf("%.-1s\n", "foo")` invoke undefined behavior?

According to the standard:

Each conversion specification is introduced by the character %. After the %, the following appear in sequence:

  • Zero or more flags [...].
  • An optional minimum field width. [...]
  • An optional precision that gives [...] the maximum number of bytes to be written for s conversions. The precision takes the form of a period (.) followed by an [...] optional decimal integer; [...]
  • An optional length modifier [...].
  • A conversion specifier character [...].

Later:

A negative precision argument is taken as if the precision were omitted.

Here is what I would expect from printf("%.-1s\n", "foo") according to my interpretation of the standard:

The second quote, taken from the standard, suggests that we can pass a negative precision and that such a precision will be ignored.

So printf("%.-1s\n", "foo") should be equivalent to printf("%s\n", "foo"), which prints "foo\n" and returns 4.

However, here is the actual behavior of printf("%.-1s\n", "foo") on the system I used (OS X):

printf("%.-1s\n", "foo") displays " \n" and returns 2 .

This is clearly different from what I expected.
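
A minimal test program contrasting the two calls (the second call's behavior is implementation dependent, as the answers below discuss; the values in the comments are those reported above for OS X):

    #include <stdio.h>

    int main(void)
    {
        int expected = printf("%s\n", "foo");    /* prints "foo\n", returns 4 */
        int actual = printf("%.-1s\n", "foo");   /* on OS X: prints " \n", returns 2 */

        printf("expected %d, got %d\n", expected, actual);
        return 0;
    }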

  • Is my interpretation of the standard somehow wrong?
  • Is this behavior undefined?
  • Is it possible to pass a negative precision (edit: without an asterisk)?
Tags: c, undefined-behavior, standards, language-lawyer, printf
3 answers
  • Is my interpretation of the standard somehow wrong?

I take your interpretation to be the following:

So printf("%.-1s\n", "foo") should be equivalent to printf("%s\n", "foo"), which prints "foo\n" and returns 4.

No. Your quote about negative precision arguments being ignored does not apply to this case. That statement refers to the possibility of specifying the precision as * in the format string and passing the value as a separate printf argument:

 printf("%.*s\n", -1, "foo"); 

In that case, the negative precision argument causes printf() to behave as if no precision had been specified. Your case is different.
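
A compilable sketch of the distinction (only the first call's behavior is pinned down by the quoted rule; the second is the contested case):

    #include <stdio.h>

    int main(void)
    {
        /* Covered by the rule: a negative precision argument passed for
           '*' is taken as if the precision were omitted, so this prints
           "foo". */
        printf("%.*s\n", -1, "foo");

        /* Not covered: here the negative precision is written literally
           in the format string, so the rule above does not apply. */
        printf("%.-1s\n", "foo");
        return 0;
    }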

On the other hand, the standard does not here require the precision value appearing in the format string to be a non-negative decimal integer. It does qualify the term "decimal integer" that way in several other places, including elsewhere in the same section, but it does not do so in the paragraph on the precision field.

  • Is this behavior undefined?

No. There are two conflicting interpretations of the required semantics (see below), but either way the standard defines the behavior. It can be interpreted as meaning either that

  • the behavior described for a negative precision argument also applies when a negative precision value is presented directly in the format string. This has the advantage of consistency, but it is not the behavior you report. Alternatively,

  • a literal reading of the standard indicates that when the precision is presented as a negative decimal integer in the format string, the ordinary semantics described in that section apply; for s directives, this would mean that the negative precision expresses the maximum number of bytes to be written.

The behavior you observe is incompatible with the former interpretation, but given the practical difficulty of writing fewer than 0 bytes, it comes as little surprise to me that the latter interpretation was not implemented successfully. I am inclined to guess that the latter is what your implementation is attempting to implement.

I suspect that leaving open the possibility of giving a negative value in the precision field was an unintentional oversight at some stage, but intentional or not, the standard does seem to allow it.


N1570 §7.21.6.1/p5:

As noted above, a field width, or precision, or both, may be indicated by an asterisk. In this case, an int argument supplies the field width or precision. The arguments specifying field width, or precision, or both, shall appear (in that order) before the argument (if any) to be converted. A negative field width argument is taken as a - flag followed by a positive field width. A negative precision argument is taken as if the precision were omitted.

The standard indicates that this applies only when an asterisk is used as the precision in the format string and a negative value is passed as the corresponding argument, as below:

 printf("%.*s\n", -1, "foo"); // -1 will be ignored 

The fourth paragraph says:

[...] The precision takes the form of a period (.) followed either by an asterisk * (described later) or by an optional decimal integer; [...]

but it does not specifically say whether the decimal integer must be non-negative (as is said of the scanf field width in §7.21.6.2/p3, which must be greater than zero). The standard seems ambiguous on this point, and the result may be implementation dependent.


In a format like "%-5d", the width is not -5; rather, the - character is a flag character indicating that the value should be left-justified in a field of the given width. The use of "non-negative" for the width stems from the fact that the - character is a flag, not a sign. Although the Standard does not say that the precision must be non-negative, it is hard to imagine any useful purpose that would be served by requiring an implementation that encounters a - between the period and some decimal digits to ignore the content of those digits. It is possible that some implementations handle such things that way, but more likely many implementations have no code to explicitly handle a - in that position and will either treat it the same as a - at the start of the conversion specification (i.e., as a flag) or treat it like any other character with no assigned meaning, depending on whichever is more convenient. I see no reason to consider such behavior "defective".
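
If an implementation does lump that stray - in with the flags, "%.-1s" would parse as an empty (hence zero) precision, a - flag, and a field width of 1. A speculative sketch of that spelled-out equivalent, which reproduces the " \n"/return-2 behavior the question reports:

    #include <stdio.h>

    int main(void)
    {
        /* Hypothetical re-parse of "%.-1s": precision 0 (empty digit
           string after the period), '-' as the left-justify flag, and a
           field width of 1. Prints a single space then '\n' and returns
           2, matching the reported OS X behavior. */
        int n = printf("%-1.0s\n", "foo");
        printf("return value: %d\n", n);
        return 0;
    }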

