Multi-line comments and newlines

In C++11, the standard says in 2.2.3:

Each comment is replaced by a single space. Newline characters are retained.

  • Do these two sentences apply in sequence, meaning that a newline is retained only for a comment that ends with a newline?

  • If the first point is true, then why do Visual C++, gcc, and clang keep an empty line for each line of a multi-line comment?

These questions are important because I am writing a C++ preprocessor.

+4
2 answers

The newlines in question are those that still exist after each comment has been replaced with a single space character. This is clearer when the sentence is read in the wider context of the paragraph that contains it.

Thus, newlines inside multi-line comments are not retained and do not terminate preprocessing directives.
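A small sketch of that last point, in case it helps: the macro name ANSWER and the function answer() are made up for illustration. The block comment contains a newline, but since the comment is replaced by one space before directives are processed, the #define should still see 42 as its body.

```cpp
// The block comment below contains a newline, yet the directive continues
// past it: the whole comment becomes one space before directives are parsed.
#define ANSWER /* this comment starts here
                  and ends on the next line */ 42

// Compile-time check that the macro body is still the token 42.
static_assert(ANSWER == 42, "the comment did not end the #define");

// Hypothetical helper so the value can also be inspected at run time.
int answer() { return ANSWER; }
```

If the newline inside the comment were retained as a directive terminator, ANSWER would expand to nothing and the static_assert would not compile.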

+2

The C/C++ preprocessor strips all comments, but it usually keeps the original line layout, so every line has the same line number when you look at the output of the preprocessor.

This means that the compiler that reads the output of the preprocessor can print the correct line numbers for error messages and warnings.

Preprocessors usually also keep all empty lines as they are.
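Since the question is about writing a preprocessor, here is a rough sketch of the two behaviours. The function strip_comments and its keep_line_count flag are invented for this example and are not how any real compiler does it. It deliberately ignores string and character literals, raw strings, and backslash line splicing, all of which a real implementation has to handle first.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces each comment with one space.
// With keep_line_count == true, the newlines that were inside a block
// comment are emitted again after the space, so later lines keep their
// original line numbers in the textual output.
std::string strip_comments(const std::string& src, bool keep_line_count) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src.compare(i, 2, "/*") == 0) {
            // Block comment: find the terminator (or run to end of input).
            std::size_t end = src.find("*/", i + 2);
            std::size_t stop = (end == std::string::npos) ? src.size() : end + 2;
            out += ' ';
            if (keep_line_count) {
                for (std::size_t k = i; k < stop; ++k)
                    if (src[k] == '\n') out += '\n';
            }
            i = stop;
        } else if (src.compare(i, 2, "//") == 0) {
            // Line comment: stop before the newline, which is not part of it.
            std::size_t end = src.find('\n', i);
            out += ' ';
            i = (end == std::string::npos) ? src.size() : end;
        } else {
            out += src[i++];
        }
    }
    return out;
}
```

With the input "a/* x\ny */b\nc", passing false gives "a b\nc" (the literal reading of the standard), and passing true gives "a \nb\nc", which keeps c on line 3 as it was in the source.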

You also need to distinguish strictly between removing a multi-line macro definition from the source and expanding that macro. The definition is always removed, with all of its line feeds retained. When the macro is expanded, it is always replaced by all of its lines. These are completely independent operations that have nothing to do with each other.

In the old days, the C preprocessor always wrote its output to stdout, and the C compiler read it from stdin. The preprocessor emits internal directives of the form #<N> "<FILE>", which the C compiler interprets as "line number N". Thus, the preprocessor could theoretically do without emitting empty lines in its output. But in practice, this #<N> "<FILE>" feature is used only for the lines following #include directives.
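The exact marker format in preprocessor output is implementation-specific, but the standard #line directive has the same effect at the source level, so it can be used to see the idea in action. This is only an illustration: the file name virtual_file.cpp and the two functions are made up.

```cpp
#include <string>

// From here on, the compiler numbers lines as if the next source line
// were line 100 of a file named "virtual_file.cpp".
#line 100 "virtual_file.cpp"
int line_here() { return __LINE__; }          // reported as line 100
std::string file_here() { return __FILE__; }  // reported file name
```

A preprocessor that emits such markers does not need to pad its output with empty lines to keep diagnostics pointing at the right place.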

Today, the preprocessor is built into the C compiler for performance, but you can view the intermediate result if explicitly requested.

Note. See also a good comment below. The standard does not specify what the textual output of the preprocessor looks like with respect to whitespace. Textual output is implementation-specific, and there is plenty of room for interpretation. What is fixed is where at least one whitespace character must appear, and that all tokens stay on their source lines (or are marked with their source line) so that error messages make sense.

+2