Multi-line macros without backslash-newlines
Since comments are replaced by spaces in translation phase 3:
- The source file is decomposed into preprocessing tokens and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment. Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined.
and the preprocessor runs as phase 4:
- Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. If a character sequence that matches the syntax of a universal character name is produced by token concatenation (6.10.3.3), the behavior is undefined. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted.
it is possible, but absurd, to write a multi-line macro as follows:
    #include <stdio.h>
    #define possible_but_absurd(a, b) /* comments
    */ printf("are translated"); /* in phase 3
    */ printf(" before phase %d", a); /* (the preprocessor)
    */ printf(" is run (%s)\n", b); /* but why abuse the system? */

    int main(void)
    {
        printf("%s %s", "Macros can be continued without backslashes", "because comments\n");
        possible_but_absurd(4, "ISO/IEC 9899:2011,\nSection 5.1.1.2"
                               " Translation phases");
        return 0;
    }
which, when run, prints:
    Macros can be continued without backslashes because comments
    are translated before phase 4 is run (ISO/IEC 9899:2011,
    Section 5.1.1.2 Translation phases)
Backslash-newline in macro definitions
Translation phases 1 and 2 are also relevant:
- Physical source file multibyte characters are mapped, in an implementation-defined manner, to the source character set (introducing new-line characters for end-of-line indicators) if necessary. Trigraph sequences are replaced by corresponding single-character internal representations.
Trigraph replacement is nominally relevant because ??/ is the trigraph for a backslash.
- Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place.
This tells you that by the time phase 4 (the preprocessor) runs, each macro definition is on a single (logical) line: the trailing backslash-newline combinations have already been deleted.
The standard notes that the phases are 'as if': the compiler must behave as if it went through the separate phases, but many implementations do not formally separate them completely.
Avoiding the GCC extension
The extended example (quoted from the GCC manual) has the macro invocation spread over many lines, but the definition is strictly on one line. (This much is not a GCC extension, but standard behavior.)
Note that if you're remotely sane, you'll ignore the possibility of putting preprocessing directives within a macro invocation (the #undef and #define in the example). That is a GCC extension and totally unportable. The standard says that the behavior is undefined.
Annex J.2 Undefined behavior
- The macro argument list contains sequences of preprocessing tokens that would otherwise act as preprocessing directives (6.10.3).