It seems that in the first two definitions, gcc inserts a space after the expanded macro to make sure it is a separate token. Is this really what is going on here?
Yes.
Is this required by any standard (I could not find documentation on this topic)?
Yes, although an implementation is allowed to insert more than one whitespace character to separate the tokens.
f()_bar
Here you have 4 tokens after lexical analysis (strictly speaking they are preprocessing tokens at this stage, but let's call them tokens): f, (, ) and _bar.
The function-like macro replacement semantics (as defined in C11, 6.10.3) must replace the 3 tokens f, (, ) with the new token foo. It is not allowed to touch other tokens, so it cannot change the trailing _bar token. To preserve _bar as a token of its own, the implementation has to insert at least one whitespace character in the preprocessed output. Otherwise the result would be foo_bar, which is a single token.
The gcc preprocessor has some documentation on this:
Once the input file is broken into tokens, the token boundaries never change, except when the '##' preprocessing operator is used to paste tokens together. See Concatenation. For example,
    #define foo() bar
    foo()baz
         ==> bar baz
    not
         ==> barbaz
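A small way to see this rule from inside a C program, as a sketch only: the macro name type and the variable name value are made up for illustration and do not come from the question. The declaration compiles only because the expansion stays two tokens.

```c
#include <assert.h>

/* Hypothetical macro for illustration: expands to the single token `int`. */
#define type() int

int demo_separate_tokens(void)
{
    /* There is no whitespace between `)` and `value` in the source, yet the
       expansion is the two tokens `int` and `value`, i.e. `int value = 42;`.
       If they were merged into `intvalue`, this line would not compile. */
    type()value = 42;
    assert(value == 42);
    return value;
}
```

Calling demo_separate_tokens() from main is enough to run the check. Whether the separating space shows up in the text output of gcc -E is a matter for the preprocessor's printer; the token boundary itself exists either way.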
In the other case, for example f()-bar, there are 5 tokens: f, (, ), - and bar. (- is a punctuator token in C, whereas the _ in _bar is just a character of the identifier token.) The implementation does not have to insert a token separator (such as whitespace) here, because after macro replacement - and bar are still two separate tokens under the C syntax.
The gcc preprocessor (cpp) does not insert whitespace here simply because it does not need to. The cpp documentation on token spacing says (about a different issue):
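To make the punctuator case concrete, here is a minimal sketch. The variables foo and bar and the value 7 are illustrative assumptions, not part of the question.

```c
#include <assert.h>

/* Hypothetical macro for illustration: f() is replaced by the token `foo`. */
#define f() foo

int demo_minus(void)
{
    int foo = 10;
    int bar = 3;
    /* `f()-bar` is lexed as f ( ) - bar. After replacement the tokens are
       foo - bar, so this is an ordinary subtraction. */
    int result = f()-bar;
    assert(result == 7);
    return result;
}
```

No separator is needed in the text output here, because foo-bar re-lexes into the same three tokens.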
However, we would like to keep space insertion to a minimum, both for aesthetic reasons and because it causes problems for people who still try to abuse the preprocessor for things like Fortran source and Makefiles.
I have not addressed a solution to your problem in this answer, but I think you should use the operator explicitly specified for combining tokens: the ## token-pasting operator.
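A rough sketch of what that could look like, assuming the goal is to get the single identifier foo_bar out of f() and _bar. The helper names PASTE and PASTE_EXPANDED are mine, not anything standard.

```c
#include <assert.h>

/* Hypothetical macros for illustration. */
#define f() foo
#define PASTE(a, b) a##b
/* Extra level of indirection: the arguments are macro-expanded here,
   before PASTE applies ## to them. */
#define PASTE_EXPANDED(a, b) PASTE(a, b)

int demo_paste(void)
{
    int foo_bar = 99;
    /* f() is expanded to foo first, then foo ## _bar forms the single
       identifier token foo_bar. */
    int result = PASTE_EXPANDED(f(), _bar);
    assert(result == 99);
    return result;
}
```

The indirection matters: operands of ## are not macro-expanded, so PASTE(f(), _bar) used directly would try to paste ) with _bar, which does not form a valid token.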