Why does the C/C++ preprocessor add a space here?

I have a small problem with the preprocessor that puzzles me, and I cannot find an explanation for it in the preprocessor documentation or the language specification.

 #define booboo() aaa
 booboo()bbb
 booboo().bbb

is preprocessed into:

 aaa bbb      <--- why is a space added here?
 aaa.bbb

After trigraphs, continued lines, and comments have been processed, the preprocessor handles the preprocessor directives and splits the input into preprocessing tokens and whitespace. The replacement list of booboo contains one pp-token, the identifier "aaa". booboo()bbb is split into the pp-tokens "booboo", "(", ")", "bbb". The sequence "booboo", "(", ")" is recognized as a function-like macro invocation and should be expanded to "aaa", so IMHO the output should look like aaabbb. I say "look like" because to a person it would appear to be a single token, while the compiler would still receive the two tokens "aaa" and "bbb", since the ## operator, which allows pp-tokens to be concatenated, was not used. Why, and by what rule, does cpp (the C preprocessor) put an extra space between "aaa" and "bbb", when booboo().bbb results in aaa.bbb without a space?

Is it because cpp tries to make its textual output (which is mostly for people) unambiguous? A person could not tell that "aaabbb" consists of two tokens, since they see only the spelling. Am I right? I have read the preprocessor sections of the C99 standard and the GCC documentation for cpp, and I don't see anything about it.
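For comparison, here is a minimal sketch of the ## case mentioned above, where a single identifier token really is produced. PASTE and XPASTE are hypothetical helper names, not part of the original code; the extra level of indirection is only there so that booboo() is expanded before the paste happens.

```c
#include <assert.h>

#define booboo() aaa
#define PASTE(a, b) a##b           /* hypothetical helper: pastes two pp-tokens */
#define XPASTE(a, b) PASTE(a, b)   /* hypothetical helper: expands arguments first */

/* Returns the value read through the single pasted identifier aaabbb. */
int pasted_identifier(void)
{
    int aaabbb = 5;
    /* booboo() expands to aaa, then aaa ## bbb forms the one token aaabbb. */
    int value = XPASTE(booboo(), bbb);
    assert(value == aaabbb);
    return value;
}
```

Without ##, the two identifiers stay two separate tokens, whatever the textual output looks like.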

If I am right, we have a similar situation:

 #define baba() +
 baba()+
 baba()-

leads to:

 + +
 +-

Otherwise (if "++" were the output), it would look to a person like the single token "++", although there would actually be two tokens, "+" and "+". With the ## operator, cpp apparently checks whether the concatenation yields a valid token, but in the cases shown it seems to want to keep a person from performing this concatenation mentally? "+-" is not ambiguous, so no space is added.
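A small sketch of why the distinction matters to the compiler, assuming a variable x that is not in the original example: if the two "+" tokens were merged into "++", x would be incremented, and that does not happen.

```c
#include <assert.h>

#define baba() +

/* baba()+x is the token sequence + + x, i.e. +(+x), not ++x. */
int two_plus_tokens(void)
{
    int x = 1;
    int y = baba()+x;
    assert(x == 1);   /* x was not incremented */
    return y;
}

/* baba()-x is the token sequence + - x, i.e. +(-x). */
int plus_minus_tokens(void)
{
    int x = 1;
    return baba()-x;
}
```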

2 answers

The result of preprocessing is the conversion of the source file into a list of tokens. In your case, the token list looks like this after tokenization:

 .... booboo() bbb .... 

and then after replacing the macro:

 .... aaa bbb .... 

The compiler then translates the list of tokens into an executable file.

The spaces you see are just an implementation detail of how your compiler chose to display the preprocessing tokens when showing an intermediate result. The standards say nothing about intermediate preprocessed files. A separate preprocessor program is not even required.
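As a rough illustration that the compiler works on the token list and not on the displayed text, the sketch below compiles both expansions from the question. The typedef and the struct pair are assumptions added only so that "aaa bbb" and "aaa.bbb" become valid C; they are not part of the original code.

```c
#include <assert.h>

#define booboo() aaa

struct pair { int bbb; };   /* hypothetical type, only for the member access */

/* Token list: aaa bbb = 7 ;  which declares a variable bbb of type aaa. */
int two_tokens_declaration(void)
{
    typedef int aaa;        /* hypothetical: makes "aaa bbb" a declaration */
    booboo()bbb = 7;
    assert(sizeof bbb == sizeof(int));
    return bbb;
}

/* Token list: aaa . bbb  which reads the member bbb of the variable aaa. */
int member_access(void)
{
    struct pair aaa = { 42 };
    return booboo().bbb;
}
```

If booboo()bbb produced the single identifier aaabbb, the first function would not compile, because no such name is declared.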


I wrote an ANSI C compiler myself in the early 90s. As far as I remember, a comment token /* ... */ must be replaced with a single space. Macros replace text in place. The tokens resulting from such macro expansion(s) need not be legal C lexemes. When a macro is defined as the text "aaa", the text "aaa" is simply spliced into the input stream. The C parser may or may not see valid tokens as a result!

Therefore, this:

 #define booboo() aaa

Expanding booboo()bbb should result in the text aaabbb.

What this aaabbb means is up to the user. But aaabbb will not be preprocessed again, even if it is a macro name. That's for sure. But aaabbb can be a user-defined identifier, no problem.
