What does it mean that the language of preprocessor directives is loosely related to C grammar?

The Wikipedia article on the C preprocessor says:

The preprocessor directive language is loosely coupled to C grammar, so it is sometimes used to process other types of text files.

How does the preprocessor language differ from the C grammar? What are the benefits? Is the C preprocessor used for other languages or purposes?

Can it be used to differentiate built-in functions and macros, since built-in functions have the syntax of a normal C function, while macros use a slightly different grammar?

+8
c c-preprocessor
3 answers

Wikipedia is not an authoritative source for the C programming language. The C preprocessor grammar is part of the C grammar. However, it is completely distinct from the phrase structure grammar: the two are not linked at all, except that both take input made up of C language tokens. Even that overlap is not exact, because the preprocessor has the concept of a preprocessing number, which means that something like 123_abc is a legal preprocessing token even though it is neither a valid identifier nor a valid numeric constant.
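The point about preprocessing numbers can be checked with a small sketch. The names STRINGIFY, pp_number_text and check_pp_number are made up for this illustration. The idea is that 123_abc can travel through the preprocessor as a single token (here it is turned into a string), even though the compiler proper would reject it as a constant.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: turns one macro argument into a string literal. */
#define STRINGIFY(x) #x

/* 123_abc is a single pp-number, so it is accepted as a macro argument. */
static const char pp_number_text[] = STRINGIFY(123_abc);

/* By contrast, the declaration below would not compile, because 123_abc
   cannot be converted into a valid C token:
       int bad = 123_abc;
*/

void check_pp_number(void)
{
    /* The spelling of the pp-number survives preprocessing unchanged. */
    assert(strcmp(pp_number_text, "123_abc") == 0);
}
```

Calling check_pp_number() from main is enough to exercise it.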

After preprocessing is complete (directives have been removed, macros expanded, and so on) and before translation with the phrase structure grammar begins, the standard specifies:

Each preprocessing token is converted into a token. (C11 5.1.1.2p1, translation phase 7)


Using the C preprocessor for any other language is really an abuse. The reason is that the preprocessor requires the file to consist of proper C preprocessing tokens. It is not designed to work with other languages. Even C++, with recent extensions such as raw string literals, cannot be preprocessed by a C preprocessor!

Here is an excerpt from the cpp (GNU C preprocessor) manual:

The C preprocessor is intended to be used only with C, C++, and Objective-C source code. In the past, it has been abused as a general text processor. It will choke on input which does not obey C's lexical rules. For example, apostrophes will be interpreted as the beginning of character constants, and cause errors. Also, you cannot rely on it preserving characteristics of the input which are not significant to C-family languages. If a Makefile is preprocessed, all the hard tabs will be removed, and the Makefile will not work.

+23

The preprocessor produces preprocessing tokens, which are later converted to C tokens.

The conversion is usually direct, but not always. For example, if you have a conditional preprocessor directive that evaluates to false, as in

    #if 0
    comments
    #endif

then in place of comments you can write almost anything you like. It is split into preprocessing tokens that are never converted to C tokens, so a C source file can contain text that is not valid C without putting it inside a comment.
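A minimal sketch of a skipped group follows. The function names are made up for the illustration, and the free text deliberately avoids apostrophes and quotes, since even a skipped group still has to split into preprocessing tokens.

```c
#include <assert.h>

#if 0
This group is never compiled, so it need not be C at all.
It only has to split into preprocessing tokens: 4MD, @, $ and so on.
#endif

/* Hypothetical function: the only code here that the compiler proper sees. */
int after_skipped_group(void)
{
    return 1;
}

void check_skipped_group(void)
{
    /* Getting this far means the free text above did not break translation. */
    assert(after_skipped_group() == 1);
}
```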

The only connection between the preprocessor language and C is that many tokens are defined almost identically in both, but not all of them.

For example, preprocessing numbers (called pp-numbers in ISO 9899) such as 4MD are valid for the preprocessor but are not valid C numbers. With the ## operator you can build a valid C identifier out of such a preprocessing number. Because the operands of ## are not macro-expanded, the pasting has to go through a helper macro (CAT here) so that name and version are expanded first. For example:

    #define version 4A
    #define name TEST_
    #define CAT(x, y) x##y
    #define VERSION(x, y) CAT(x, y)
    VERSION(name, version)   /* expands to TEST_4A, a valid C identifier */
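One detail matters when the pieces being pasted are themselves macros: operands of ## are not macro-expanded before pasting, so an extra macro level is commonly used to expand them first. The sketch below illustrates the difference. PREFIX, SUFFIX, CAT, PASTE and the two variables are names invented for the illustration.

```c
#include <assert.h>

#define SUFFIX 4A                /* a pp-number, not a valid C constant */
#define PREFIX TEST_
#define CAT(x, y) x##y           /* pastes its arguments exactly as written */
#define PASTE(x, y) CAT(x, y)    /* expands the arguments first, then pastes */

int TEST_4A = 42;        /* the identifier PASTE(PREFIX, SUFFIX) produces */
int PREFIXSUFFIX = 7;    /* the identifier CAT(PREFIX, SUFFIX) produces */

int pasted_after_expansion(void) { return PASTE(PREFIX, SUFFIX); }
int pasted_directly(void)        { return CAT(PREFIX, SUFFIX); }

void check_pasting(void)
{
    assert(pasted_after_expansion() == 42);  /* refers to TEST_4A */
    assert(pasted_directly() == 7);          /* refers to PREFIXSUFFIX */
}
```

In both cases the result of the paste is an ordinary identifier, so the pp-number 4A never has to become a C number on its own.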

The preprocessor was designed as a general text-translation tool that could be applied to any language, not specifically to C. In C, it is mainly useful for keeping a clear separation between interfaces and implementations.

+2

The controlling expressions of C preprocessor conditionals are valid C expressions, so the preprocessor and the C language itself are closely related.

    #define A (6)
    #if A > 5
    Here is a 6
    #elif A < 0
    # error
    #endif

This expands to meaningless C, but it can be meaningful text:

    Here is a 6
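The same conditional can be wrapped in compilable C to see which branch the preprocessor picks. This is only a sketch: BRANCH_TAKEN, branch_taken and check_branch are names invented here, and an #else branch has been added so that the macro is always defined.

```c
#include <assert.h>
#include <string.h>

#define A (6)

/* The preprocessor evaluates A > 5 itself, using C expression rules. */
#if A > 5
#define BRANCH_TAKEN "A > 5"
#elif A < 0
#error "A must not be negative"
#else
#define BRANCH_TAKEN "0 <= A <= 5"
#endif

const char *branch_taken(void)
{
    return BRANCH_TAKEN;
}

void check_branch(void)
{
    /* With A defined as (6), the first group is the one that survives. */
    assert(strcmp(branch_taken(), "A > 5") == 0);
}
```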

Although the expanded text is not valid C, the preprocessor applies C's expression rules to evaluate the conditions. The C standard defines this in terms of a constant expression:

From the C99 standard, §6.10.1 and §6.6:

6.10.1 Conditional inclusion

Preprocessing directives of the forms

# if constant-expression new-line group opt

# elif constant-expression new-line group opt

check whether the controlling constant expression evaluates to nonzero.

And here is the definition of a constant expression:

6.6 Constant expressions

Syntax

 constant-expression: conditional-expression

Description A constant expression can be evaluated during translation rather than runtime, and accordingly may be used in any place that a constant may be.

Constraints Constant expressions shall not contain assignment, increment, decrement, function-call, or comma operators, except when they are contained within a subexpression that is not evaluated.

Each constant expression shall evaluate to a constant that is in the range of representable values for its type.

Given the above, it is clear that the preprocessor requires a limited form of C expression evaluation, and therefore knowledge of C's type system, grammar, and expression semantics.
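A rough sketch of what "limited form" means in practice is shown below. All macro and function names are made up for the illustration. It relies on two rules of #if evaluation: identifiers that remain after macro expansion are replaced by 0, and the usual integer operators behave as in C. Operators that need type information, such as sizeof and casts, are not available in #if.

```c
#include <assert.h>

/* An identifier that is not a macro is replaced by 0 inside #if.
   (Some compilers can warn about this when asked to.) */
#if NOT_DEFINED_ANYWHERE == 0
#define UNDEFINED_IS_ZERO 1
#else
#define UNDEFINED_IS_ZERO 0
#endif

/* Integer arithmetic, shifts, comparisons and ?: work as in C. */
#if (7 / 2 == 3) && ((1 << 4) == 16) && (2 > 1 ? 1 : 0)
#define ARITHMETIC_OK 1
#else
#define ARITHMETIC_OK 0
#endif

/* The defined operator exists only in the preprocessor, not in C proper. */
#if defined(ARITHMETIC_OK) && !defined(NOT_DEFINED_ANYWHERE)
#define DEFINED_OK 1
#else
#define DEFINED_OK 0
#endif

/* Not possible: #if sizeof(int) == 4
   The preprocessor knows nothing about types, so sizeof is rejected here. */

int undefined_is_zero(void) { return UNDEFINED_IS_ZERO; }
int arithmetic_ok(void)     { return ARITHMETIC_OK; }
int defined_ok(void)        { return DEFINED_OK; }

void check_if_evaluation(void)
{
    assert(undefined_is_zero() == 1);
    assert(arithmetic_ok() == 1);
    assert(defined_ok() == 1);
}
```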

0