What are the lengths and limits of the C/C++ preprocessor as a tool for creating a language? Where can I find out more about this?

In his FAQ, Bjarne Stroustrup says:

To build [Cfront, the first C++ compiler], I first used C to write a "C with Classes"-to-C preprocessor. "C with Classes" was a C dialect that became the immediate ancestor of C++ ... I then wrote the first version of Cfront in "C with Classes".

When I read this, it sparked my interest in the C preprocessor. I had seen its macro capabilities as suitable for simplifying common expressions, but had not thought about its ability to add significant syntax and semantics at the level that I assume it took to bring classes to C.

So now I have a few questions on my mind:

  • Are there any other examples of this approach to bootstrapping a language on top of C?

  • Is the source of Stroustrup's original work available anywhere?

  • Where can I learn more about the specifics of using this technique?

  • What are the lengths and limits of this approach? Is it possible, for example, to create a set of preprocessor macros that lets someone write something significantly Lisp/Scheme-like?

7 answers

For an example of the kind of monstrous "language" you can create using the C preprocessor, look at this header file:

http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh/mac.h

This is from the source code of the original Unix shell written by Steve Bourne, and it aims to turn C into an ALGOL-like language. Here is an example of what a piece of code looks like when using it:

http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh/args.c

It looks strange, but it is still C. It may look like another language, but since it is implemented in the preprocessor, there is no syntactic support for it. For example,

    WHILE foo DO SWITCH .... ENDSW OD

is all very well and compiles nicely, but so does

    WHILE foo DO SWITCH .... OD ENDSW
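
To make this concrete, here is a minimal sketch in the same spirit. The macro definitions are written for illustration and are modeled on mac.h rather than copied from it, and the function names `count_even` and `count_even_misnested` are made up for this example. Both functions compile and behave the same, even though the second closes its loop and its switch in the wrong order, because the preprocessor only pastes in braces and never checks nesting.

```c
/* ALGOL-style keywords, modeled on mac.h for illustration (an assumption,
 * not a verbatim copy of the original header). */
#define WHILE  while (
#define DO     ) {
#define OD     ;}
#define SWITCH switch (
#define IN     ) {
#define ENDSW  }

/* Well-nested: the SWITCH is closed by ENDSW, the WHILE by OD. */
int count_even(int n)
{
    int evens = 0;
    WHILE n > 0 DO
        SWITCH n-- % 2 IN
            case 0: evens++; break;
            default: break;
        ENDSW
    OD
    return evens;
}

/* Misnested: OD closes the SWITCH and ENDSW closes the WHILE.
 * The expansion still has balanced braces, so the compiler accepts it. */
int count_even_misnested(int n)
{
    int evens = 0;
    WHILE n > 0 DO
        SWITCH n-- % 2 IN
            case 0: evens++; break;
            default: break;
        OD
    ENDSW
    return evens;
}
```

Running the file through the preprocessor alone (for example with `cc -E`) shows that both bodies expand to ordinary `while`/`switch` statements that differ only in where a stray semicolon lands.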

Note that Stroustrup does not say that he used the C preprocessor (cpp) to create C with Classes, and he did not. He wrote his own preprocessor in C. And Cfront was a real compiler, not a preprocessor. The C preprocessor is actually extremely unsuitable for language development, since it has no parsing capabilities.


The C preprocessor is not what you are looking for. It cannot add syntax and semantics the way Cfront did.

Cfront was an actual compiler that translated C with Classes, and later C++, to C. It was a preprocessor only in the sense that it ran before the C compiler. I once used a program called f2c to translate FORTRAN 77 code to C code. It worked on the same principle.
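
As a rough illustration of what translating a class to C can involve, here is a hand-written sketch. It is not Cfront's actual output: the `Counter` type, the `Counter_add`/`Counter_get` naming scheme, and the `self` parameter are all assumptions made up for this example. The point is only that producing this requires parsing the class and knowing its members, which is compiler work rather than macro substitution.

```c
/*
 * Hypothetical class-style input (shown as a comment; not valid C):
 *
 *   class Counter {
 *       int value;
 *   public:
 *       void add(int n) { value += n; }
 *       int get() { return value; }
 *   };
 */

/* One plausible plain-C translation: the data members become a struct ... */
struct Counter {
    int value;
};

/* ... and each member function becomes a free function that takes the
 * object explicitly as its first argument. */
void Counter_add(struct Counter *self, int n)
{
    self->value += n;
}

int Counter_get(const struct Counter *self)
{
    return self->value;
}
```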

There are languages like Common Lisp with enough macro power to add new syntax and semantics, as well as languages like Forth, where the system is flexible enough to accommodate such changes, but this will not work for most languages.


As mentioned by others, C++ was not created using the C preprocessor (CPP).

However, you can do crazy things with CPP and recursion; I am pretty sure it is Turing complete. The libraries I'm going to link to use a lot of ugly tricks to get interesting behavior. Although you can build a kind of elegance on top, many may consider it a Turing tarpit.

For a gentler introduction to this material, try Cloak.

To go deeper, look at:

  • Boost PP - cross-platform, but uglier; part of the popular C++ library.
  • Chaos - a follow-up by the Boost PP guy, but it only supports C99-compliant tools, ergo more elegant
  • Order - from what I can tell, a Lisp-like language in the vein of Chaos, built on pure CPP

For example, with Order or Chaos you can write a recursive Fibonacci sequence generator in pure CPP.
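
As a small taste of the building blocks these libraries rely on, here is a minimal sketch of a compile-time conditional and table-driven arithmetic built from token pasting. The macro names (`CAT`, `IF_ELSE`, `DEC`, `NOT`) and the `pick_*` functions are chosen for this example and are not the API of Cloak, Boost, Chaos, or Order; the real libraries are far more elaborate.

```c
/* Paste two tokens after one extra expansion step, so that macro
 * arguments are expanded before they are glued together. */
#define CAT(a, b)  CAT_(a, b)
#define CAT_(a, b) a ## b

/* Compile-time conditional: IF_ELSE(1)(x, y) expands to x,
 * IF_ELSE(0)(x, y) expands to y. */
#define IF_ELSE(cond) CAT(IF_ELSE_, cond)
#define IF_ELSE_1(then_part, else_part) then_part
#define IF_ELSE_0(then_part, else_part) else_part

/* "Arithmetic" by lookup table: one macro per supported value. */
#define DEC(n) CAT(DEC_, n)
#define DEC_1 0
#define DEC_2 1
#define DEC_3 2

/* Logical negation of a 0/1 token, again by lookup. */
#define NOT(b) CAT(NOT_, b)
#define NOT_0 1
#define NOT_1 0

/* Each function body is decided entirely by the preprocessor. */
int pick_one(void)      { return IF_ELSE(1)(10, 20); }
int pick_zero(void)     { return IF_ELSE(0)(10, 20); }
int pick_computed(void) { return IF_ELSE(NOT(DEC(1)))(10, 20); }
```

Everything here is resolved before the compiler proper runs: `DEC(1)` becomes `0`, `NOT(0)` becomes `1`, and the conditional then selects its first branch. The lookup tables are also where the ugliness comes from, since every value needs its own `#define`.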


I think Objective-C started in the same way. It was a preprocessor that generated C code, which was then passed to the C compiler. But it was not the C preprocessor in the sense of #define FOO; it ran as an additional step before or after the standard C preprocessor. The result of any number of preprocessing steps can then be sent to the C compiler.


It seems that his "C with Classes"-to-C preprocessor was not the same as the standard C preprocessor, since he talks specifically about writing this preprocessor himself.

The C preprocessor is very limited. It can be used to create shorthands for common expressions, but that is about it. When you try to define new language constructs with it, it quickly becomes cumbersome and fragile.


I suggest you start with the GCC Macros documentation, which contains some pretty interesting information about the GCC implementation of the C preprocessor.

Another answer here (the one linking to Cloak, Boost, Chaos, and Order) gives several examples of using the C preprocessor. The Order language in particular is interesting and comes with a number of examples. The author of Order raises one problem that he or she ran into: C preprocessor implementations may not fully implement the more modern standards.

In general, using the C preprocessor to create some sort of ad hoc language, such as what Steve Bourne did when writing the Bourne shell for Unix, is something I would consider reasonable grounds for execution, followed by several sessions of waterboarding.

The main thing to remember about the C preprocessor is that it manipulates text tokens. Thus, the C preprocessor will let you tinker with the syntax quite a bit. For example, the following macro, which compiles with Visual Studio 2005 without errors, shows the kind of non-intuitive text manipulation that is possible.

    #define TESTOP(a,x,y,op,z) a (x) op (y); z

    void f(void)
    {
        int i = 0, j = 5;

        TESTOP( ,i,j,+=, );
        TESTOP( ,i,(j + 2),+=, );
        TESTOP({,i,(j + 2),+=,});
    }

However, you need to understand and work around some of the limitations of the C preprocessor when pushing the boundaries. See the GCC Macro Pitfalls documentation for some of the issues involved.
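
Two of the classic pitfalls, operator precedence and duplicated side effects, can be sketched in a few lines. The macro and function names below are made up for this example; the behavior follows directly from the fact that arguments are substituted as raw tokens.

```c
/* Pitfall 1: operator precedence. The argument is pasted in as raw
 * tokens, so an unparenthesized macro can regroup the expression. */
#define SQUARE_BAD(x) x * x
#define SQUARE(x)     ((x) * (x))

/* Pitfall 2: duplicated side effects. The argument text appears twice
 * in the expansion, so it may be evaluated twice. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Expands to 1 + 2 * 1 + 2, which is 5 rather than 9. */
int square_bad_of_sum(void) { return SQUARE_BAD(1 + 2); }

/* Expands to ((1 + 2) * (1 + 2)), which is 9. */
int square_of_sum(void) { return SQUARE(1 + 2); }

/* i++ runs once in the comparison and once more in the selected branch,
 * so i ends up incremented twice. */
int increments_after_max(void)
{
    int i = 5;
    (void)MAX(i++, 3);
    return i;
}
```

Parenthesizing every parameter and the whole expansion fixes the first problem. The second cannot be fixed in a portable macro, which is one reason an inline function is usually the better tool for this job.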

You can also use the C preprocessor as a general macro and text preprocessor that targets some tool other than the C compiler. For example, the older imake build-automation program used the C preprocessor to provide an extensive macro facility.

Where I have seen the C preprocessor used most effectively is in simplifying complex code and declarations.

In one case that I saw, the C preprocessor was used to provide a state machine language, which was used to create the data structures and data describing a state machine. The resulting data structures were then passed as an argument to a state machine function. This allowed several different state machine procedures to be written in the C preprocessor language, with the state machine processing itself done by a single function.
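
A minimal sketch of that idea follows. It is not the code from the project described above: the macro names (`BEGIN_STATE_MACHINE`, `TRANSITION`, `END_STATE_MACHINE`), the states, the events, and the `step` function are all hypothetical, chosen only to show macros generating a table that one generic function interprets.

```c
#include <stddef.h>

/* Hypothetical states and events for the example. */
typedef enum { ST_IDLE, ST_RUNNING, ST_DONE } State;
typedef enum { EV_START, EV_FINISH, EV_RESET } Event;

/* One row of the transition table. */
typedef struct {
    State from;
    Event on;
    State to;
} Transition;

/* The "state machine language": these macros expand to an array
 * definition plus a constant holding its length. */
#define BEGIN_STATE_MACHINE(name) static const Transition name[] = {
#define TRANSITION(from, on, to)      { from, on, to },
#define END_STATE_MACHINE(name)   }; \
    static const size_t name##_count = sizeof(name) / sizeof((name)[0]);

/* A machine written in that language. */
BEGIN_STATE_MACHINE(job_machine)
    TRANSITION(ST_IDLE,    EV_START,  ST_RUNNING)
    TRANSITION(ST_RUNNING, EV_FINISH, ST_DONE)
    TRANSITION(ST_DONE,    EV_RESET,  ST_IDLE)
END_STATE_MACHINE(job_machine)

/* The single processing function shared by every table: it returns the
 * next state, or the current state if no transition matches. */
State step(const Transition *table, size_t count, State current, Event ev)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].from == current && table[i].on == ev)
            return table[i].to;
    }
    return current;
}
```

A caller would write something like `state = step(job_machine, job_machine_count, state, EV_START);`. Adding a second machine means writing another macro block, not another processing function.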

Microsoft, in its Microsoft Foundation Classes (MFC), used the C preprocessor to hide quite a bit of the details of MFC messaging. Once you get used to it, it's easy enough to read something like the following. Since the Visual Studio development environment had tools for generating and modifying code using macros, it was quite simple for the programmer.

    BEGIN_MESSAGE_MAP(CFrameworkWndDoc, CWindowDocument)
        //{{AFX_MSG_MAP(CFrameworkWndDoc)
        ON_WM_CHAR()
        ON_WM_TIMER()
        ON_MESSAGE(WU_EVS_DFLT_LOAD, OnDefaultWinLoad)
        ON_MESSAGE(WU_EVS_POPUP_WINDOW, OnPopupWindowByName)
        ON_MESSAGE(WU_EVS_POPDOWN_WINDOW, OnPopdownWindowByName)
        ON_MESSAGE(WM_APP_CONNENGINE_MSG_RCVD, OnConnEngineMsgRcvd)
        ON_MESSAGE(WM_APP_XMLMSG_MSG_RCVD, OnXmlMsgRcvd)
        ON_MESSAGE(WM_APP_BIOMETRIC_MSG_RCVD, OnBiometricMsgRcvd)
        ON_MESSAGE(WM_APP_SHUTDOWN_MSG, OnShutdownMsgRcvd)
        ON_MESSAGE(WM_POWERBROADCAST, OnPowerMsgRcvd)
        ON_MESSAGE(WM_APP_SHOW_HIDE_GROUP, OnShowHideGroupMsgRcvd)
        //}}AFX_MSG_MAP
    END_MESSAGE_MAP()

Especially when you see what the macros look like:

    #define BEGIN_MESSAGE_MAP(theClass, baseClass) \
        PTM_WARNING_DISABLE \
        const AFX_MSGMAP* theClass::GetMessageMap() const \
            { return GetThisMessageMap(); } \
        const AFX_MSGMAP* PASCAL theClass::GetThisMessageMap() \
        { \
            typedef theClass ThisClass; \
            typedef baseClass TheBaseClass; \
            static const AFX_MSGMAP_ENTRY _messageEntries[] = \
            {

    #define END_MESSAGE_MAP() \
            {0, 0, 0, 0, AfxSig_end, (AFX_PMSG)0 } \
        }; \
            static const AFX_MSGMAP messageMap = \
            { &TheBaseClass::GetThisMessageMap, &_messageEntries[0] }; \
            return &messageMap; \
        } \
        PTM_WARNING_RESTORE

    // for Windows messages
    #define ON_MESSAGE(message, memberFxn) \
        { message, 0, 0, 0, AfxSig_lwl, \
            (AFX_PMSG)(AFX_PMSGW) \
            (static_cast< LRESULT (AFX_MSG_CALL CWnd::*)(WPARAM, LPARAM) > \
            (memberFxn)) },

    #define ON_WM_TIMER() \
        { WM_TIMER, 0, 0, 0, AfxSig_vw, \
            (AFX_PMSG)(AFX_PMSGW) \
            (static_cast< void (AFX_MSG_CALL CWnd::*)(UINT_PTR) > ( &ThisClass :: OnTimer)) },