Easy way to parse a DSL in pure C

I am working on a simple C application, and I had the idea of creating a DSL to define some of the application's behavior. The idea is to create a very clean language similar to Ruby, but one that actually runs in C. All functions are defined in C; the DSL is just... well, an alias to "hide" C's verbose syntax.

I know lex and yacc, but I think they are overkill for what I'm trying to do. Is there anything simpler? I thought about regular expressions, but I would feel dirty doing that. Maybe there is something better!

Example:

    if a = b
      myFunctionInC()

    get 'mydata' then
      puts 'Hello!'

Easily translates to:

    if (a == b) {
        myFunctionInC();
    }

    void get(const char *test) {
        printf("Hello! %s", test);
    }
+4
4 answers

Defining a good DSL syntax is difficult: you need to understand which problems you want to solve (and which you do not, otherwise the language ends up with everything in it, including the kitchen sink), and you need to figure out how to translate it into the target language (or interpret it on the fly).

In both cases, you need a parser, and an interesting DSL syntax usually cannot be parsed sensibly with regular expressions. So you need a real parser generator. If you're going to do something like Ruby, you'll need a strong parser!

Then you need to capture the result of parsing in some data structure, usually a tree. Then you need to analyze your DSL code for special cases, optimize, and decide how to generate code. All this means that a parser alone is usually not enough. See my extended discussion, Life After Parsing.
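To make the "usually a tree" step concrete, here is a minimal sketch of how a parsed `if a = b` rule might be held in memory and then walked. The node kinds, the helper names (`node_new`, `node_num`, `node_call`, `eval`), and the sample callback are all hypothetical and chosen only for illustration; a real DSL would need variables, more operators, and proper error handling.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node kinds for a tiny DSL (illustration only). */
typedef enum { NODE_NUM, NODE_EQ, NODE_IF, NODE_CALL } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;            /* NODE_NUM: literal value */
    void (*fn)(void);     /* NODE_CALL: C function to invoke */
    struct Node *left;    /* NODE_EQ: left operand; NODE_IF: condition */
    struct Node *right;   /* NODE_EQ: right operand; NODE_IF: body */
} Node;

Node *node_new(NodeKind kind, Node *left, Node *right) {
    Node *n = calloc(1, sizeof *n);
    assert(n != NULL);    /* sketch only: real code should report the failure */
    n->kind = kind;
    n->left = left;
    n->right = right;
    return n;
}

Node *node_num(int value) {
    Node *n = node_new(NODE_NUM, NULL, NULL);
    n->value = value;
    return n;
}

Node *node_call(void (*fn)(void)) {
    Node *n = node_new(NODE_CALL, NULL, NULL);
    n->fn = fn;
    return n;
}

/* Walk the tree. Expressions return their value; statements return 0. */
int eval(const Node *n) {
    switch (n->kind) {
    case NODE_NUM:  return n->value;
    case NODE_EQ:   return eval(n->left) == eval(n->right);
    case NODE_IF:   if (eval(n->left)) eval(n->right); return 0;
    case NODE_CALL: n->fn(); return 0;
    }
    return 0;
}

void node_free(Node *n) {
    if (n == NULL) return;
    node_free(n->left);
    node_free(n->right);
    free(n);
}

/* Sample callback standing in for the application's own C functions. */
int call_count = 0;
void my_function_in_c(void) { call_count++; }
```

Building `node_new(NODE_IF, node_new(NODE_EQ, node_num(2), node_num(2)), node_call(my_function_in_c))` and passing it to `eval` would invoke the callback once. Once the program is a tree, the analysis and code generation steps become further walks over the same structure.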

+1

creating a DSL to identify certain types of application behavior. The idea is to create a very clean language similar to Ruby, but this is actually done in C.

C is not a good host for embedded languages. It is a good language for implementing languages, so if you want to script your application, consider doing what others do and attach a high-level language to your application.

Languages such as Lua are designed for this purpose: they are easier to write than C, yet simple to attach to a C program. You can also call C from Ruby, Python, Haskell, or something else.

Reusing an existing language is a good idea, as someone has already done the hard work. You can also reuse libraries.

+3

I think that if you want to create a good language, you cannot rely on regular expressions alone, because they are not expressive enough.

It will also be difficult to write a regular expression that matches a complex pattern.

If you just want to hide some of C's verbosity, you can use C macros.
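As a rough illustration of the macro approach, the sketch below defines a few keyword-like macros. The names `WHEN`, `IS`, `END`, `SAY`, and the function `greet_if_equal` are invented for this example and are not from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical keyword-style macros (names chosen only for illustration). */
#define WHEN(cond) if (cond) {
#define END        }
#define IS         ==
#define SAY(text)  puts(text)

/* Prints the greeting and returns 1 when a equals b; otherwise returns 0. */
int greet_if_equal(int a, int b, const char *greeting) {
    assert(greeting != NULL);
    int fired = 0;
    WHEN(a IS b)
        SAY(greeting);
        fired = 1;
    END
    return fired;
}
```

The preprocessor only substitutes text, so this changes how the C looks without adding any new syntax, and compiler errors will refer to the expanded code rather than to the macro "keywords".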

+1

I am working on a simple C application, and I came up with the idea of creating a DSL to define some kinds of application behavior. The idea is to create a very clean language ... which actually runs in C.

You are not the first to have this idea. John Ousterhout made this idea popular with Tcl/Tk. Unfortunately, that language is not very clean.

The cleanest implementation of this idea available today is the embedded language Lua. It is very well designed and I recommend it very highly. The only reason to create your own (rather than use Lua) is that you want to learn how to implement an embedded programming language. Even in that case, you can still learn a lot by studying the design of Lua.

I know lex and yacc, but I think they are overkill for what I'm trying to do. Is there anything simpler?

It is almost always easier to write a lexer by hand than to use lex.
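For a sense of scale, a hand-written lexer for the handful of token types in the question's example could look roughly like the sketch below. The token names and the `next_token` function are assumptions made for this illustration, and the string rule (single quotes, no escapes) is a simplification.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token set covering the question's example (illustration only). */
typedef enum {
    TOK_EOF, TOK_IDENT, TOK_STRING, TOK_EQUALS, TOK_LPAREN, TOK_RPAREN, TOK_ERROR
} TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;   /* points into the source text; not NUL-terminated */
    size_t length;       /* number of characters in the token text */
} Token;

/* Reads one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    while (isspace((unsigned char)*p)) p++;          /* skip whitespace */

    Token tok = { TOK_ERROR, p, 1 };
    const char *end = p + 1;                         /* default: consume one char */

    if (*p == '\0') {
        tok.kind = TOK_EOF;
        tok.length = 0;
        end = p;                                     /* stay on the terminator */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*end) || *end == '_') end++;
        tok.kind = TOK_IDENT;
        tok.length = (size_t)(end - p);
    } else if (*p == '\'') {
        const char *q = p + 1;
        while (*q != '\0' && *q != '\'') q++;        /* no escape sequences */
        if (*q == '\'') {
            tok.kind = TOK_STRING;
            tok.start = p + 1;                       /* text between the quotes */
            tok.length = (size_t)(q - (p + 1));
            end = q + 1;
        } else {
            tok.length = (size_t)(q - p);            /* unterminated string */
            end = q;
        }
    } else if (*p == '=') {
        tok.kind = TOK_EQUALS;
    } else if (*p == '(') {
        tok.kind = TOK_LPAREN;
    } else if (*p == ')') {
        tok.kind = TOK_RPAREN;
    }

    *cursor = end;
    return tok;
}
```

Calling `next_token` repeatedly on a `const char *` that points at the DSL text yields one token per call until `TOK_EOF`. Keywords such as `if` come back as plain identifiers here; a parser can compare the token text to decide which ones are reserved.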

Yacc is another story: there is nothing fundamentally simpler, because you really do need to deal with the full power of context-free languages. But you can find this sophisticated technology in other packages. (Lex and yacc are 1970s technology designed for the hardware limitations of the 1970s, and they have poor human interfaces.)

  • If you know how to create an LL(1) grammar, then a hand-written recursive descent parser is very easy to write and requires no additional technology. But that knowledge is not easy to acquire, and coding these things in C is not much fun.

    If you want to learn, there are great examples in the books of Niklaus Wirth. There may also be tutorials on LL(1) grammars and recursive descent online.

  • Perhaps it will be easier for you to use a more modern parser generator, one not limited to LALR(1) grammars. The Elkhound parser generator is one example. But this is also not easy.

+1

Source: https://habr.com/ru/post/1411366/

