Difference between LL parser and recursive descents?

I recently tried to teach myself how parsers work (for languages ​​/ context-free grammars), and most of them seem to make sense, except for one. I focused on LL (k) grammars in particular, for which the two main algorithms look like LL parser (using the stack / parsing table) and a recursive descent parser (just using recursion).

As far as I can see, the recursive descent algorithm works on all LL (k) grammars and possibly more, while the LL parser works on all LL (k) grammars. However, the recursive descent parser is very simple than the LL parser to implement (just as LL is simpler than LR).

So my question is: what are the advantages / problems that can arise when using any of the algorithms? Why would one choose LL over a recursive descent given that it works with the same set of grammars and is more difficult to implement?

+62
parsing context-free-grammar grammar recursive-descent ll
Jun 25 '09 at 15:32
source share
1 answer

LL is usually a more effective analysis method than recursive descent. In fact, a naive recursive descent parser will actually be O (k ^ n) (where n is the input size) in the worst case. Some methods, such as memoization (which the Packrat parser gives), can improve this, as well as expand the class of grammars accepted by the parser, but there is always a trade-off in space. LL analyzers (as far as I know) are always linear.

On the other hand, you are correct in your intuition that parsers with recursive descents can handle a higher class of grammars than LLs. A recursive descent can process any grammar that represents LL (*) (i.e. unlimited viewing), as well as a small set of ambiguous grammars. This is because a recursive descent is actually a direct PEG encoding, or grammar (expression) of an expression for a parser . In particular, the disjunctive operator ( a | b ) is not commutative, which means that a | b a | b not equal to b | a b | a . A parser with recursive descents will try to use each alternative in order. Therefore, if a matches the input, it will execute even if b matches the input. This allows us to classify the ambiguity of the β€œlong match”, for example, the problem of dangling else , simply by properly ordering the clauses.

With all that said, it is possible to implement the LL (k) parser using recursive descent so that it runs in linear time. This is done by essentially embedding the predictive sets, so that each parsing procedure determines the corresponding production for a given input at a constant time. Unfortunately, this technique eliminates the entire class of grammars from processing. Once we get into parsing, problems like the else frill are no longer resolved with such ease.

As to why LL is selected by recursive descents, it is mainly a matter of efficiency and maintainability. Parkurs with recursive descents are much easier to implement, but they are usually more difficult to maintain, since the grammar they represent does not exist in any declarative form. In most cases of non-trivial use of the parser, a parser generator such as ANTLR or Bison is used. With such tools, it really doesn’t matter if the algorithm is directly encoded recursive descent or controlled by the LL (k) table.

It is also interesting to pay attention to recursive ascent , which is a parsing algorithm that is directly encoded after a recursive -resonant, but able to process any LALR grammar. I could also delve into parser comparators , which are a functional way of linking recursive descent pars together.

+77
Jun 25 '09 at 15:45
source share



All Articles