Improving ANTLR DSL Analysis Error Messages

Question

I am working on a domain-specific language (DSL) for non-programmers. Non-programmers make a lot of syntax errors: they omit keywords, they don't close parentheses, they don't terminate blocks, etc.

I use ANTLR to generate my parser; it provides an excellent mechanism for handling RecognitionExceptions to improve error reporting. But I'm finding it very difficult to write good error-handling code for my DSL.

At this point, I am considering ways to simplify the language so that it is easier for me to give users high-quality error messages, but I'm not quite sure how to go about it. I think I want to somehow reduce the ambiguity of errors, but I'm not sure how to express that idea in a grammar.

How can I simplify my language to improve parsing messages for my users?

EDIT: Updated to clarify that I'm interested in ways to simplify my language, not just tips for handling ANTLR errors in general. (Although, thanks for that!)

error-handling antlr dsl

Dan Fabulich Feb 14 '10 at 23:45

4 answers

Alex Miller · Answer 1 · 2010-02-15T14:39:15+0000

I wrote an article on recovering line and column numbers in ANTLR errors a couple of years ago that might be useful.

http://tech.puredanger.com/2007/02/01/recovering-line-and-column-numbers-in-your-antlr-ast/

chollida · Answer 2 · 2010-02-15T14:34:53+0000

You've probably run into the hardest part of using a parser generator as opposed to a hand-written parser.

In my experience, the first thing to do is make sure you accurately track line and column information, so that you can tell the user exactly where the parser thinks the error is.
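To make that concrete, here is a minimal sketch of what a position-aware report could look like. It does not use ANTLR at all; `format_syntax_error` is a made-up helper, and it assumes the parser hands you a 1-based line and a 0-based column (which, as far as I know, is the convention ANTLR follows).

```python
def format_syntax_error(source, line, column, message):
    """Build a report pointing at a 1-based line and a 0-based column.

    Hypothetical helper for illustration only; it is not part of ANTLR.
    """
    lines = source.splitlines()
    # Fall back to an empty line if the reported position is out of range.
    text = lines[line - 1] if 1 <= line <= len(lines) else ""
    # Put a caret under the offending column (assumes no tab characters).
    caret = " " * column + "^"
    return f"line {line}, column {column + 1}: {message}\n{text}\n{caret}"


source = "total = (1 + 2\nprint total"
print(format_syntax_error(source, 1, 14, "missing ')'"))
```

Echoing the offending line with a caret under it tends to help non-programmers more than the bare numbers do, since they don't have to count characters themselves.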

This should take care of 90% of the problems users hit, e.g. missing commas or semicolons at the end of a line.

It's the other 10% where the trouble lies.

I usually start by giving my lexer and parser rules meaningful names using the paraphrase option.

 SEMI
 options { paraphrase="end of line terminator"; }
     : ';'
     ;

 ifExpr
 options { paraphrase="boolean expression"; }
     : expr
     ;

Antlr will use these phrases in any error message that it generates.
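If it helps to see the effect in isolation, the idea boils down to a lookup from grammar symbol to friendly phrase. The sketch below is plain Python, not ANTLR's actual implementation; `PARAPHRASES`, `describe` and `mismatch_message` are names I made up, and the message wording is only an approximation of what a generated parser prints.

```python
# Hypothetical table mirroring the paraphrase options in the grammar above.
PARAPHRASES = {
    "SEMI": "end of line terminator",
    "ifExpr": "boolean expression",
}


def describe(symbol):
    """Return the friendly phrase for a grammar symbol, or the raw name if none is set."""
    return PARAPHRASES.get(symbol, symbol)


def mismatch_message(expected, found):
    """Compose a message from symbol names, substituting paraphrases where available."""
    return f"expecting {describe(expected)}, found {describe(found)}"


print(mismatch_message("SEMI", "IDENT"))
```

Symbols without a paraphrase fall through with their raw name, which makes it easy to spot which rules still need a friendlier description.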

Take a look at this page: http://www.antlr2.org/doc/err.html to see how the experts recommend doing this with ANTLR 2, and then browse this page: http://www.antlr.org/blog/antlr3/error.handling.tml to see the changes made in ANTLR 3. (The ANTLR 2 page is probably the best place to start.)

perimosocordiae · Answer 3 · 2010-02-14T23:56:43+0000

I recently read an article about someone who implemented a simple learning mechanism for their parser. Basically, the idea is to annotate the parse errors that ANTLR gives you with the actual cause of the error. For example,

Error: no "bar" method for NilClass: foo

might be annotated as:

Error: tried to call "bar" on foo, but foo is nil.

The idea actually comes from a 2003 paper: Generating LR Syntax Error Messages from Examples. It was also discussed on the research!rsc blog.
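As I understand the approach, you feed the parser deliberately broken inputs, note where each one fails (parser state plus offending token), and attach a hand-written message to that failure point. Here is a toy sketch of that; the one-statement grammar, the function names and the messages are all invented for illustration, and a real LR or ANTLR parser would expose its failure state differently.

```python
import re

# Hypothetical toy grammar: a statement is exactly  let NAME = NUMBER ;
EXPECTED = ["LET", "NAME", "EQUALS", "NUMBER", "SEMI"]
TOKEN_RE = re.compile(r"\s*(?:(let\b)|([A-Za-z_]\w*)|(\d+)|(=)|(;))")
KINDS = ["LET", "NAME", "NUMBER", "EQUALS", "SEMI"]  # one per regex group


def tokenize(text):
    """Return the list of token kinds in text, always ending with EOF."""
    text = text.rstrip()
    kinds, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            kinds.append("UNKNOWN")
            break
        kinds.append(KINDS[m.lastindex - 1])
        pos = m.end()
    kinds.append("EOF")
    return kinds


def failure_point(text):
    """Return (state, offending token kind), or None if text parses."""
    tokens = tokenize(text)
    for state, expected in enumerate(EXPECTED):
        if tokens[state] != expected:
            return (state, tokens[state])
    if tokens[len(EXPECTED)] != "EOF":
        return (len(EXPECTED), tokens[len(EXPECTED)])
    return None


def learn(examples):
    """Map each bad example's failure point to its hand-written message."""
    table = {}
    for bad_input, message in examples:
        key = failure_point(bad_input)
        if key is not None:
            table[key] = message
    return table


def diagnose(text, table):
    """Return a learned message if one matches, else a generic one; None if valid."""
    key = failure_point(text)
    if key is None:
        return None
    if key in table:
        return table[key]
    state, found = key
    expected = EXPECTED[state] if state < len(EXPECTED) else "end of input"
    return f"syntax error: expected {expected}, found {found}"


EXAMPLES = [
    ("let x = 5", "Every statement must end with a semicolon (;)."),
    ("let x 5 ;", "Put an equals sign (=) between the name and its value."),
    ("x = 5 ;", "Statements start with the word 'let'."),
]
table = learn(EXAMPLES)
print(diagnose("let total = 42", table))
```

The nice property is that the messages live in a list of examples rather than in the grammar, so someone who knows the users (but not the parser) can keep adding to it.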

Robert Petermeier · Answer 4 · 2010-02-15T18:08:24+0000

OK, I haven't used ANTLR yet, only JavaCC. But since you're going to implement a DSL and care about usability, you should take a look at Xtext. It is a framework that:

lets you specify the textual grammar for your DSL in EBNF notation
generates a parser for you
creates an editor, as an Eclipse plugin, with syntax highlighting and immediate feedback on syntax errors
gives you access to the underlying AST, so you can transform the text your users write into anything you like

Last year I attended a presentation by itemis, a German company specializing in DSLs. I was very impressed by how easy this stuff is to set up and work with. I used it to create an editor for a small game that uses a textual description of the playing field, which is then parsed and transformed into an object model of the game.
