Returning meaningful error messages from a parser written with Scala Parser Combinators

I am trying to write a parser in Scala using parser combinators. If I match recursively,

    def body: Parser[Body] =
      ("begin" ~> statementList) ^^ { case s => new Body(s) }

    def statementList: Parser[List[Statement]] =
      ("end" ^^ { _ => List() }) |
      (statement ~ statementList ^^ { case statement ~ statementList => statement :: statementList })

then I get good error messages whenever there is an error in a statement. However, this code is ugly and long. Therefore, I would like to write the following:

    def body: Parser[Body] =
      ("begin" ~> statementList <~ "end") ^^ { case s => new Body(s) }

    def statementList: Parser[List[Statement]] = rep(statement)

This code works, but it only prints meaningful messages if the error is in the FIRST statement. If the error is in a later statement, the message becomes painfully unusable, because the parser wants to see the entire erroneous statement replaced by the "end" token:

    Exception in thread "main" java.lang.RuntimeException: [4.2] error: "end" expected but "let" found

    let b : string = x(3,b,"WHAT???",!ERRORHERE!,7 )
    ^

My question is: is there a way to get rep and repsep to work together with meaningful error messages that put the caret at the right place, instead of at the start of the repeated fragment?

+4
2 answers

Ah, I found a solution! It turns out you need to apply the function phrase to your main parser. It returns a new parser that is less inclined to backtrack. (I wonder what exactly this means; perhaps that if it finds a line break, it will not backtrack past it?) It keeps track of the last position at which a failure occurred.

Change this:

    def parseCode(code: String): Program = {
      program(new lexical.Scanner(code)) match {
        case Success(program, _) => program
        case x: Failure => throw new RuntimeException(x.toString())
        case x: Error => throw new RuntimeException(x.toString())
      }
    }

    def program : Parser[Program] ...

to this:

    def parseCode(code: String): Program = {
      phrase(program)(new lexical.Scanner(code)) match {
        case Success(program, _) => program
        case x: Failure => throw new RuntimeException(x.toString())
        case x: Error => throw new RuntimeException(x.toString())
      }
    }

    def program : Parser[Program] ...
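
To make the change easier to try out, here is a minimal self-contained sketch of the same idea. It assumes the scala-parser-combinators library is on the classpath, and it uses RegexParsers with a made-up miniature grammar (MiniParser, name, number and the tuple result type are all hypothetical) instead of the token-based parser from the question. How much the reported position improves may depend on the library version, so it is worth checking the output locally.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical miniature grammar, not the asker's real one:
//   body      ::= "begin" statement* "end"
//   statement ::= "let" name "=" number
object MiniParser extends RegexParsers {
  def name: Parser[String] = """[a-z]+""".r
  def number: Parser[Int]  = """\d+""".r ^^ (_.toInt)

  def statement: Parser[(String, Int)] =
    "let" ~> name ~ ("=" ~> number) ^^ { case n ~ v => (n, v) }

  def body: Parser[List[(String, Int)]] =
    "begin" ~> rep(statement) <~ "end"

  // phrase(body) succeeds only if body consumes the whole input.
  // Any Failure or Error is returned as its string rendering.
  def parseCode(code: String): Either[String, List[(String, Int)]] =
    parse(phrase(body), code) match {
      case Success(result, _) => Right(result)
      case noSuccess: NoSuccess => Left(noSuccess.toString)
    }
}
```

In RegexParsers, parseAll(body, code) is defined as parse(phrase(body), code), so it can be used as a shorthand for the call above.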
+1

You can do this by combining a "home-made" rep method with non-backtracking sequencing (~!) inside the statements. For instance:

    scala> object X extends RegexParsers {
         |   def myrep[T](p: => Parser[T]): Parser[List[T]] = p ~! myrep(p) ^^ { case x ~ xs => x :: xs } | success(List())
         |   def t1 = "this" ~ "is" ~ "war"
         |   def t2 = "this" ~! "is" ~ "war"
         |   def t3 = "begin" ~ rep(t1) ~ "end"
         |   def t4 = "begin" ~ myrep(t2) ~ "end"
         | }
    defined module X

    scala> X.parse(X.t4, "begin this is war this is hell end")
    res13: X.ParseResult[X.~[X.~[String,List[X.~[X.~[String,String],String]]],String]] =
    [1.27] error: `war' expected but ` ' found

    begin this is war this is hell end
                              ^

    scala> X.parse(X.t3, "begin this is war this is hell end")
    res14: X.ParseResult[X.~[X.~[String,List[X.~[X.~[String,String],String]]],String]] =
    [1.19] failure: `end' expected but ` ' found

    begin this is war this is hell end
                      ^
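
The same technique can be sketched for a statement-shaped grammar closer to the one in the question. This is only an illustration under assumptions: it needs the scala-parser-combinators library on the classpath, and CommittedParser, name, number and the "let name = number" statement form are hypothetical stand-ins, not the asker's grammar. myrep is copied from the session above.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical miniature grammar:
//   body      ::= "begin" statement* "end"
//   statement ::= "let" name "=" number
object CommittedParser extends RegexParsers {
  def name: Parser[String] = """[a-z]+""".r
  def number: Parser[Int]  = """\d+""".r ^^ (_.toInt)

  // Home-made repetition from the answer: once p has matched,
  // ~! commits to parsing the rest of the list.
  def myrep[T](p: => Parser[T]): Parser[List[T]] =
    p ~! myrep(p) ^^ { case x ~ xs => x :: xs } | success(List.empty[T])

  // Once "let" has matched, ~! commits to this statement: a later
  // mismatch becomes an Error (no backtracking) instead of a Failure.
  def statement: Parser[(String, Int)] =
    "let" ~! name ~! "=" ~! number ^^ { case _ ~ n ~ _ ~ v => (n, v) }

  def body: Parser[List[(String, Int)]] =
    "begin" ~> myrep(statement) <~ "end"

  // Any Failure or Error is returned as its string rendering.
  def parseCode(code: String): Either[String, List[(String, Int)]] =
    parse(body, code) match {
      case Success(result, _) => Right(result)
      case noSuccess: NoSuccess => Left(noSuccess.toString)
    }
}
```

The design choice is where to place the commit point: putting ~! right after the keyword that uniquely identifies a statement means a mismatch at the very first token is still an ordinary Failure, so the repetition can end normally and "end" can match, while a mismatch further into the statement is reported where it occurs.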
+1