I wrote a parser based on the Scala parser combinators:
    class SxmlParser extends RegexParsers with ImplicitConversions with PackratParsers {
        [...]
        lazy val document: PackratParser[AstNodeDocument] =
            ((procinst | element | comment | cdata | whitespace | text)*) ^^ { AstNodeDocument(_) }
        [...]
    }

    object SxmlParser {
        def parse(text: String): AstNodeDocument = {
            var ast = AstNodeDocument()
            val parser = new SxmlParser()
            val result = parser.parseAll(parser.document, new CharArrayReader(text.toArray))
            result match {
                case parser.Success(x, _) => ast = x
                case parser.NoSuccess(err, next) => {
                    tool.die("failed to parse SXML input " +
                        "(line " + next.pos.line + ", column " + next.pos.column + "):\n" +
                        err + "\n" + next.pos.longString)
                }
            }
            ast
        }
    }
Usually the resulting parse error messages are quite nice. But sometimes all I get is:
    sxml: ERROR: failed to parse SXML input (line 32, column 1):
    `"' expected but `' found

    ^
This happens when a quotation mark is not closed and the parser reaches the EOT. What I would like to see here is (1) which production the parser was in when it expected the `"` (I have several) and (2) where in the input this production started parsing (which would indicate where the opening quote is in the input).

Does anybody know how I can improve the error messages and include more information about the actual internal parsing state at the time the error occurs (perhaps something like a production-rule stack trace, or anything else that helps to pin down the error location)? Note that the "line 32, column 1" above is actually the EOT position and therefore useless here.
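To illustrate the kind of information I am after, here is a minimal sketch, not a known solution. The names `TracingParser`, `traced`, `quoted`, and `word` are hypothetical and are neither part of the library nor of my real grammar. The wrapper remembers where a production started and appends the production name and that start position to any failure coming out of it. How such a message surfaces through `rep`, `|`, and `parseAll` probably depends on the library version, so this only shows the idea on a single production.

```scala
import scala.util.parsing.combinator.RegexParsers

class TracingParser extends RegexParsers {
  // Keep positions predictable: do not silently skip whitespace.
  override def skipWhitespace = false

  // Hypothetical helper: wrap a production so that a failure inside it
  // also reports the production name and the position where it started.
  def traced[T](name: String)(p: => Parser[T]): Parser[T] = Parser { in =>
    def note(msg: String): String =
      msg + "\n  in production '" + name + "' started at line " +
        in.pos.line + ", column " + in.pos.column
    p(in) match {
      case Failure(msg, next) => Failure(note(msg), next)
      case Error(msg, next)   => Error(note(msg), next)
      case success            => success
    }
  }

  // Two toy productions standing in for the real grammar.
  lazy val quoted: Parser[String] =
    traced("quoted")("\"" ~> """[^"]*""".r <~ "\"")
  lazy val word: Parser[String] =
    traced("word")("""[^"\s]+""".r)
}
```

With an unclosed quote, for example `new TracingParser` applied as `p.parse(p.quoted, "\"abc")`, the failure message should then carry the production name and the position of the opening quote in addition to the EOT position. Nesting `traced` wrappers would accumulate one line per enclosing production, which is roughly the "stack trace" I have in mind.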
scala error-handling parser-combinators
Ralf S. Engelschall