How to combine parsers with various types of Elem

I am trying to build a parser that combines RegexParsers with a parser of my own. I looked at Scala: how to combine parser combinators with different objects, but that question and its answers concern parsers that share the same Elem type.

Let's say I have several RegexParsers, as well as a parser that does a string search:

    trait NumbersParsers extends RegexParsers {
      def number = """\d+""".r
    }

    trait LookupParsers extends Parsers {
      type Elem = String
      def word = elem("word", (potential: String) => dictionary.exists(_.equals(potential)))
    }

If I combine these parsers naively

    object MyParser extends NumbersParsers with LookupParsers {
      def quantitive = number ~ word
    }

I obviously get type errors because the two traits have different Elem types. How can I combine these parsers?

1 answer

I feel somewhat responsible for answering this question, since I asked and answered Scala: how to combine parser combinators with different objects.

Quick answer: you cannot combine different types of Elem. An alternative, elegant way to solve this problem is to use ^? to augment the regular expression parser with additional filtering (a full example follows below).

It may be useful to read the Combinator Parsing chapter of Programming in Scala:

Parser input

Sometimes a parser reads a stream of tokens instead of a raw sequence of characters. A separate lexical analyzer is then used to convert the stream of raw characters into a stream of tokens. The type of parser inputs is defined as follows:

 type Input = Reader[Elem] 

The class Reader comes from the package scala.util.parsing.input. It is similar to a Stream, but it also keeps track of the positions of all the elements it reads. The type Elem represents individual input elements. It is an abstract type member of Parsers:

 type Elem 

This means that subclasses and sub-traits of Parsers need to instantiate Elem to the type of input elements being processed. For example, RegexParsers and JavaTokenParsers fix Elem to be Char.
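
For illustration only (this sketch is mine, not part of the original answer, and the Token/Num/Word names are made up), here is what a Parsers sub-trait looks like when it fixes Elem to a token type instead of Char:

    import scala.util.parsing.combinator.Parsers

    sealed trait Token
    case class Num(value: Int) extends Token
    case class Word(text: String) extends Token

    trait MyTokenParsers extends Parsers {
      type Elem = Token  // instantiate the abstract Elem to our token type

      // accept succeeds only when the partial function is defined for the next element
      def num: Parser[Int]     = accept("number", { case Num(v)  => v })
      def word: Parser[String] = accept("word",   { case Word(t) => t })
    }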

So Elem is consumed by the lexical analyzer, which is responsible for chopping your input stream into the smallest tokens the parser wants to deal with. Since you want to work with regexes, your Elem is Char.

But don't worry. Just because your lexer gives you Chars doesn't mean your parser is stuck with them too. What RegexParsers gives you is an implicit converter from a regular expression to Parser[String]. You can further transform these using the ^^ operator (which totally maps the input) and the ^? operator (which partially maps the input).
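
As a quick illustration (my own sketch, not taken from the answer below; the object and rule names are invented), ^^ applies a total function to whatever the regex matched, while ^? only succeeds when its partial function is defined and otherwise reports the message built by its second argument:

    import scala.util.parsing.combinator.RegexParsers

    object MapVsPartialMap extends RegexParsers {
      // ^^ always transforms the matched string
      def number: Parser[Int] = """\d+""".r ^^ { _.toInt }

      // ^? succeeds only for even numbers; anything else becomes a parse failure
      def even: Parser[Int] = number ^? (
        { case n if n % 2 == 0 => n },
        n => s"$n is not even"
      )
    }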

Putting these to work in your parsers:

    import scala.util.parsing.combinator._

    scala> val dictionary = Map("Foo" -> "x")
    dictionary: scala.collection.immutable.Map[String,String] = Map(Foo -> x)

    scala> trait NumbersParsers extends RegexParsers {
         |   def number: Parser[Int] = """\d+""".r ^^ { _.toInt }
         | }
    defined trait NumbersParsers

    scala> trait LookupParsers extends RegexParsers {
         |   def token: Parser[String] = """\w+""".r
         |   def word =
         |     token ^? ({
         |       case x if dictionary.contains(x) => x
         |     }, {
         |       case s => s + " is not found in the dictionary!"
         |     })
         | }
    defined trait LookupParsers

    scala> object MyParser extends NumbersParsers with LookupParsers {
         |   def quantitive = number ~ word
         |
         |   def main(args: Array[String]) {
         |     println(parseAll(quantitive, args(0)))
         |   }
         | }
    defined module MyParser

    scala> MyParser.main(Array("1 Foo"))
    [1.6] parsed: (1~Foo)

    scala> MyParser.main(Array("Foo"))
    [1.1] failure: string matching regex `\d+' expected but `F' found

    Foo
    ^

    scala> MyParser.main(Array("2 Bar"))
    [1.6] failure: Bar is not found in the dictionary!

    2 Bar
         ^