Scala: using StandardTokenParsers to parse hexadecimal numbers

I am using Scala's parser combinators, extending scala.util.parsing.combinator.syntactical.StandardTokenParsers. This class provides the following methods:

def ident : Parser[String] for parsing identifiers, and

def numericLit : Parser[String] for parsing a number (presumably a decimal number).

For lexing I am using the scala.util.parsing.combinator.lexical.Scanners implementation provided by scala.util.parsing.combinator.lexical.StdLexical.

My requirement is to parse a hexadecimal number (without the 0x prefix), which can be of any length. Basically a grammar like: ([0-9]|[a-f])+

I tried to integrate RegexParsers, but ran into type errors. Other approaches, such as extending the lexer's delimiters and the grammar rules, resulted in the token not being found.

scala parsing
2 answers

As I suspected, the problem can be solved by extending the behavior of the lexer rather than the parser. The standard lexer only accepts decimal digits, so I created a new lexer:

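    import scala.util.parsing.combinator.lexical.StdLexical

    class MyLexer extends StdLexical {
      override type Elem = Char
      // Treat hexadecimal digits as digits wherever the lexer expects one.
      override def digit = ( super.digit | hexDigit )
      lazy val hexDigits = Set[Char]() ++ "0123456789abcdefABCDEF".toArray
      lazy val hexDigit = elem("hex digit", hexDigits.contains(_))
    }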

My parser (which has to be a StandardTokenParsers) can then be extended as follows:

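    import scala.util.parsing.combinator.syntactical.StandardTokenParsers

    object ParseAST extends StandardTokenParsers {
      // Use the hex-aware lexer in place of the default StdLexical.
      override val lexical: MyLexer = new MyLexer()
      lexical.delimiters ++= List("(", ")", ",", "@")
      ...
    }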

The numeric literal is built from a sequence of digits by the token rule in the StdLexical class:

    class StdLexical {
      ...
      def token: Parser[Token] =
        ...
        | digit ~ rep(digit) ^^ { case first ~ rest => NumericLit(first :: rest mkString "") }
    }

StdLexical returns the parsed number only as a String, which is fine for me, since I am not interested in the numerical value anyway.
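
To make the lexer approach concrete, here is a minimal, self-contained sketch. The names HexLexical, HexListParser, hexList and parseList are hypothetical and only for illustration, and the code assumes the scala-parser-combinators module is on the classpath. It also illustrates a caveat that, as far as I can tell, follows from the order of alternatives in StdLexical's token rule: the identifier alternative is tried before the numeric one, so a token that starts with a letter (such as ff) is lexed as an identifier, not as a numeric literal.

```scala
import scala.util.parsing.combinator.lexical.StdLexical
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical lexer: accepts hex digits wherever StdLexical expects a digit.
class HexLexical extends StdLexical {
  private val hexChars: Set[Char] = "0123456789abcdefABCDEF".toSet
  override def digit: Parser[Char] = elem("hex digit", c => hexChars.contains(c))
}

// Hypothetical parser: a comma-separated list of numeric literals.
object HexListParser extends StandardTokenParsers {
  override val lexical: HexLexical = new HexLexical
  lexical.delimiters += ","

  // numericLit yields the matched characters as a String.
  def hexList: Parser[List[String]] = repsep(numericLit, keyword(","))

  def parseList(input: String): ParseResult[List[String]] =
    phrase(hexList)(new lexical.Scanner(input))
}
```

With this sketch, HexListParser.parseList("1a2b, 10, 0ff") should succeed with List("1a2b", "10", "0ff"), while HexListParser.parseList("ff") should fail because ff reaches the parser as an identifier. If letter-leading hex numbers matter, one option is to accept ident as well and validate its characters in the grammar.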


You can use RegexParsers, with an action attached to the token in question.

    import scala.util.parsing.combinator._

    object HexParser extends RegexParsers {
      // One or more lowercase hex digits, converted to an Int.
      val hexNum: Parser[Int] = """[0-9a-f]+""".r ^^ { s => Integer.parseInt(s, 16) }

      // A comma-separated list of hex numbers.
      def seq: Parser[Any] = repsep(hexNum, ",")
    }

This defines a parser that reads comma-separated hexadecimal numbers without a leading 0x, and it actually returns Int values.

    scala> val result = HexParser.parse(HexParser.seq, "1, 2, f, 10, 1a2b34d")
    scala> println(result)
    [1.21] parsed: List(1, 2, 15, 16, 27439949)

This grammar cannot tell decimal numbers apart from hexadecimal ones. Also, Integer.parseInt is limited to the range of Int. To handle numbers of any length, you may need to write your own conversion and use BigInteger or arrays.
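
As a rough sketch of the BigInteger route, Scala's scala.math.BigInt (a wrapper around java.math.BigInteger) can convert a hex string of any length. The object HexValue and the helper parseHex are hypothetical names used only for illustration.

```scala
import scala.util.Try

// Hypothetical helper, not part of any library.
object HexValue {
  // Converts an unprefixed hex string of any length.
  // Returns None if the string is empty or contains a non-hex character.
  def parseHex(s: String): Option[BigInt] =
    if (s.matches("[0-9a-fA-F]+")) Some(BigInt(s, 16)) else None

  // Integer.parseInt throws NumberFormatException once the value exceeds Int.MaxValue.
  def fitsInInt(s: String): Boolean =
    Try(Integer.parseInt(s, 16)).isSuccess
}
```

For example, HexValue.fitsInInt("ffffffff") is false because the value is above Int.MaxValue, whereas HexValue.parseHex("ffffffff") gives Some(4294967295). In the RegexParsers answer, the action could call BigInt(s, 16) in place of Integer.parseInt(s, 16) and declare the parser as Parser[BigInt].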
