Parsec: using between to parse parentheses

Suppose I want to parse a string containing several parenthesized groups into a list of strings, one per group, for example

"((abc) abc)" 

into

 ["((abc) abc)","( abc)"] 

How can I do this using Parsec? Using between looks promising, but it does not seem possible to separate the groups by a start and end value.

2 answers

Although jozefg's solution is almost identical to what I came up with (and I completely agree with all his suggestions), there are some slight differences that make me think that I should write a second answer:

  • Due to the expected result of the original example, there is no need to treat the space-separated parts as separate subtrees.
  • Further, it may be interesting to see the part that actually computes the expected results (i.e. a list of strings).

So here is my version. As jozefg has already suggested, divide the task into several subtasks:

  1. Parse the string into an algebraic data type representing a tree.
  2. Collect the (desired) subtrees of this tree.
  3. Turn the trees into strings.

As for 1, we first need a tree data type

    import Text.Parsec
    import Text.Parsec.String
    import Control.Applicative ((<$>))

    data Tree = Leaf String | Node [Tree]

and then a function that can parse strings into values of this type.

    parseTree :: Parser Tree
    parseTree = node <|> leaf
      where
        node = Node <$> between (char '(') (char ')') (many parseTree)
        leaf = Leaf <$> many1 (noneOf "()")

In my version, I treat the whole string between the parentheses as a single Leaf node (i.e. I do not split on whitespace).

Now we need to collect the subtrees of the tree we are interested in:

    nodes :: Tree -> [Tree]
    nodes (Leaf _)    = []
    nodes t@(Node ts) = t : concatMap nodes ts

Finally, a Show instance for Tree allows us to turn the trees into strings.

    instance Show Tree where
      showsPrec d (Leaf x)  = showString x
      showsPrec d (Node xs) = showString "(" . showList xs . showString ")"
        where
          showList []     = id
          showList (x:xs) = shows x . showList xs

Then the original problem can be solved, for example, by:

    parseGroups :: Parser [String]
    parseGroups = map show . nodes <$> parseTree

    > parseTest parseGroups "((abc) abc)"
    ["((abc) abc)","(abc)"]
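For use outside GHCi, the same three steps can be wrapped in a pure function. The following is only a minimal sketch under a few assumptions: the names render and groups are hypothetical helpers that do not appear above, render is an ordinary function used in place of the Show instance, and eof is added so that trailing input is rejected rather than silently ignored.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data Tree = Leaf String | Node [Tree]

-- Step 1: parse the input into a tree (same parser as above).
parseTree :: Parser Tree
parseTree = node <|> leaf
  where
    node = Node <$> between (char '(') (char ')') (many parseTree)
    leaf = Leaf <$> many1 (noneOf "()")

-- Step 2: collect every parenthesized subtree, outermost first.
nodes :: Tree -> [Tree]
nodes (Leaf _)    = []
nodes t@(Node ts) = t : concatMap nodes ts

-- Step 3 (hypothetical helper): render a tree back into its source text.
render :: Tree -> String
render (Leaf s)  = s
render (Node ts) = "(" ++ concatMap render ts ++ ")"

-- Hypothetical helper: run all three steps and report parse errors as Left.
-- The eof makes the parser fail if anything follows the first tree.
groups :: String -> Either ParseError [String]
groups = parse (map render . nodes <$> parseTree <* eof) ""
```

With these definitions, groups "((abc) abc)" evaluates to Right ["((abc) abc)","(abc)"], while an unbalanced input such as "(abc" yields a Left carrying the parse error.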

I would use a recursive parser:

    data Expr = List [Expr] | Term String

    expr :: Parsec String () Expr
    expr = recurse <|> terminal

where terminal parses your primitives. In this case they seem to be strings of letters, so

  where terminal = Term <$> many1 letter 

and recurse is (on comes from Data.Function)

  recurse = List <$> (between `on` char) '(' ')' (expr `sepBy1` char ' ') 

Now we have a nice Expr tree, from which we can collect all the sub-lists with

    collect r@(List ts) = r : concatMap collect ts
    collect _           = []
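Put together, the fragments above might look like the following sketch. The import of on from Data.Function is needed by recurse. The render and groupStrings functions are hypothetical additions that are not part of this answer; they are included only to show one way of producing the list of strings the question asks for. Because this parser splits on single spaces, render joins the children of a List with spaces again.

```haskell
import Data.Function (on)
import Text.Parsec

data Expr = List [Expr] | Term String

expr :: Parsec String () Expr
expr = recurse <|> terminal
  where
    -- A terminal is a non-empty run of letters.
    terminal = Term <$> many1 letter
    -- (between `on` char) '(' ')' is the same as between (char '(') (char ')').
    recurse  = List <$> (between `on` char) '(' ')' (expr `sepBy1` char ' ')

-- Collect every List node, outermost first.
collect :: Expr -> [Expr]
collect r@(List ts) = r : concatMap collect ts
collect _           = []

-- Hypothetical helper: turn an Expr back into text, one space between items.
render :: Expr -> String
render (Term s)  = s
render (List ts) = "(" ++ unwords (map render ts) ++ ")"

-- Hypothetical helper: parse the input and render every collected group.
groupStrings :: String -> Either ParseError [String]
groupStrings = parse (map render . collect <$> expr) ""
```

Under these assumptions, groupStrings "((abc) abc)" evaluates to Right ["((abc) abc)","(abc)"]. Note that this grammar accepts only letters and exactly one space between items, so it is stricter than a parser that keeps everything between the parentheses as one string.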
