Parsec-Parser works well, but can it be done better?

I am trying to do this:

Parse the text in the form:

Some text #{0,0,0} some text #{0,0,0}#{0,0,0} more text #{0,0,0}

to a list of some data structure:

[Inside "Some text", Outside (0,0,0), Inside "some text", Outside (0,0,0), Outside (0,0,0), Inside "more text", Outside (0,0,0)]

So the #{a,b,c} bits must be turned into something different from the rest of the text.

I have this code:

    module ParsecTest where

    import Text.ParserCombinators.Parsec
    import Control.Monad

    type Reference = (Int, Int, Int)

    data Transc = Inside String
                | Outside Reference
                deriving (Show)

    -- Plain text: everything up to the next reference or the end of input.
    text :: Parser Transc
    text = do
        x <- manyTill anyChar (lookAhead reference <|> (eof >> return (Inside "")))
        return (Inside x)

    transc :: Parser Transc
    transc = reference <|> text

    alot :: Parser [Transc]
    alot = manyTill transc eof

    -- A reference of the form #{a,b,c}; try lets it fail without consuming input.
    reference :: Parser Transc
    reference = try $ do
        char '#'
        char '{'
        a <- number
        char ','
        b <- number
        char ','
        c <- number
        char '}'
        return (Outside (a, b, c))

    number :: Parser Int
    number = do
        x <- many1 digit
        return (read x)

This works as expected. You can check it in GHCi by typing:

    parseTest alot "Some Text #{0,0,0} some Text #{0,0,0}#{0,0,0} more Text #{0,0,0}"

But I think this is not nice.

1) Is it really necessary to use lookAhead for my problem?

2) Is return (Inside "") an ugly hack?

3) Is there a generally more concise / smarter way to achieve the same?

1 answer

1) I think you do need lookAhead, since you need to know whether that parse succeeds without consuming the reference. It would be nice to avoid running that parser twice, for example by having a Parser (Transc, Maybe Transc) that yields an Inside together with an optional Outside following it. If performance is a problem, this is worth doing.
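A minimal sketch of that idea, assuming a slight variation in which the combined parser returns the raw text plus an optional reference, (String, Maybe Transc), instead of (Transc, Maybe Transc). The names chunk and go are hypothetical and not part of the question's code. The reference parser is still attempted at every position, but a successful reference parse is consumed once instead of being looked at and then parsed again.

```haskell
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number :: Parser Int
number = read <$> many1 digit

-- try makes a failed reference attempt consume no input.
reference :: Parser Transc
reference = try $ do
  _ <- string "#{"
  a <- number <* char ','
  b <- number <* char ','
  c <- number <* char '}'
  return (Outside (a, b, c))

-- Hypothetical helper: collect characters until a reference parses or the
-- input ends, and return whatever stopped the text along with the text.
chunk :: Parser (String, Maybe Transc)
chunk = go []
  where
    go acc = ((\r -> (reverse acc, Just r)) <$> reference)
         <|> ((reverse acc, Nothing) <$ eof)
         <|> (anyChar >>= \c -> go (c : acc))

-- Flatten the chunks into the list the question asks for,
-- dropping empty Inside pieces.
alot :: Parser [Transc]
alot = do
  (txt, end) <- chunk
  let inside = [Inside txt | not (null txt)]
  case end of
    Nothing -> return inside
    Just r  -> ((inside ++ [r]) ++) <$> alot
```

In GHCi, parseTest alot "a#{1,2,3}b" prints [Inside "a",Outside (1,2,3),Inside "b"]. Note that this version needs neither lookAhead nor a dummy Inside "" value.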

2) Yes.

3) Applicatives

    number2 :: Parser Int
    number2 = read <$> many1 digit

    text2 :: Parser Transc
    text2 = (Inside .) . (:)
         <$> anyChar
         <*> manyTill anyChar (try (lookAhead reference2) *> pure () <|> eof)

    reference2 :: Parser Transc
    reference2 = ((Outside .) .) . (,,)
              <$> (string "#{" *> number2 <* char ',')
              <*> number2
              <*> (char ',' *> number2 <* char '}')

    transc2 = reference2 <|> text2

    alot2 = many transc2

You can rewrite the beginning of reference2 with a helper, for example aux x y z = Outside (x,y,z).
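As a sketch, reference2 with that helper might look like the following. The types and number2 are repeated only so the snippet loads on its own; the Eq instance is an addition for convenience and is not in the question's code.

```haskell
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number2 :: Parser Int
number2 = read <$> many1 digit

-- Helper replacing the point-free ((Outside .) .) . (,,)
aux :: Int -> Int -> Int -> Transc
aux x y z = Outside (x, y, z)

reference2 :: Parser Transc
reference2 = aux
  <$> (string "#{" *> number2 <* char ',')
  <*> number2
  <*> (char ',' *> number2 <* char '}')
```

The behaviour is the same as before; only the constructor plumbing is easier to read.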

EDIT: Changed text to handle inputs that do not end with an Outside.
