Haskell Parsec - error messages are less useful when using custom tokens

I am working on a parser that separates lexing from parsing. After some tests, I realized that the error messages are less useful when I use tokens other than Parsec's Char tokens.

Here are some examples of Parsec error messages when using Char tokens:

    ghci> P.parseTest (string "asdf" >> spaces >> string "ok") "asdf wrong"
    parse error at (line 1, column 7):
    unexpected "w"
    expecting space or "ok"

    ghci> P.parseTest (choice [string "ok", string "nop"]) "wrong"
    parse error at (line 1, column 1):
    unexpected "w"
    expecting "ok" or "nop"

So the string parser shows which string was expected when it finds an unexpected one, and the choice parser lists the alternatives.

But when I use the same combinators with my tokens:

    ghci> Parser.parseTest ((tok $ Ide "asdf") >> (tok $ Ide "ok")) "asdf "
    parse error at "test" (line 1, column 1):
    unexpected end of input

In this case, it does not print what was expected.

    ghci> Parser.parseTest (choice [tok $ Ide "ok", tok $ Ide "nop"]) "asdf "
    parse error at (line 1, column 1):
    unexpected (Ide "asdf","test" (line 1, column 1))

And when I use choice, it does not print the alternatives.

I expected this behavior to come from the combinators, not from the tokens, but it looks like I was wrong. How can I fix this?

Here is the full lexer + parser code:

Lexer:

    module Lexer
        ( Token(..)
        , TokenPos(..)
        , tokenize
        ) where

    import Text.ParserCombinators.Parsec hiding (token, tokens)
    import Control.Applicative ((<*), (*>), (<$>), (<*>))

    data Token = Ide String
               | Number String
               | Bool String
               | LBrack
               | RBrack
               | LBrace
               | RBrace
               | Keyword String
        deriving (Show, Eq)

    type TokenPos = (Token, SourcePos)

    ide :: Parser TokenPos
    ide = do
        pos <- getPosition
        fc  <- oneOf firstChar
        r   <- optionMaybe (many $ oneOf rest)
        spaces
        return $ flip (,) pos $ case r of
                     Nothing -> Ide [fc]
                     Just s  -> Ide $ [fc] ++ s
      where firstChar = ['A'..'Z'] ++ ['a'..'z'] ++ "_"
            rest      = firstChar ++ ['0'..'9']

    parsePos p = (,) <$> p <*> getPosition

    lbrack = parsePos $ char '[' >> return LBrack
    rbrack = parsePos $ char ']' >> return RBrack
    lbrace = parsePos $ char '{' >> return LBrace
    rbrace = parsePos $ char '}' >> return RBrace

    token = choice
        [ ide
        , lbrack
        , rbrack
        , lbrace
        , rbrace
        ]

    tokens = spaces *> many (token <* spaces)

    tokenize :: SourceName -> String -> Either ParseError [TokenPos]
    tokenize = runParser tokens ()

Parser:

    module Parser where

    import Text.Parsec as P
    import Control.Monad.Identity

    import Lexer

    parseTest :: Show a => Parsec [TokenPos] () a -> String -> IO ()
    parseTest p s =
        case tokenize "test" s of
            Left e    -> putStrLn $ show e
            Right ts' -> P.parseTest p ts'

    tok :: Token -> ParsecT [TokenPos] () Identity Token
    tok t = token show snd test
      where
        test (t', _) = case t == t' of
                           False -> Nothing
                           True  -> Just t

SOLUTION:

OK, after fp4me's answer and after reading Parsec's Char source more carefully, I ended up with this:

    {-# LANGUAGE FlexibleContexts #-}
    module Parser where

    import Text.Parsec as P
    import Control.Monad.Identity

    import Lexer

    parseTest :: Show a => Parsec [TokenPos] () a -> String -> IO ()
    parseTest p s =
        case tokenize "test" s of
            Left e    -> putStrLn $ show e
            Right ts' -> P.parseTest p ts'

    type Parser a = Parsec [TokenPos] () a

    -- The next source position is the position stored with the next token.
    advance :: SourcePos -> t -> [TokenPos] -> SourcePos
    advance _ _ ((_, pos) : _) = pos
    advance pos _ []           = pos

    satisfy :: (TokenPos -> Bool) -> Parser Token
    satisfy f = tokenPrim show
                          advance
                          (\c -> if f c then Just (fst c) else Nothing)

    -- The <?> label is what produces the "expecting ..." part of the message.
    tok :: Token -> ParsecT [TokenPos] () Identity Token
    tok t = (Parser.satisfy $ (== t) . fst) <?> show t

Now I get the same kind of error messages:

    ghci> Parser.parseTest (choice [tok $ Ide "ok", tok $ Ide "nop"]) "asdf"
    parse error at (line 1, column 1):
    unexpected (Ide "asdf","test" (line 1, column 3))
    expecting Ide "ok" or Ide "nop"
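The effect of the `<?>` label can be checked in isolation with a small sketch. It does not use the Lexer module above: the token stream is built by hand with a hypothetical helper `mkTokens`, and `tokPlain`, `tokLabelled` and `expectedLabels` are names invented for this example. The assumption being illustrated is that a primitive built directly on `tokenPrim` records only the "unexpected" part of an error, and that the "expecting" part appears once a label is attached.

```haskell
import Text.Parsec
import Text.Parsec.Error (Message (..), errorMessages)
import Text.Parsec.Pos (newPos)

data Token = Ide String | LBrack | RBrack
  deriving (Show, Eq)

type TokenPos = (Token, SourcePos)
type TokParser a = Parsec [TokenPos] () a

-- Hypothetical helper: put every token on line 1, one column per token.
mkTokens :: [Token] -> [TokenPos]
mkTokens ts = [(t, newPos "demo" 1 col) | (t, col) <- zip ts [1 ..]]

-- The next position is the one stored with the next token, if there is one.
advance :: SourcePos -> TokenPos -> [TokenPos] -> SourcePos
advance _ _ ((_, pos) : _) = pos
advance pos _ []           = pos

-- Unlabelled primitive: a failure carries no "expecting" information.
tokPlain :: Token -> TokParser Token
tokPlain t = tokenPrim show advance test
  where
    test (t', _) = if t == t' then Just t' else Nothing

-- Same primitive with a label attached via <?>.
tokLabelled :: Token -> TokParser Token
tokLabelled t = tokPlain t <?> show t

-- Collect the "expecting" entries recorded in a ParseError.
expectedLabels :: ParseError -> [String]
expectedLabels e = [s | Expect s <- errorMessages e]

plainResult, labelledResult :: Either ParseError Token
plainResult    = parse (choice (map tokPlain    [Ide "ok", Ide "nop"])) "demo" (mkTokens [Ide "asdf"])
labelledResult = parse (choice (map tokLabelled [Ide "ok", Ide "nop"])) "demo" (mkTokens [Ide "asdf"])

main :: IO ()
main = do
  print (either expectedLabels (const []) plainResult)
  print (either expectedLabels (const []) labelledResult)
```

Running `main` should print an empty list for the unlabelled parsers and a list containing both labels for the labelled ones, which is the difference seen in the error messages above.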

1 answer

A starting point could be to define your own choice function in the Parser module, use a custom unexpected function to override the "unexpected" part of the error, and finally use the <?> operator to override the "expecting" message:

    mychoice [] = mzero
    mychoice (x:[]) = (tok x <|> myUnexpected) <?> show x
    mychoice (x:xs) = ((tok x <|> mychoice xs) <|> myUnexpected) <?> show (x:xs)

    myUnexpected = do
        input <- getInput
        unexpected $ (id $ first input)
      where
        first []     = "eof"
        first (x:xs) = show $ fst x

and call your parser as follows:

    ghci> Parser.parseTest (mychoice [Ide "ok", Ide "nop"]) "asdf "
    parse error at (line 1, column 1):
    unexpected Ide "asdf"
    expecting [Ide "ok",Ide "nop"]
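As a possible alternative to a separate `myUnexpected` parser, the text of the "unexpected" part can also be shaped through the first argument of `tokenPrim`, which is the function Parsec uses to render the offending token. The sketch below is self-contained and is not the answer's code: `tokShort`, `unexpectedParts` and `demo` are hypothetical names, and the one-element token stream is written by hand. It passes `show . fst` so that only the token, without its source position, is rendered.

```haskell
import Text.Parsec
import Text.Parsec.Error (Message (..), errorMessages)
import Text.Parsec.Pos (newPos)

data Token = Ide String
  deriving (Show, Eq)

type TokenPos = (Token, SourcePos)

-- Renders the offending token with (show . fst), dropping the SourcePos,
-- and labels the parser so the "expecting" part is filled in as well.
tokShort :: Token -> Parsec [TokenPos] () Token
tokShort t = tokenPrim (show . fst) advance test <?> show t
  where
    advance _ _ ((_, pos) : _) = pos
    advance pos _ []           = pos
    test (t', _)
      | t == t'   = Just t'
      | otherwise = Nothing

-- Collect the "unexpected" entries that Parsec generated itself.
unexpectedParts :: ParseError -> [String]
unexpectedParts e = [s | SysUnExpect s <- errorMessages e]

demo :: Either ParseError Token
demo = parse (choice (map tokShort [Ide "ok", Ide "nop"]))
             "demo"
             [(Ide "asdf", newPos "demo" 1 1)]

main :: IO ()
main = print (either unexpectedParts (const []) demo)
```

With this variant the plain `choice` combinator can be kept, since each alternative carries its own label; whether the combined list produced by `mychoice` or the "a or b" wording produced by `choice` reads better is a matter of taste.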

Source: https://habr.com/ru/post/924004/

