How to implement a Read instance with the actual parsing already completed, in Haskell?

Question

In this answer, it is stated that implementing a Read instance for the Tree data type would not be very difficult once the actual parsing is done.

However, I have a hard time understanding how a function like read works. AFAIK, I have to implement the readsPrec function rather than read, and this readsPrec should handle strings consisting of just one character. Is that right?
If so, how should the Read instance for Tree be implemented when the parsing is done through ParsecT? Can we break the parsing down word by word, or is there any need to do so?

I did not think of read as a complicated function in Haskell, but now I find it rather puzzling and confusing, and I get lost searching Hoogle for unfamiliar things such as readP_to_S, readS, etc.

Any help or link is appreciated.

awllower Aug 7 '16 at 15:59

1 answer

Elias Riedel Gårding · Accepted Answer · 2016-08-07T18:31:15+0000

I have been wondering about this for some time, and your question prompted me to look into it.

Summary: The easiest way to do this manually:

    instance Read Tree where
        readsPrec _ str = [(parsecRead str,"")]

But deriving is the safer option; the instance above does not work with [Tree] and other data types built from Tree. My understanding is that Show and Read are not meant to be implemented by hand; they are meant to be derived and to work on syntactically valid Haskell expressions.

It seems that the reason Read is not as simple as

    class Read a where
        read :: String -> a

is that there is a system of parser combinators behind it, similar to but distinct from Parsec, designed to be modular, recursive, and so on. But since we are already using a different parser combinator library, Parsec, I think it is best to interact with that other system as little as possible.

The Prelude documentation states that a minimal complete definition of Read is either readsPrec or readPrec. The latter is described as "Proposed replacement for readsPrec using new-style parsers (GHC only)." This smells like trouble to me, so let's go with readsPrec.

The type is

    readsPrec :: Read a => Int -> ReadS a
    type ReadS a = String -> [(a,String)]

and the documentation for ReadS reads "A parser for a type a, represented as a function that takes a String and returns a list of possible parses as (a,String) pairs." To me, it is not entirely clear what a "parse" is, but looking at the source for read in Text.Read is revealing:

    read :: Read a => String -> a
    read s = either errorWithoutStackTrace id (readEither s)

    readEither :: Read a => String -> Either String a
    readEither s =
      -- minPrec is defined as 0 in Text.ParserCombinators.ReadPrec
      case [ x | (x,"") <- readPrec_to_S read' minPrec s ] of
        [x] -> Right x
        []  -> Left "Prelude.read: no parse"
        _   -> Left "Prelude.read: ambiguous parse"
     where
      read' = -- read' :: P.ReadPrec a
        do x <- readPrec
           lift P.skipSpaces -- P is Text.ParserCombinators.ReadP
           return x

I tried to expand the definitions of readPrec_to_S etc., but I felt it was not worth it. I think the definition makes it clear that we should return [(x,"")] for a successful parse: read only accepts results whose remaining input is empty.
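As a small illustration of what a "parse" is, the sketch below uses reads and readMaybe from base on Int. The names intParses, complete, padded, and leftover are made up for this example. Each parse is a pair of the value and the unconsumed remainder, and read only accepts a parse whose remainder is empty once trailing whitespace has been skipped.

```haskell
import Text.Read (readMaybe)

-- Each element of the result is (parsed value, unconsumed rest of the input).
intParses :: String -> [(Int, String)]
intParses = reads

-- intParses "42 rest"  ==  [(42," rest")]   one parse, " rest" left over
-- intParses "hello"    ==  []               no parse at all

-- readMaybe is a total variant of read and applies the same
-- "remaining input must be empty" filter shown in readEither.
complete, padded, leftover :: Maybe Int
complete = readMaybe "42"       -- Just 42
padded   = readMaybe "42   "    -- Just 42: trailing spaces are skipped
leftover = readMaybe "42 rest"  -- Nothing: "rest" is left unconsumed
```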

The integer argument to readsPrec appears to be the "precedence context". My guess is that it is safe to ignore it if we only want to parse one tree at a time, but that ignoring it will cause problems later if we try to parse values of type [Tree], for example. I will ignore it because I do not think it is worth the trouble.
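For reference, this is roughly how the precedence argument tends to be used in a hand-written readsPrec that follows the pattern of derived instances. Wrap is a hypothetical type invented for this sketch, not something from the question. The number 10 is the precedence of function application, so parentheses become mandatory only when the surrounding context binds more tightly than that.

```haskell
import Text.Read (readMaybe)

-- Hypothetical example type, not part of the original question.
newtype Wrap = Wrap Int deriving (Eq, Show)

instance Read Wrap where
  -- d is the precedence of the enclosing context.
  -- readParen (d > 10) demands parentheses when d > 10 and merely
  -- allows them otherwise.
  readsPrec d = readParen (d > 10) $ \s ->
    [ (Wrap n, rest)
    | ("Wrap", afterName) <- lex s          -- constructor name as one token
    , (n, rest) <- readsPrec 11 afterName   -- argument read at precedence 11
    ]

nested :: Maybe (Maybe Wrap)
nested = readMaybe "Just (Wrap 3)"  -- Just (Just (Wrap 3))

bare :: Maybe (Maybe Wrap)
bare = readMaybe "Just Wrap 3"      -- Nothing: parentheses are required here
```

An instance written this way returns the unconsumed input rather than "", which is what lets the list and Maybe instances compose with it.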

In short, if we have parsecRead :: String -> Tree, defined as in the post you refer to (where the author called it read'), the instance is

    instance Read Tree where
        readsPrec _ str = [(parsecRead str,"")]

If we check how this works in a program (using the Show instance that the original asker provided):

    main = do
        print (read "ABC(DE)F" == example)
        print ([read "ABC(DE)F", read "ABC(DE)F"] :: [Tree])
        print (read "[ABC(DE)F,ABC(DE)F]" :: [Tree])

we get

    True
    [ABC(DE)F,ABC(DE)F]
    Test.hs: Prelude.read: no parse
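One possible way to make the list case work, which the instance above does not attempt, is to hand back the input that Parsec left unconsumed instead of always returning "", and to return [] when the parse fails. The sketch below is only an assumption about how that could look: the original Tree type and parser are not shown here, so Tree, treeP, and the parenthesised syntax are hypothetical stand-ins. It needs the parsec package.

```haskell
import Text.Parsec (Parsec, between, char, getInput, many, parse, spaces, upper, (<|>))

-- Hypothetical tree type: a leaf is an uppercase letter, a node is a
-- parenthesised sequence of subtrees, e.g. "(A(B)C)".
data Tree = Leaf Char | Node [Tree] deriving Eq

instance Show Tree where
  show (Leaf c)  = [c]
  show (Node ts) = "(" ++ concatMap show ts ++ ")"

-- Hypothetical Parsec parser standing in for the one from the linked post.
treeP :: Parsec String () Tree
treeP = Leaf <$> upper
    <|> Node <$> between (char '(') (char ')') (many treeP)

instance Read Tree where
  -- The precedence argument is still ignored. The differences from the
  -- simple instance are that the leftover input comes from getInput and
  -- that a Parsec failure becomes the empty list of parses.
  readsPrec _ str =
    case parse ((,) <$> (spaces *> treeP) <*> getInput) "" str of
      Left _       -> []
      Right result -> [result]
```

With the leftover input reported, the default list parser can pick up at the comma or closing bracket after each tree, so under these assumptions read "[(A(B)C),D]" :: [Tree] should succeed.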

The complexity and lack of documentation here really make me think that deriving (Read) is the only safe option unless you are willing to dive into the details of precedence levels. I think I read somewhere that Show and Read are mainly intended to be derived, and that the strings are meant to be syntactically correct Haskell expressions (please correct me if I am wrong). For more general parsing, a library such as Parsec is probably the better option.
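To show what the derived route looks like, here is a minimal sketch with a hypothetical Tree type (the original definition is not reproduced here). The derived instances use Haskell expression syntax, and lists and nesting work without any extra effort.

```haskell
-- Hypothetical tree type with derived instances.
data Tree = Leaf Char | Node [Tree]
  deriving (Eq, Show, Read)

example :: Tree
example = Node [Leaf 'A', Node [Leaf 'B'], Leaf 'C']

-- The derived Show produces a Haskell expression:
-- show (Node [Leaf 'A',Leaf 'B'])  ==  "Node [Leaf 'A',Leaf 'B']"

-- The derived Read accepts that syntax back, including inside lists.
roundTrip :: Bool
roundTrip = read (show example) == example

treeList :: [Tree]
treeList = read "[Node [Leaf 'A'],Leaf 'B']"
```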

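If you have the energy to look into the source code yourself, the relevant modules include Text.Read, Text.ParserCombinators.ReadPrec, and Text.ParserCombinators.ReadP, all of which appear in the excerpt above.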
