Parsing many blocks with foldLine

For this simplified task, I am trying to parse an input that looks like

foo bar
 baz quux 
 woo
hoo xyzzy 
  glulx

at

[["foo", "bar", "baz", "quux", "woo"], ["hoo", "xyzzy", "glulx"]]

The code I tried looks like this:

import qualified Text.Megaparsec.Lexer as L
import Text.Megaparsec hiding (space)
import Text.Megaparsec.Char hiding (space)
import Text.Megaparsec.String
import Control.Monad (void)
import Control.Applicative

space :: Parser ()
space = L.space (void spaceChar) empty empty

item :: Parser () -> Parser String
item sp = L.lexeme sp $ some letterChar

items :: Parser () -> Parser [String]
items sp = L.lineFold sp $ \sp' -> some (item sp')

items_ :: Parser [String]
items_ = items space

This works for a single block items:

λ» parseTest items_ "foo bar\n baz quux\n woo"
["foo","bar","baz","quux","woo"]

But as soon as I try to parse many items, it fails on the first inappropriate line:

λ» parseTest (many items_) "foo bar\n baz quux\n woo\nhoo xyzzy\n  glulx"
4:1:
incorrect indentation (got 1, should be greater than 1)

or with even simpler input:

λ» parseTest (many items_) "a\nb"
2:1:
incorrect indentation (got 1, should be greater than 1)
+3
source share
1 answer

Megaparsec :-) , , Megaparsec - , lexer "" . , , "". sp' , , , , , .

:

, . , . - , - . , .

sc = L.space (void spaceChar) empty empty

myFold = L.lineFold sc $ \sc' -> do
  L.symbol sc' "foo"
  L.symbol sc' "bar"
  L.symbol sc  "baz" -- for the last symbol we use normal space consumer

, , , , . , . "" :

space :: Parser ()
space = L.space (void spaceChar) empty empty

item :: Parser String
item = some letterChar

items :: Parser () -> Parser [String]
items sp = L.lineFold sp $ \sp' ->
  item `sepBy1` try sp' <* sp

items_ :: Parser [String]
items_ = items space

item `sepBy1` try sp' , , sp , .

λ> parseTest items_ "foo bar\n baz quux\n woo"
["foo","bar","baz","quux","woo"]
λ> parseTest (many items_) "foo bar\n baz quux\n woo\nhoo xyzzy\n  glulx"
[["foo","bar","baz","quux","woo"],["hoo","xyzzy","glulx"]]
λ> parseTest (many items_) "foo bar\n baz quux\n woo\nhoo\nxyzzy\n  glulx"
[["foo","bar","baz","quux","woo"],["hoo"],["xyzzy","glulx"]]
+4

All Articles