Parsec, how to find "matches" inside a string

How can I use Parsec to parse all matching input in a string and discard the rest?

Example: I have a simple number parser, and I can find all numbers if I know what separates them:

    num :: Parser Int
    num = read <$> many1 digit

    parse (num `sepBy` space) "" "111 4 22"

But what if I don't know what is between the numbers?

 "I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22." 

many anyChar does not work as a delimiter because it consumes everything.

So how can I extract everything that matches an arbitrary parser when it is surrounded by text that I want to ignore?


EDIT: Note that in my real task, the parser is more complex:

    optionTag :: Parser Fragment
    optionTag = do
        string "<option"
        manyTill anyChar (string "value=")
        n <- many1 digit
        manyTill anyChar (char '>')
        chapterPrefix
        text <- many1 (noneOf "<>")
        return $ Option (read n) text
      where
        chapterPrefix = many digit >> char '.' >> many space
4 answers

For an arbitrary parser myParser, this is pretty simple:

 solution = many (let one = myParser <|> (anyChar >> one) in one) 

It might be easier to write it like this:

    solution = many loop
      where loop = myParser <|> (anyChar >> loop)

Essentially, this defines a recursive parser (called loop) that skips one character at a time until it reaches the first thing that myParser can parse. many then repeats that search until it fails, that is, at EOF.
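As a minimal sketch, the recursive search can be made runnable against the example sentence from the question. The names findNext and findAll are hypothetical, and two details are assumptions added here rather than part of the answer: the item parser is wrapped in try so that a partial match does not leave input consumed, and each search is wrapped in try so that trailing text after the last match ends the many cleanly instead of failing after consuming input.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- The number parser from the question, using many1 so it never
-- succeeds on empty input.
num :: Parser Int
num = read <$> many1 digit

-- Skip one character at a time until p matches (hypothetical helper).
findNext :: Parser a -> Parser a
findNext p = try p <|> (anyChar >> findNext p)

-- Collect every match. The outer try lets a search that runs off the
-- end of the input backtrack, so many stops without an error; the
-- remaining unmatched text is then discarded.
findAll :: Parser a -> Parser [a]
findAll p = many (try (findNext p)) <* many anyChar

main :: IO ()
main = print (parse (findAll num) ""
  "I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22.")
-- prints: Right [111,4,22]
```

The backtracking in the final, failing search rescans the trailing text once, which is usually acceptable for short inputs.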


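You can use

    many (noneOf "0123456789")

as the delimiter. Note that noneOf takes a list of characters rather than a parser, so noneOf digit does not typecheck. An equivalent that avoids listing the digits, with isDigit imported from Data.Char, is

    many (satisfy (not . isDigit))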
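The idea of skipping non-digits can be turned into a complete parser by treating any run of non-digit characters as the separator. The sketch below is one way to do that; the names junk and numbers are hypothetical. It only works because the separator can be described as "any character that cannot start a match", which holds for plain numbers but not for a more complex parser such as optionTag.

```haskell
import Data.Char (isDigit)
import Text.Parsec
import Text.Parsec.String (Parser)

num :: Parser Int
num = read <$> many1 digit

-- One or more non-digit characters act as the separator.
junk :: Parser ()
junk = skipMany1 (satisfy (not . isDigit))

-- Optional leading junk, then numbers separated (and optionally
-- followed) by junk.
numbers :: Parser [Int]
numbers = optional junk *> (num `sepEndBy` junk)

main :: IO ()
main = print (parse numbers ""
  "I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22.")
-- prints: Right [111,4,22]
```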

To find an item in a string: either the item is at the beginning of the string, or you consume one character and look for the item in the now-shorter string. If the item does not match at the beginning of the string, the characters consumed during the failed attempt must not stay consumed, so you need a try block.

    hasItem = prefixItem <* (many anyChar)
    prefixItem = (try item) <|> (anyChar >> prefixItem)
    item = <parser for your item here>

This code searches for only one occurrence of item in a string.
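To illustrate why the try matters, here is a small sketch with a concrete item, string "abc", chosen purely for illustration. In Parsec, string consumes input when it matches only a prefix, so on the input "ababc!" the version without try (the hypothetical prefixItemNoTry) fails instead of moving on to the next character.

```haskell
import Data.Either (isLeft)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Illustrative item parser; any parser could be used here.
item :: Parser String
item = string "abc"

prefixItem :: Parser String
prefixItem = try item <|> (anyChar >> prefixItem)

hasItem :: Parser String
hasItem = prefixItem <* many anyChar

-- The same search without try, for comparison only.
prefixItemNoTry :: Parser String
prefixItemNoTry = item <|> (anyChar >> prefixItemNoTry)

main :: IO ()
main = do
  print (parse hasItem "" "ababc!")                   -- Right "abc"
  print (isLeft (parse prefixItemNoTry "" "ababc!"))  -- True
```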

(AJFarmar almost has it.)


The replace-megaparsec package allows you to split a string into sections that match your pattern and sections that do not, using the sepCap combinator.

    import Data.Void (Void)
    import Replace.Megaparsec
    import Text.Megaparsec
    import Text.Megaparsec.Char

    let num :: Parsec Void String Int
        num = read <$> many digitChar

    >>> parseTest (sepCap num) "I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22."
    [Left "I will live to be "
    ,Right 111
    ,Left " years <b>old</b> if I work out "
    ,Right 4
    ,Left " days a week starting at "
    ,Right 22
    ,Left "."
    ]