There are several solutions to this, but none of them is great.
Method 1- Quick to implement, but not so fast to run
Well, according to http://hackage.haskell.org/package/attoparsec-0.10.1.1/docs/Data-Attoparsec-ByteString.html , attoparsec always backtracks on failure, so you can always do something like this -
    parseLine1 = do
      line <- takeTill (== '\n')
      char '\n'
      case <some sort of test on line, ie- a regex> of
        Just _ -> return <some sort of data type>
        Nothing -> fail "Parse Error"
Then chaining several of these line parsers together will work as expected -
parseLine = parseLine1 <|> parseLine2
The problem with this solution is that, as you can see, you are still doing a bunch of backtracking, which can really slow things down.
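To make Method 1 concrete, here is a minimal runnable sketch. The original leaves the line test unspecified, so two simple hand-written checks stand in for the regex; the names Line, lineWith, asComment and asKeyValue are hypothetical and only there for illustration.

```haskell
import Control.Applicative ((<|>))
import Data.Attoparsec.ByteString.Char8 (Parser, char, parseOnly, takeTill)
import qualified Data.ByteString.Char8 as B

-- Hypothetical result type; the original answer does not define one.
data Line = Comment B.ByteString | KeyValue B.ByteString B.ByteString
  deriving (Show, Eq)

-- Read a whole line first, then decide whether it is acceptable.
-- If the test rejects the line, fail so the next alternative can retry it.
lineWith :: (B.ByteString -> Maybe Line) -> Parser Line
lineWith test = do
  line <- takeTill (== '\n')
  _ <- char '\n'
  case test line of
    Just x  -> return x
    Nothing -> fail "Parse Error"

-- Stand-in test: a line starting with '#'.
asComment :: B.ByteString -> Maybe Line
asComment s = case B.uncons s of
  Just ('#', rest) -> Just (Comment rest)
  _                -> Nothing

-- Stand-in test: a line of the form key=value with a non-empty key.
asKeyValue :: B.ByteString -> Maybe Line
asKeyValue s = case B.break (== '=') s of
  (k, rest)
    | not (B.null k) && not (B.null rest) -> Just (KeyValue k (B.drop 1 rest))
  _ -> Nothing

parseLine :: Parser Line
parseLine = lineWith asComment <|> lineWith asKeyValue
```

Running parseOnly parseLine on a key=value line reads the whole line once for asComment, fails, rewinds, and reads it again for asKeyValue. That second read is the cost this method pays.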
Method 2- Traditional Method
The usual way to deal with this type of thing is to rewrite the grammar, or in the case of a parser combinator, to move things around so that the complete algorithm needs only one character of lookahead. This can almost always be done in practice, although it sometimes makes the logic much harder to follow.
For example, suppose you have a grammar production rule like this -
pet = "dog" | "dolphin"
The two alternatives share the prefix "do", so a parser has to look past two characters before it can tell which path to take. Instead, you can left factor the whole thing like this -
pet => "do" halfpet
halfpet => "g" | "lphin"
No backtracking is needed, but the grammar is much uglier. (Although I wrote this as a production rule, there is a one-to-one mapping to a similar parser combinator.)
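As a rough illustration of that mapping, the left-factored rules could be written in attoparsec along these lines. The Pet type and the parser names are assumptions made for this sketch.

```haskell
import Control.Applicative ((<|>))
import Data.Attoparsec.ByteString.Char8 (Parser, parseOnly, string)
import qualified Data.ByteString.Char8 as B

-- Hypothetical result type for the two alternatives.
data Pet = Dog | Dolphin
  deriving (Show, Eq)

-- pet => "do" halfpet
-- The shared prefix is consumed exactly once, before any choice is made.
pet :: Parser Pet
pet = string (B.pack "do") *> halfPet

-- halfpet => "g" | "lphin"
-- The two branches now differ in their first character.
halfPet :: Parser Pet
halfPet = (Dog <$ string (B.pack "g"))
      <|> (Dolphin <$ string (B.pack "lphin"))
```

With this shape, the choice in halfPet is settled by the very next character, so a failed first branch never has to re-read the "do" prefix.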
Method 3- The correct way, but involved to write
The way you really want to do this is to compile the regular expression directly into a parser combinator. Once compiled, any regular language yields an algorithm that always needs only one character of lookahead, so the resulting attoparsec code should be about as simple as the routine in Method 1 that reads a single character. The work lies in compiling the regular expression.
Regular expression compilation is covered in many textbooks, so I won’t go into details here, but it basically comes down to replacing all the ambiguous paths in the regex state machine with new states. Or, to put it another way, it automatically “left factors” all the cases that would otherwise need backtracking.
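To show what the compiled result looks like, here is a state machine for the regex dog|dolphin, derived by hand rather than by a real regex compiler. It is plain Haskell instead of attoparsec, and the state names are made up for this sketch. The point is only that the ambiguous "do" prefix becomes shared states, so each input character is examined exactly once.

```haskell
import Data.List (foldl')

-- States of a hand-built DFA for dog|dolphin (hypothetical names).
-- SeenD and SeenDo are shared by both alternatives.
data State
  = Start | SeenD | SeenDo
  | SeenDol | SeenDolp | SeenDolph | SeenDolphi
  | AcceptDog | AcceptDolphin
  | Dead
  deriving (Show, Eq)

-- One transition per input character; no character is ever revisited.
step :: State -> Char -> State
step Start      'd' = SeenD
step SeenD      'o' = SeenDo
step SeenDo     'g' = AcceptDog
step SeenDo     'l' = SeenDol
step SeenDol    'p' = SeenDolp
step SeenDolp   'h' = SeenDolph
step SeenDolph  'i' = SeenDolphi
step SeenDolphi 'n' = AcceptDolphin
step _          _   = Dead

data Pet = Dog | Dolphin
  deriving (Show, Eq)

-- Run the machine over the whole input and inspect the final state.
matchPet :: String -> Maybe Pet
matchPet input = case foldl' step Start input of
  AcceptDog     -> Just Dog
  AcceptDolphin -> Just Dolphin
  _             -> Nothing
```

A real compiler would generate the transition table from the regex instead of having it written out by hand, but the generated machine would have the same one-character-at-a-time shape.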
(I wrote a library that automatically “left factors” many cases in context-free grammars, turning almost any context-free grammar into a linear parser, but I haven't made it available yet. Some day, when I have cleaned it up, I will.)