Saving the state in a world of statelessness

Question

Saving the state in a world of statelessness

I convert context-free grammar to normal Greibach form (GNF). The main transformation (from Hopcroft and Ullman) is a sequence of iterations over indexed grammar variables. This is essentially "stateless." I implemented it as a sequence of folds over the corresponding indices (the implementation is pretty simple):

gnf :: Ord a => Set (Rule a) -> Set (Rule a) gnf rl = foldl step1 rl [1..maxIndex rl] where step1 rl' k = foldl step2 rl' [1..k - 1] where step2 rl'' j = noLR k (subst rl'' kj)

maxIndex rl returns the maximum index of the variable in the rule set; subst rl kj performs substitution on k-indexed rules with rules whose right-hand side begins with the variable j-index. After gnf runs, I need to complete the final grammar pass in reverse order.

The noLR problem that converts a grammar with left recursive k-indexed rules. This is a "stateful" function because a unique variable must be generated for each rule (or k-indexed rule) to which noLR applies. So I wrote a stateful function

 noLR :: Ord a => Int -> Set (Rule a) -> State [Sym a] (Set (Rule a)) noLR rl = do (n:ns) <- get; put ns; let rl' = ... remove left recursion rl n ... in return rl'

I can combine noLR sequentially to update n, which noLR takes as a parameter. However, I'm not sure how to perform noLR inside step2 in the above function. I can't seem to use let ... in the schema, because stateful computation is built into several recursive functions.

What I want to do is n be some global variable, similar to explicit streaming n, which I can call and update inside step2, so I originally wrote the function as a fold with eta extension (for n). Does anyone know how I can structure gnf within a state monad to achieve this effect? Except for the last calculation in the summary, nothing else is “in terms of state”, and it’s more convenient for me to use the state monad with “trivial” examples. I am lost.

+7

haskell monads state context-free-grammar

danportin Nov 23 '10 at 19:09

source share

2 answers

I would try rewriting noLR to be clean. Are you sure you cannot rewrite it to create a character that depends only on the name of the rule and its index (or something similar)?

 noLR kj = noLR' kj $ newSymbol kj where newSymbol kj = ... -- some concatenation of k and j noLR' kj sym = ... -- your now pure function

+3

luispedro Nov 23 '10 at 20:51

source share

wolfgang · Accepted Answer · 2010-11-23T22:48:59+0000

To use noLR with the type you specified, you will have to rewrite the gnf function along the following lines:

 gnf :: Ord a => Set (Rule a) -> Set (Rule a) gnf rl = evalState (foldM step1 rl [1..maxIndex rl]) ( ... generate the initial state [Sym a] here ...) where step1 rl' k = foldM step2 rl' [1..k - 1] where step2 rl'' j = noLR k (subst rl'' kj)

Your state variable exists during the whole calculation, and this fact must be explicitly indicated in the code.

If all you need is that the newly created variable names do not collide with each other, you can make noLR clean by creating a new symbol name from the indices k and j - something like "foo_42_16" for k = 42 and j == 16. If the input grammar already contains this type of character name, you might be having problems.

If you want your characters to be unique in grammar, then why not just say that?

 newSymbol :: Set (Rule a) -> Sym a newSymbol rl = ... find a symbol name not used in rl ...

This is definitely inefficient unless you replace Set (Rule a) with another type that allows the newSymbol operation to be performed more efficiently.

Saving the state in a world of statelessness

More articles: