Monad for creating test data

OK, so I'm trying to write a monad for building test data, but I can't get it to work the way I want it to. It looks something like this:

    runBuildM :: [i] -> BuildM i o x -> [o]
    -- Given a list of i, build a list of o.

    source :: BuildM i o i
    -- Fetch unique i.

    yield :: o -> BuildM i o ()
    -- Return a new o to the caller.

    gather :: BuildM i o x -> BuildM i o o
    -- Fetch every possible o from sub-computation.

    local :: BuildM i o x -> BuildM i o x
    -- Isolate any source invocations from the rest of the code.

In other words, it combines a supply monad, a writer monad, and a list monad. The idea is that I can write something like this:

    build_tests depth = do
      local $ do
        v <- source
        yield v
        yield (map toLower v)
      yield "[]"
      yield "()"
      when (depth > 2) $ do
        t1 <- gather $ build_tests (depth-1)
        yield $ "(" ++ t1 ++ ")"
        yield $ "[" ++ t1 ++ "]"
        t2 <- gather $ build_tests (depth-1)
        yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"

The idea is to generate all possible combinations of data. You can do this simply by using lists, but the result ends up syntactically horrible. This is more readable. Unfortunately, this does not actually work ...
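For comparison, a plain-list version could look roughly like the sketch below. It is only an illustration under stated assumptions: `buildTests` and `render` are hypothetical names, every result is paired with the names it has not yet consumed, and running out of names simply produces no terms.

```haskell
import Data.Char (toLower)

-- Hypothetical plain-list version: each result carries the names it has
-- not consumed, so later sub-terms draw fresh names.
buildTests :: Int -> [String] -> [(String, [String])]
buildTests _ [] = []  -- out of names: no terms (an assumption of this sketch)
buildTests depth supply@(v : rest) = leaves ++ compounds
  where
    -- The variable leaves consume v; the literal leaves consume nothing.
    leaves =
      [(v, rest), (map toLower v, rest), ("[]", supply), ("()", supply)]
    compounds
      | depth > 2 =
          concat
            [ [("(" ++ t1 ++ ")", s1), ("[" ++ t1 ++ "]", s1)]
                ++ [ ("(" ++ t1 ++ "," ++ t2 ++ ")", s2)
                   | (t2, s2) <- buildTests (depth - 1) s1
                   ]
            | (t1, s1) <- buildTests (depth - 1) supply
            ]
      | otherwise = []

-- Drop the leftover supplies and keep only the rendered terms.
render :: Int -> [String] -> [String]
render depth = map fst . buildTests depth
```

The explicit `(term, remainingSupply)` pairs are exactly the bookkeeping that the monadic version is meant to hide.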

The problem seems to boil down to the fact that the local function is not working properly. The intent is that any calls to source in the sub-computation have no effect outside of it. (That is, subsequent calls to source from outside the local block get the first token again.) However, what my implementation of local actually does is reset the next token for everything (that is, including the contents of the sub-computation). This is clearly wrong, but I can't for the life of me bend my mind around how to make it work correctly.

The fact that I'm having so much trouble getting the code to work as required probably means that the actual internal representation of my monad is simply wrong. Can anyone take a stab at implementing it properly?


EDIT: Perhaps I should have realized this, but I never actually specified the expected result that I am trying to get. The above code should produce the following:

 ["A", "a", "[]", "()", "(A)", "(a)", "[A]", "[a]", "(A, B)", "(A, b)", "(a, B)", "(a, b)"] 

It is not critical that the results appear in exactly that order. I would like the single cases to appear before the compound ones, but I'm not too fussed about the order in which the compounds appear. The rule is that the same variable never appears twice in any single expression.

If the depth is deeper, we will also get terms such as

 "((A))", "([A])", "[(A)]", "((A, B), C)", "(A, (B, C))" 

etc.
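The "no variable twice" rule can be written down as a small predicate. This is only a sketch under two assumptions that the lists above suggest but do not state outright: variables are single letters, and "A" and "a" count as the same variable. The name `noRepeatedVariable` is made up for the illustration.

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (nub)

-- Hypothetical check: True when no letter occurs twice in the term,
-- treating upper and lower case as the same variable.
noRepeatedVariable :: String -> Bool
noRepeatedVariable term = names == nub names
  where
    names = map toLower (filter isAlpha term)
```

A predicate like this could be used to filter or to test the output of any of the generators discussed here.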


It is clearly broken, but here's what I have so far:

    newtype BuildM i o x = BuildM ([i] -> SEQ.Seq ([i], SEQ.Seq o, x))

    instance Functor (BuildM i o) where
      fmap uf (BuildM sf) =
        BuildM $ \ is0 -> do
          (is1, os, x) <- sf is0
          return (is1, os, uf x)

    instance Applicative (BuildM i o) where
      pure x = BuildM $ \ is0 -> return (is0, SEQ.empty, x)

      BuildM sf1 <*> BuildM sf2 =
        BuildM $ \ is1 -> do
          (is2, os2, f) <- sf1 is1
          (is3, os3, x) <- sf2 is2
          return (is3, os2 >< os3, f x)

    instance Monad (BuildM i o) where
      return = pure

      BuildM sf1 >>= uf =
        BuildM $ \ is1 -> do
          (is2, os2, x) <- sf1 is1
          let BuildM sf2 = uf x
          (is3, os3, y) <- sf2 is2
          return (is3, os2 >< os3, y)

    runBuildM :: [i] -> BuildM i o x -> [o]
    runBuildM is0 (BuildM sf) =
      toList $ do
        (is, os, x) <- sf is0
        os

    source :: BuildM i o i
    source =
      BuildM $ \ is ->
        if null is
          then error "AHC.Tests.TestBuilder.source: end of input"
          else return (tail is, SEQ.empty, head is)

    yield :: o -> BuildM i o ()
    yield o = BuildM $ \ is -> return (is, SEQ.singleton o, () )

    gather :: BuildM i o x -> BuildM i o' o
    gather (BuildM sf1) =
      BuildM $ \ is1 -> do
        (is2, os2, _) <- sf1 is1
        o <- os2
        return (is2, SEQ.empty, o)

    local :: BuildM i o x -> BuildM i o ()
    local (BuildM sf1) =
      BuildM $ \ is1 ->
        let os = do (is2, os2, x) <- sf1 is1; os2
        in  return (is1, os, () )
3 answers

You are trying to reinvent pipes and some nice syntax for building lists. The problem is much simpler than the way you have described it. The source of strings can be kept completely separate from building the structures.

You want to build structures that draw symbols from some source. Without worrying about the source, let's build the structures. Each structure is a Pipe that draws from some source of strings and yields strings that are concatenated together to build the expression.

    import Data.Char

    import Data.Functor.Identity

    import Pipes.Core
    import Pipes ((>->))
    import qualified Pipes as P
    import qualified Pipes.Prelude as P

    build_structures :: Int -> [Pipe String String Identity ()]
    build_structures depth = gather $ do
        yield $ P.take 1
        yield $ P.map (map toLower) >-> P.take 1
        when (depth > 2) $ do
            t1 <- lift $ build_structures (depth - 1)
            yield $ P.yield "(" >> t1 >> P.yield ")"
            yield $ P.yield "[" >> t1 >> P.yield "]"
            t2 <- lift $ build_structures (depth - 1)
            yield $ P.yield "(" >> t1 >> P.yield "," >> t2 >> P.yield ")"

This code uses the ContT trick (its yield and gather) from the continuation answer.

We run one of these structures by feeding it the symbols and concatenating the results.

    run :: Pipe String String Identity () -> String
    run p = concat . P.toList $ P.each symbols >-> p

    -- an infinite source of unique symbols
    symbols :: [String]
    symbols = drop 1 symbols'
        where
            symbols' = [""] ++ do
                tail <- symbols'
                first <- ['A'..'Z']
                return (first : tail)
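The symbol source can be explored on its own, without pipes. The sketch below restates the same definition as a list comprehension; the local names `names` and `rest` are renamed for the illustration (the original binds a variable called `tail`, which shadows the Prelude function).

```haskell
-- An infinite supply of distinct names: "A".."Z", then "AA", "BA", ...
symbols :: [String]
symbols = drop 1 names
  where
    -- Start from the empty name and keep prepending one more letter to
    -- every name produced so far; laziness makes this productive.
    names = "" : [first : rest | rest <- names, first <- ['A' .. 'Z']]
```

Because the letter is prepended, the two-letter names run "AA", "BA", "CA" and so on rather than "AA", "AB", "AC". That does not matter here, since the only requirement is that the names are unique.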

The examples produce the desired strings. I will leave the two special cases "[]" and "()", which do not appear in the recursive terms, as an exercise.

    import Data.Functor

    main = do
        putStrLn "Depth 2"
        print $ run <$> build_structures 2
        putStrLn "Depth 3"
        print $ run <$> build_structures 3
        putStrLn "Depth 4"
        print $ run <$> build_structures 4

The result is

    Depth 2
    ["A","a"]
    Depth 3
    ["A","a","(A)","[A]","(A,B)","(A,b)","(a)","[a]","(a,B)","(a,b)"]
    Depth 4
    ["A","a","(A)","[A]","(A,B)","(A,b)","(A,(B))","(A,[B])","(A,(B,C))","(A,(B,c))","(A,(b))","(A,[b])","(A,(b,C))","(A,(b,c))","(a)","[a]","(a,B)","(a,b)","(a,(B))","(a,[B])","(a,(B,C))","(a,(B,c))","(a,(b))","(a,[b])",...

You are trying to reinvent pipes. Your source and yield are pipes' await and yield. The other two concerns you are trying to handle are a ReaderT and a WriterT respectively. If you put the entire list of inputs into the ReaderT's environment, you can run local sub-computations that start over at the beginning of the list. You can collect all of the results from a sub-computation by adding a WriterT layer to collect the output.

For the nice syntax with gather, you are trying to recreate ListT.
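Before bringing in pipes, the reader/writer idea can be sketched with nothing but the transformers package. This is not the implementation developed below, only a minimal illustration under stated assumptions: the stack `Build`, the runner `runBuild`, and the name `isolated` (standing in for local) are hypothetical, the unread input lives in a StateT layer, and there is no gather.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, get, put)
import Control.Monad.Trans.Writer.Strict (Writer, execWriter, tell)

-- Hypothetical stack: the full input is the reader environment, the
-- unread remainder is the state, and yielded outputs go to the writer.
type Build i o = ReaderT [i] (StateT [i] (Writer [o]))

-- Take the next unread input.
source :: Build i o i
source = do
  remaining <- lift get
  case remaining of
    [] -> error "source: end of input"
    (x : xs) -> lift (put xs) >> return x

-- Record one output.
yield :: o -> Build i o ()
yield o = lift (lift (tell [o]))

-- Run a sub-computation from the start of the input, then restore the
-- caller's position so its reads are invisible outside.
isolated :: Build i o a -> Build i o a
isolated sub = do
  saved <- lift get
  everything <- ask
  lift (put everything)
  result <- sub
  lift (put saved)
  return result

runBuild :: [i] -> Build i o a -> [o]
runBuild input b = execWriter (evalStateT (runReaderT b input) input)
```

With the input ["A", "B"], a call to source inside `isolated` followed by a call to source outside it both read "A", which is the isolation behaviour the question asks for.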

Pipes, Readers, and Writers

We will use all of the following in a very short order.

    import Data.Functor.Identity
    import Data.Foldable

    import Control.Monad
    import Control.Monad.Morph
    import Control.Monad.IO.Class
    import Control.Monad.Trans.Class
    import Control.Monad.Trans.Reader hiding (local)
    import Control.Monad.Trans.Writer.Strict

    import Pipes.Core
    import Pipes ((>->))
    import qualified Pipes as P
    import qualified Pipes.Prelude as P
    import Pipes.Lift (runWriterP, runReaderP)

Your builder is a Pipe i o on top of a Reader [i], which allows you to reset to the beginning of the input. We will define two versions of it: BuildT, which is a monad transformer, and BuildM, which is a monad. BuildM is just the transformer applied to Identity.

    type BuildT e i o m r = Pipe i o (ReaderT e m) r
    type BuildM e i o   r = BuildT e i o Identity r

local runs the builder, feeding it the entire input read from the environment. We might want to give it a different name to avoid a conflict with the local defined for ReaderT.

    local :: (Monad m, Foldable f) => BuildT (f i) i o m () -> Proxy a' a () o (ReaderT (f i) m) ()
    local subDef = do
        e <- lift ask
        hoist lift $ runReaderP e $ P.each e >-> subDef

To collect the results of sub-computations, we take advantage of the fact that pipes are so pure that you can swap out the underlying monad, provided you have a natural transformation forall x. m x -> n x. The Proxy type from pipes has an MFunctor instance, which provides the function hoist :: (forall x. m x -> n x) -> Proxy a' a b' b m r -> Proxy a' a b' b n r; it lets us lift all of the underlying monad's operations under the pipe so that the pipe can be used over another transformer, in this case WriterT.

    collect :: (Monad m) => Proxy a' a () b m r -> Proxy a' a c' c m ([b], r)
    collect subDef = do
        (r, w) <- runWriterP $
            hoist lift subDef //> \x -> lift $ tell (++[x])
        return (w [], r)
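To make the hoist idea concrete without pulling in pipes or mmorph, here is a hand-written analogue specialized to ReaderT. It is only a sketch: `hoistReader`, `safeHalf` and `halves` are hypothetical names, and the real hoist is a method of the MFunctor class rather than a standalone function.

```haskell
{-# LANGUAGE RankNTypes #-}

import Control.Monad.Trans.Reader (ReaderT (..))
import Data.Maybe (maybeToList)

-- Hand-written analogue of hoist for ReaderT only (hypothetical name):
-- apply a natural transformation to the monad underneath the reader.
hoistReader :: (forall x. m x -> n x) -> ReaderT r m a -> ReaderT r n a
hoistReader nat (ReaderT f) = ReaderT (nat . f)

-- A computation whose underlying monad is Maybe.
safeHalf :: ReaderT Int Maybe Int
safeHalf = ReaderT $ \n -> if even n then Just (n `div` 2) else Nothing

-- maybeToList is a natural transformation from Maybe to [], so the same
-- computation can be re-targeted at the list monad.
halves :: ReaderT Int [] Int
halves = hoistReader maybeToList safeHalf
```

The answer uses the same move with `hoist lift`, where the natural transformation is `lift` into the extra transformer layer.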

To run a builder, we feed it all of the input from the environment using local, provide the initial environment, collect the results, and run the whole pipeline.

    runBuildT :: (Monad m) => [i] -> BuildT [i] i o m () -> m [o]
    runBuildT e = runEffect . fmap fst . collect . runReaderP e . local

Running a monad instead of a transformer is easy

    runBuildM :: [i] -> BuildM [i] i o () -> [o]
    runBuildM e = runIdentity . runBuildT e

ListT

This section lets us use do-notation to generate all combinations of things. It is equivalent to using pipes' for in place of every >>= and yield in place of every return.

The syntax of gathering all of the results from a sub-computation is reinventing ListT. A ListT m a holds a Producer a m (), which only sends data downstream. Pipes that receive data from upstream as well as sending data downstream do not fit into a Producer b m (). It takes a little conversion.

We can convert a Proxy that has both an upstream and a downstream interface into one that has only a downstream interface, wrapped around another proxy that has only an upstream interface. To do so, we hoist the underlying monad into our new inner proxy, and then replace every request in the outer proxy with a request lifted from the inner proxy.

    floatRespond :: (Monad m) => Proxy a' a b' b m r -> Proxy c' c b' b (Proxy a' a d' d m) r
    floatRespond = (lift . request >\\) . hoist lift

The result can be converted into a ListT. We discard any returned data to get a more polymorphic type.

    gather :: (Monad m) => Proxy a' a () b m r -> P.ListT (Proxy a' a c' c m) b
    gather = P.Select . floatRespond . (>>= return . const ())

ListT is a little cumbersome to use; you need mplus between returns to get both outputs. It is often convenient to lift a proxy into the ListT so that you can lift . yield instead of returning. We are going to discard all of our ListT results, relying instead on the output coming from lift . yield. enumerate just runs a ListT wrapped around anything, discarding all of the results.

 enumerate = P.runListT 
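The remark about needing mplus between returns can be seen in the plain list monad, which is used here as a stand-in for ListT so that the sketch needs only base. `wrapBoth` is a hypothetical name.

```haskell
import Control.Monad (mplus)

-- Plain list monad standing in for ListT: producing two outputs for each
-- t1 requires joining the two returns with mplus.
wrapBoth :: [String] -> [String]
wrapBoth terms = do
  t1 <- terms
  return ("(" ++ t1 ++ ")") `mplus` return ("[" ++ t1 ++ "]")
```

Every additional output per step needs another `mplus`, which is the clutter that lifting a yield into the ListT avoids.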

Example

Now we are ready to write and run your example. I assume you mean for source to get a single value from the source and for yield to return a single value. If you do not need to get values one at a time, your question is over-specified and this answer is overkill.

    source = P.await
    yield = P.yield

In the part of the example where we use gather to build lists, we run that portion of the code with enumerate and produce the results with lift . yield.

    import Data.Char

    build_tests :: Monad m => Int -> BuildT [String] String String m ()
    build_tests depth = do
      local $ do
        v <- source
        yield $ v
        yield $ (map toLower v)
      yield "[]"
      yield "()"
      when (depth > 2) $ enumerate $ do
        t1 <- gather $ build_tests (depth-1)
        lift . yield $ "(" ++ t1 ++ ")"
        lift . yield $ "[" ++ t1 ++ "]"
        t2 <- gather $ build_tests (depth-1)
        lift . yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"

If we run this example with the input ["A", "B"], the input "B" is never used, because source is only ever used once inside each local.

    main = do
        putStrLn "Depth 2"
        print =<< runBuildT ["A", "B"] (build_tests 2)
        putStrLn "Depth 3"
        print =<< runBuildT ["A", "B"] (build_tests 3)

The output for depths less than 4 is small enough to repeat here.

    Depth 2
    ["A","a","[]","()"]
    Depth 3
    ["A","a","[]","()","(A)","[A]","(A,A)","(A,a)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,[])","(a,())","([])","[[]]","([],A)","([],a)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),[])","((),())"]

This may be overkill

I suspect you may have meant for source to get everything from the source.

    source = gather P.cat
    yield = P.yield

If we use this for the example instead of getting a single item from the source, we enumerate the first local block and produce its results from inside the ListT.

    build_tests :: Monad m => Int -> BuildT [String] String String m ()
    build_tests depth = do
      local $ enumerate $ do
        v <- source
        lift . yield $ v
        lift . yield $ (map toLower v)
      yield "[]"
      yield "()"
      when (depth > 2) $ enumerate $ do
        t1 <- gather $ build_tests (depth-1)
        lift . yield $ "(" ++ t1 ++ ")"
        lift . yield $ "[" ++ t1 ++ "]"
        t2 <- gather $ build_tests (depth-1)
        lift . yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"

This uses both source values when we run the example with two sources.

    Depth 2
    ["A","a","B","b","[]","()"]
    Depth 3
    ["A","a","B","b","[]","()","(A)","[A]","(A,A)","(A,a)","(A,B)","(A,b)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,B)","(a,b)","(a,[])","(a,())","(B)","[B]","(B,A)","(B,a)","(B,B)","(B,b)","(B,[])","(B,())","(b)","[b]","(b,A)","(b,a)","(b,B)","(b,b)","(b,[])","(b,())","([])","[[]]","([],A)","([],a)","([],B)","([],b)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),B)","((),b)","((),[])","((),())"]

If you never get a single value from the source, you can simply use ListT (ReaderT [i] m) o instead. You may still want a proxy to avoid fiddling with mplus.


If my other answer is overkill, the continuation monad transformer provides a convenient way to build up any MonadPlus value.

The continuation monad lets us easily capture the idea of doing something mplus an as-yet-unknown remainder.

    import Control.Monad
    import Control.Monad.Trans.Cont

    once :: MonadPlus m => m a -> ContT a m ()
    once m = ContT $ \k -> m `mplus` k ()

Yielding a result is just returning it once.

    yield :: MonadPlus m => a -> ContT a m ()
    yield = once . return

We can gather up all of the results by putting mzero at the end.

    gather :: MonadPlus m => ContT a m r -> m a
    gather m = runContT m (const mzero)
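A tiny self-contained check of these three definitions may help. The definitions are repeated so that the block compiles on its own with only base and transformers; `twoResults` is a hypothetical name introduced for the illustration.

```haskell
import Control.Monad (MonadPlus, mplus, mzero)
import Control.Monad.Trans.Cont (ContT (..))

once :: MonadPlus m => m a -> ContT a m ()
once m = ContT $ \k -> m `mplus` k ()

yield :: MonadPlus m => a -> ContT a m ()
yield = once . return

gather :: MonadPlus m => ContT a m r -> m a
gather m = runContT m (const mzero)

-- Unfolds to: return 'x' `mplus` (return 'y' `mplus` mzero)
twoResults :: MonadPlus m => m Char
twoResults = gather (yield 'x' >> yield 'y')
```

In the list monad `twoResults` keeps both outputs, while in Maybe the first success wins, so the same builder yields different collections depending on which MonadPlus is chosen.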

Your example is written in terms of yield , gather , once and lift .

    import Data.Char

    import Control.Monad.Trans.Class

    build_tests :: MonadPlus m => m String -> Int -> ContT String m ()
    build_tests source = go
      where
        go depth = do
          once . gather $ do
            v <- lift source
            yield v
            yield (map toLower v)
          yield "[]"
          yield "()"
          when (depth > 2) $ do
            t1 <- lift . gather $ go (depth-1)
            yield $ "(" ++ t1 ++ ")"
            yield $ "[" ++ t1 ++ "]"
            t2 <- lift . gather $ go (depth-1)
            yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"

    main = print . gather $ build_tests ["A", "B"] 3

Outputs the following:

    Depth 2
    ["A","a","B","b","[]","()"]
    Depth 3
    ["A","a","B","b","[]","()","(A)","[A]","(A,A)","(A,a)","(A,B)","(A,b)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,B)","(a,b)","(a,[])","(a,())","(B)","[B]","(B,A)","(B,a)","(B,B)","(B,b)","(B,[])","(B,())","(b)","[b]","(b,A)","(b,a)","(b,B)","(b,b)","(b,[])","(b,())","([])","[[]]","([],A)","([],a)","([],B)","([],b)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),B)","((),b)","((),[])","((),())"]

I took the liberty of dropping the requirement to read the original source from the environment, for simplicity. You can add a ReaderT to the transformer stack to get it back. I also have not chosen a list transformer for you; the example simply runs in the ordinary list monad. Since it is written in terms of MonadPlus, it will also work for any (MonadTrans t, MonadPlus (t m)) => t m.
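As a rough sketch of putting the ReaderT back, the source can be read from the environment by running the builder over ReaderT [String] []. The names `sourceFromEnv`, `leaves` and `runLeaves` are hypothetical, the definitions of once, yield and gather are repeated so the block stands alone, and only the non-recursive part of the example is shown.

```haskell
import Control.Monad (MonadPlus, mplus, mzero)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Cont (ContT (..))
import Control.Monad.Trans.Reader (ReaderT (..))
import Data.Char (toLower)

once :: MonadPlus m => m a -> ContT a m ()
once m = ContT $ \k -> m `mplus` k ()

yield :: MonadPlus m => a -> ContT a m ()
yield = once . return

gather :: MonadPlus m => ContT a m r -> m a
gather m = runContT m (const mzero)

-- Hypothetical source: every symbol in the environment, as alternatives.
sourceFromEnv :: ReaderT [i] [] i
sourceFromEnv = ReaderT id

-- The non-recursive part of build_tests, reading symbols from the reader.
leaves :: ContT String (ReaderT [String] []) ()
leaves = do
  once . gather $ do
    v <- lift sourceFromEnv
    yield v
    yield (map toLower v)
  yield "[]"
  yield "()"

-- Supply the list of symbols as the environment and collect the outputs.
runLeaves :: [String] -> [String]
runLeaves = runReaderT (gather leaves)
```

This relies on the MonadPlus instance that transformers provides for ReaderT r m whenever m is a MonadPlus, so nothing about once, yield or gather has to change.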
