For processing chunked input, I would use the enumerator package.
    import Data.Enumerator
    import Data.Enumerator.Binary (enumFile)
We use bytestrings
    import qualified Data.ByteString as BS
and IO
    import Control.Monad.Trans (liftIO)
    import Control.Monad (mapM_)
    import System.Environment (getArgs)
Your main function might look like this:
    main = do
      (filepath:_) <- getArgs
      let destination = filepath ++ ".cpy"
      run_ $ enumFile filepath $$ enumWrite destination
enumFile reads the file in chunks of 4096 bytes and passes each chunk to enumWrite, which writes it out.
enumWrite is defined as:
    enumWrite :: FilePath -> Iteratee BS.ByteString IO ()
    enumWrite filepath =
      do liftIO (BS.writeFile filepath BS.empty)   -- ensure the destination is empty
         continue step
      where
      step (Chunks xs) =
        do liftIO (mapM_ (BS.appendFile filepath) xs)
           continue step
      step EOF         = yield () EOF
As you can see, the step function takes chunks of bytestrings and appends them to the destination file. These chunks arrive as a value of type Stream BS.ByteString, where Stream is defined as:
data Stream a = Chunks [a] | EOF
On EOF, step finishes by yielding ().
For more detailed reading, I personally recommend Michael Snoyman's tutorial.
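To make the control flow of step easier to see, here is a minimal pure model of it. It is only a sketch and not the enumerator API: Stream is redefined locally, and Step, countLen and feed are simplified, hypothetical stand-ins that leave out the monad and the leftover input the real library tracks.

```haskell
-- Local stand-in for the library's Stream type, so the sketch is self-contained.
data Stream a = Chunks [a] | EOF

-- A consumer is either waiting for more input or finished with a result.
-- (Simplified: the real library's step type also handles errors and leftovers.)
data Step a b = Continue (Stream a -> Step a b) | Yield b

-- Add up the lengths of all chunks seen so far and finish on EOF.
countLen :: Int -> Step String Int
countLen n = Continue step
  where
    step (Chunks xs) = countLen (n + sum (map length xs))
    step EOF         = Yield n

-- Feed stream values to a consumer one at a time.
-- Returns Nothing if the input runs out before the consumer has yielded.
feed :: Step a b -> [Stream a] -> Maybe b
feed (Yield b)    _        = Just b
feed (Continue _) []       = Nothing
feed (Continue k) (s : ss) = feed (k s) ss
```

The shape is the same as in enumWrite: the Chunks case does some work and asks for more input, and the EOF case produces the final result.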
Figures
    $ time ./TestCopy 5MB
    ./TestCopy 5MB  2,91s user 0,32s system 96% cpu 3,356 total
    $ time ./TestCopy2 5MB
    ./TestCopy2 5MB  0,04s user 0,03s system 93% cpu 0,075 total
That is quite an improvement. Now, to implement your fold, you probably want to write an Enumeratee, which is used to transform an input stream. Fortunately, there is already a map function defined in the enumerator package that can be adapted to your needs, i.e. it can be modified to carry state along.
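The idea of a transformation that carries state from one chunk to the next can be sketched without the enumerator package at all. The following uses only mapAccumL from base; stepChunk and wordsFromChunks are hypothetical names, and splitting on single spaces is an assumption made purely for illustration. The state is the unfinished word left over at the end of a chunk.

```haskell
import Data.List (mapAccumL)

-- Process one chunk, given the unfinished word left over from the previous
-- chunk. Returns the new leftover and the words completed in this chunk.
stepChunk :: String -> String -> (String, [String])
stepChunk leftover chunk = go (leftover ++ chunk)
  where
    go s = case break (== ' ') s of
      (w, [])       -> (w, [])   -- no space found: w may continue in the next chunk
      (w, _ : rest) ->
        let (l, ws) = go rest
        in  (l, [w | not (null w)] ++ ws)   -- skip empty words from repeated spaces

-- Thread the leftover through all chunks, then flush it at the end.
wordsFromChunks :: [String] -> [String]
wordsFromChunks chunks =
  let (leftover, perChunk) = mapAccumL stepChunk "" chunks
  in  concat perChunk ++ [leftover | not (null leftover)]
```

For example, wordsFromChunks ["hel", "lo wor", "ld foo"] evaluates to ["hello", "world", "foo"], even though two of the words are split across chunk boundaries. A state-carrying Enumeratee would do the same job incrementally instead of over a list of chunks.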
On the construction of an intermediate result
You build the wordList in reverse order and then reverse it. I think difference lists work better here, because appends take only O(1) time, since appending is just function composition. I'm not sure whether they take up more space, though. Here is a rough sketch of difference lists:
    type DList a = [a] -> [a]

    emptyList :: DList a
    emptyList = id

    snoc :: DList a -> a -> DList a
    snoc dlist a = dlist . (a:)

    toList :: DList a -> [a]
    toList dlist = dlist []
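As a usage sketch, here is how the difference list above could collect items in their original order without a final reverse. The definitions are repeated so the block stands on its own; collect is a hypothetical helper name, not something from the question.

```haskell
type DList a = [a] -> [a]

emptyList :: DList a
emptyList = id

-- Append one element at the end: just composes one more function.
snoc :: DList a -> a -> DList a
snoc dlist a = dlist . (a:)

toList :: DList a -> [a]
toList dlist = dlist []

-- Accumulate elements left to right with snoc, then materialise the list once.
collect :: [a] -> [a]
collect = toList . foldl snoc emptyList
```

Here collect ["a", "b", "c"] gives back ["a", "b", "c"]: the elements come out in insertion order, and the actual list is only built when toList applies the composed function to [].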
This answer is probably no longer needed, but I added it for completeness.