I figured out how to fix the type error. The key is to understand the relationship between the input to Data.ListLike.filter and the ByteString that is passed to that filter. Here is the type of Data.ListLike.filter:
Data.ListLike.filter :: Data.ListLike.Base.ListLike full item => (item -> Bool) -> full -> full
If I understand it correctly, full refers to the stream type in the iteratee/enumerator context, and item refers to the type of an element of that stream.
Now, if we want to filter for newlines in the input file, we need to know the stream type of the input file and the type of the elements in that stream. In this case, the input file is read as a ByteString stream. ByteString is documented as a space-efficient representation of a Word8 vector, so the item type is Word8.
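To make the element type concrete, here is a minimal sketch that uses only the bytestring package (not the iteratee code itself). The names sample and sampleBytes are made up for illustration. Unpacking a ByteString yields a list of Word8, and the newline shows up as the byte 10:

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.Word (Word8)

-- Hypothetical sample input standing in for one chunk of a file.
sample :: B.ByteString
sample = BC.pack "ab\ncd\n"

-- B.unpack exposes the elements of a ByteString as Word8 values,
-- which is why the predicate given to a filter must take a Word8.
sampleBytes :: [Word8]
sampleBytes = B.unpack sample
```

Loading this in GHCi and evaluating sampleBytes shows the raw bytes, including 10 for each '\n'.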
So, when we write the filter in the step function, we need to make sure that the predicate (the item -> Bool function) is defined for Word8, since that is the element type passed to the filter (as explained above). We are filtering for newlines. Thus, a predicate like the one below, which builds the Word8 representation of a newline and compares it for equality with x of type Word8, should work:
\x -> x == Data.ByteString.Internal.c2w '\n'
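As a quick sanity check of that predicate outside the iteratee setting, here is a small sketch that applies it with Data.ByteString.filter on a strict ByteString. The names isNewline and countNewlines are made up for this example, and it uses the ByteString-specific filter rather than the ListLike one:

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.ByteString.Internal (c2w)
import Data.Word (Word8)

-- The same predicate as above, given a name and an explicit type.
isNewline :: Word8 -> Bool
isNewline x = x == c2w '\n'

-- Keep only the newline bytes, then count how many remain.
countNewlines :: B.ByteString -> Int
countNewlines = B.length . B.filter isNewline
```

For example, countNewlines (BC.pack "one\ntwo\nthree\n") evaluates to 3.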
There is one more missing piece: for some reason, the compiler (GHC 7.0.3 on Mac) cannot infer the element type el in the type signature of numlines (if anyone has ideas on why that is, please discuss). Explicitly stating that it is Word8 solves the compilation problem:
numlines :: (Monad m, Num a, LL.ListLike s Word8) => Iteratee s m a
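One possible explanation, offered as a guess rather than a confirmed diagnosis: since the predicate is built with c2w, the body only works when the element type is Word8, so a signature that leaves the element type as a free variable promises more than the body can deliver. The sketch below illustrates this with a toy class, Container, which is a hypothetical stand-in and not the real ListLike class:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleContexts #-}
import qualified Data.ByteString as B
import Data.ByteString.Internal (c2w)
import Data.Word (Word8)

-- Hypothetical stand-in for ListLike: the container type determines
-- the element type through the functional dependency.
class Container full item | full -> item where
  cfilter :: (item -> Bool) -> full -> full
  clength :: full -> Int

instance Container B.ByteString Word8 where
  cfilter = B.filter
  clength = B.length

-- Accepted: the constraint pins the element type to Word8, which is
-- what the c2w-based predicate needs.
countNewlines :: Container s Word8 => s -> Int
countNewlines = clength . cfilter (== c2w '\n')

-- A fully polymorphic signature such as
--   countNewlines :: Container s el => s -> Int
-- should be rejected, because el is claimed to be arbitrary while the
-- predicate only works on Word8.
```

In this toy setting, countNewlines (B.pack [10, 97, 10]) evaluates to 2.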
The full code is below. It compiles and runs quite fast.
    {-# LANGUAGE BangPatterns, FlexibleContexts #-}

    import Data.Iteratee as I
    import Data.ListLike as LL
    import Data.Iteratee.IO
    import Data.ByteString
    import GHC.Word (Word8)
    import Data.ByteString.Internal (c2w)

    -- Count the newline bytes in a stream whose elements are Word8.
    numlines :: (Monad m, Num a, LL.ListLike s Word8) => Iteratee s m a
    numlines = liftI $ step 0
      where
        -- For each chunk, add its number of newline bytes to the running total.
        step !i (Chunk xs) =
          let newline = c2w '\n'
          in liftI (step $ i + fromIntegral (LL.length $ LL.filter (\x -> x == newline) xs))
        -- Any other stream state (e.g. end of input): return the total.
        step !i stream = idone i stream
    {-# INLINE numlines #-}

    main = do
      i' <- enumFile 1024 "/usr/share/dict/words" (numlines :: (Monad m) => Iteratee ByteString m Int)
      result <- run i'
      print result

    {- Time to run on mac OSX:

    $ time ./test   ## above compiled program: ghc --make -O2 test.hs
    235886

    real 0m0.011s
    user 0m0.007s
    sys  0m0.004s

    $ time wc -l /usr/share/dict/words
    235886 /usr/share/dict/words

    real 0m0.005s
    user 0m0.002s
    sys  0m0.002s
    -}