Why does attoparsec use 100 times more memory than my input file?


I have a 2.5 MB file full of floats separated by spaces (the code can generate it for you) and want to parse it into an array using attoparsec.

It is surprisingly slow: it takes almost a second and allocates a lot of memory:

    time ./Attoparsec-problem +RTS -sstderr < floats.txt
    299999.0
         956,647,344 bytes allocated in the heap
         752,875,520 bytes copied during GC
         166,485,416 bytes maximum residency (7 sample(s))
           8,874,168 bytes maximum slop
                 337 MB total memory in use (0 MB lost due to fragmentation)

                                        Tot time (elapsed)  Avg pause  Max pause
      Gen  0      1604 colls,     0 par    0.21s    0.27s     0.0002s    0.0092s
      Gen  1         7 colls,     0 par    0.24s    0.36s     0.0520s    0.1783s
    ...
      %GC     time      65.5%  (75.1% elapsed)
      Alloc rate    3,985,781,488 bytes per MUT second
      Productivity  34.5% of total user, 28.6% of total elapsed

My parser is incredibly simple: it is essentially double <* skipSpace.
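To show that parser in isolation, here is a minimal sketch that runs it on a three-number input instead of the full file. The helper name floatsN and the sample input are made up for illustration and are not part of the actual program; it assumes the attoparsec, bytestring and vector packages are available.

```haskell
import Data.Attoparsec.ByteString.Char8 (Parser, double, parseOnly, skipSpace)
import qualified Data.ByteString.Char8 as BC
import qualified Data.Vector.Unboxed as U

-- Hypothetical helper: parse exactly n doubles, each followed by
-- optional whitespace, into an unboxed vector.
floatsN :: Int -> Parser (U.Vector Double)
floatsN n = U.replicateM n (double <* skipSpace)

main :: IO ()
main =
  -- parseOnly returns Either String (U.Vector Double)
  print (parseOnly (floatsN 3) (BC.pack "1.0 2.0 3.0"))
```

On this tiny input the result should be a Right value holding the three numbers; the memory behaviour only becomes visible on the large file.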

This is the full program:

    import Control.Applicative
    import Data.Attoparsec.ByteString.Char8 as A
    import qualified Data.ByteString as BS
    import qualified Data.Vector.Unboxed as U

    -- Compile with:
    --   ghc --make -O2 -prof -auto-all -caf-all -rtsopts -fforce-recomp Attoparsec-problem.hs
    -- Run:
    --   time ./Attoparsec-problem +RTS -sstderr < floats.txt

    main :: IO ()
    main = do
      -- For creating the test file (2.5 MB)
      -- writeFile "floats.txt" (Prelude.unwords $ Prelude.map show [1.0,2.0..300000.0])
      r <- parse parser <$> BS.getContents
      case r of
        Done _ arr -> print $ U.last arr
        x -> print x
      where
        parser = do
          U.replicateM (300000-1) (double <* skipSpace)
          -- This gives surprisingly bad productivity (70% GC time) and 180 MB max residency
          -- for a 2.5 MB file!

Can you explain to me what is going on?


haskell attoparsec

nh2 Feb 17 '13 at 22:50


No one has answered this question yet.
