Summing from 1 to 1,000,000 in Haskell gives a stack overflow. What happens under the hood?

I have the following code.

main = print $ sum [1..1000000] 

When I start, I get a stack overflow:

 Stack space overflow: current size 8388608 bytes. Use `+RTS -Ksize -RTS' to increase it. 

I'm used to imperative languages ​​like Python, which don't seem to have problems with this calculation:

 sum(range(100000000)) # I'm not even using a generator. 4999999950000000 

Haskell is clearly different, but I don’t quite understand what is going on to cause a stack overflow? What happens under the hood to cause a stack overflow in Haskell?

+6
source share
2 answers

This whole question is relevant only for GHC <7.10. In recent versions, sum [1..1000000] works fine in constant space, at least on built-in number types.


sum implemented with evil foldl 1 , which is not as strict as it should be. So what you get from sum is essentially a bunch of tricks the size of your input. I think there was a discussion about why this is done here at some point ... IMO this is basically just stupid, since the amounts usually cannot be consumed lazily, anyway, it is simply obvious that using strict help .

Prelude>: m + Data.List
Prelude Data.List> foldl '(+) 0 [1..1000000]
500000500000


1 Actually, foldl used only in the report version ... but the explicit recursive version with battery, of course, is not better.

+10
source

sum defined in terms of foldl , which is lazy in the left associative form, so that it must generate thunks for the entire list before evaluating one (in this case, complement) expression.

You can also define sum in terms of foldl more stringent analog to foldl' as follows:

 sum' = foldl' (+) 0 

See Foldr. Foldl. Foldl. from the Haskell Wiki for a good explanation of how foldl should generate thunks for each calculation, not being able to evaluate anything that would cause a stack overflow.

+3
source

All Articles