Haskell Writing myLength

I worked on this page

http://www.haskell.org/haskellwiki/99_questions/Solutions/4

I understand what each function does, and it is interesting to see that a function can be defined in so many different ways. However, I started to wonder which one is the fastest. I thought it would be the one the page describes as length in the Prelude:

    length [] = 0
    length (x:xs) = 1 + length xs

However, this is much slower than the actual length in the Prelude.

On my computer, the Prelude's length returns the length of [1..10^7] in 0.37 seconds. The function defined above took 15.26 seconds.

I also defined my own length function that uses an accumulator. It took only 8.99 seconds.

Why are there such big differences?

+4
3 answers

When you say "length in Prelude returns ... in 0.37 seconds", which implementation are you referring to? If you use GHC, you can see in its source that the actual implementation differs from the simple

    length [] = 0
    length (x:xs) = 1 + length xs

Namely, it is:

    length l = len l 0#
      where
        len :: [a] -> Int# -> Int
        len [] a# = I# a#
        len (_:xs) a# = len xs (a# +# 1#)

This code uses an accumulator and avoids the problem of huge unevaluated thunks by using unboxed integers; that is, this version is highly optimized.

To illustrate the problem with the “simple” version, consider how length [1, 2, 3] is evaluated:

    length [1, 2, 3]
      => 1 + length [2, 3]
      => 1 + (1 + length [3])
      => 1 + (1 + (1 + length []))
      => 1 + (1 + (1 + 0))

The sum is not evaluated until its result is needed, so when the input is a huge list, you first build a huge sum expression in memory and then evaluate it only when its result is really needed.

In contrast, the optimized version is evaluated as follows:

    length [1, 2, 3]
      => len [1, 2, 3] 0#
      => len [2, 3] (1#)
      => len [3] (2#)
      => len [] (3#)
      => 3

That is, the "+1" is performed immediately at each step.
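The same "add immediately" idea can be sketched without unboxed primitives. The following is only an illustration, not GHC's implementation: the name myLengthStrict is hypothetical, and it relies on the BangPatterns extension to force the accumulator at every step so that no chain of pending additions builds up.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical name; an illustrative sketch, not the Prelude's length.
myLengthStrict :: [a] -> Int
myLengthStrict = go 0
  where
    -- The bang on acc forces the running count before each recursive
    -- call, so the additions are performed as the list is traversed.
    go :: Int -> [b] -> Int
    go !acc []     = acc
    go !acc (_:xs) = go (acc + 1) xs
```

Loaded in ghci, myLengthStrict [1 .. 10 ^ 7] should run in constant stack space, because the count is an evaluated Int at each step rather than a nested sum.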

+7

There are two things you should check:

  • Is the code compiled and then executed, rather than run in ghci?
  • Are you using the -O2 flag?

The following benchmarks were run with criterion, using the functions below alongside the Prelude's length. myLength3 requires the MagicHash pragma and importing GHC.Base.

    myLength1 :: [a] -> Int
    myLength1 [] = 0
    myLength1 (x:xs) = 1 + myLength1 xs

    myLength2 :: [a] -> Int
    myLength2 lst = len lst 0
      where
        len :: [a] -> Int -> Int
        len [] n = n
        len (_:xs) n = len xs (n+1)

    myLength3 :: [a] -> Int
    myLength3 l = len l 0#
      where
        len :: [a] -> Int# -> Int
        len [] a# = I# a#
        len (_:xs) a# = len xs (a# +# 1#)

Results of the test (full code linked at the end) without using the -O2 flag:

                  mean
    length    :   5.4818 ms
    myLength1 : 202.1552 ms
    myLength2 : 236.3042 ms
    myLength3 :   5.3630 ms

Now let's use the -O2 flag when compiling:

                  mean
    length    :   5.2597 ms
    myLength1 :  12.882 ms
    myLength2 :   5.2026 ms
    myLength3 :   5.6393 ms

Note that the timing of length does not change, but the other two change considerably. The naive approach is now within a factor of 3 of myLength2, and myLength2 now matches the built-in length by emulating the Prelude's length in everything but the use of unboxing.

Also note that myLength3, which uses an unboxed Int, does not change much, and it will probably perform much better than myLength1 or myLength2 in ghci.

Full code: https://gist.github.com/Davorak/5457105
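For a quick check without criterion, a much rougher timing helper can be sketched with base only. This is not the benchmark code from the gist: timeMs is a hypothetical helper that measures a single run in CPU time, so its numbers are far noisier than criterion's means and are only useful for spotting order-of-magnitude differences.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Hypothetical helper: force a value to weak head normal form and
-- report how much CPU time that took, in milliseconds.
timeMs :: a -> IO (a, Double)
timeMs x = do
  start <- getCPUTime            -- CPU time in picoseconds
  r <- evaluate x                -- forces x to weak head normal form
  end <- getCPUTime
  pure (r, fromIntegral (end - start) / 1e9)
```

For example, timeMs (length [1 .. 10 ^ 7 :: Int]) returns the length together with the elapsed CPU time; for an Int result, weak head normal form is full evaluation. As with the criterion runs, the numbers are only meaningful when the program is compiled (ideally with -O2) rather than interpreted in ghci.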

Edit: some additional information that does not fit into the comment:

The GHC -O2 flag means "Apply every non-dangerous optimisation, even if it means significantly longer compile times." I would not be surprised if this includes unboxing data types. Additional explanation of the various flags can be found in the GHC user's guide, which also has a large list of flags for GHC 7.6.2, although the explanations can be short and cryptic.

I am not very familiar with unboxing and primitive operations and their implications; the GHC manual has a section that covers unboxed types. You will sometimes see them mentioned in optimization tutorials. In most cases you should not worry about them unless you really need every ounce of performance because, as we saw above, they often make only a constant-factor difference once you use the other optimization flags.
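To make the boxed versus unboxed distinction concrete, here is a minimal sketch assuming GHC with the MagicHash extension. The helper name plusBoxed is hypothetical; it only shows that an ordinary Int is a box (the I# constructor) around a raw machine integer of type Int#, and that +# adds the raw values directly.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), (+#))

-- Hypothetical helper: unwrap two boxed Ints, add the raw Int# values
-- with the primitive +#, and box the result again with I#.
plusBoxed :: Int -> Int -> Int
plusBoxed (I# x#) (I# y#) = I# (x# +# y#)
```

Because Int# values are never thunks, code that carries an Int# accumulator cannot build up a chain of unevaluated additions, which is the property the optimized length relies on.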

+4

The Prelude in the Haskell report is a specification of semantics only; it does not constrain the implementation. From http://www.haskell.org/onlinereport/haskell2010/haskellch9.html#x16-1710009 :

It constitutes a specification for the Prelude. Many of the definitions are written with clarity rather than efficiency in mind, and it is not required that the specification be implemented as shown here.

In the case of GHC, the actual length function is highly optimized.

0