Comparing Haskell and C speed to calculate prime numbers

I originally wrote this method (brute force and inefficiency) to calculate primes in order to make sure that there is no difference in speed between using "if-then-else" and protection in Haskell (and there is no difference!). But then I decided to write a C program for comparison, and I got the following (Haskell is slower just over 25%):

(Note: I got the idea of ​​using rem instead of mod, as well as the O3 option in calling the compiler from the following message: On improving Haskell performance compared to C in the Fibonacci micro test )

Haskell: Forum.hs

divisibleRec :: Int -> Int -> Bool divisibleRec ij | j == 1 = False | i `rem` j == 0 = True | otherwise = divisibleRec i (j-1) divisible::Int -> Bool divisible i = divisibleRec i (i-1) r = [ x | x <- [2..200000], divisible x == False] main :: IO() main = print(length(r)) 

C: main.cpp

 #include <stdio.h> bool divisibleRec(int i, int j){ if(j==1){ return false; } else if(i%j==0){ return true; } else{ return divisibleRec(i,j-1); } } bool divisible(int i){ return divisibleRec(i, i-1); } int main(void){ int i, count =0; for(i=2; i<200000; ++i){ if(divisible(i)==false){ count = count+1; } } printf("number of primes = %d\n",count); return 0; } 

The results that I received were as follows:

Compilation time

 time (ghc -O3 -o runProg Forum.hs) real 0m0.355s user 0m0.252s sys 0m0.040s time (gcc -O3 -o runProg main.cpp) real 0m0.070s user 0m0.036s sys 0m0.008s 

and the following runtime:

Ubuntu 32-Bit Runtime

 Haskell 17984 real 0m54.498s user 0m51.363s sys 0m0.140s C++ number of primes = 17984 real 0m41.739s user 0m39.642s sys 0m0.080s 

I was impressed with the running time of Haskell. However, my question is this: can I do something to speed up the haskell program without:

  • Changing the basic algorithm (it is clear that mass accelerations can be achieved by changing the algorithm, but I just want to understand what I can do on the language / compiler side to improve performance).
  • Invoking the llvm compiler (because I don't have one)

[EDIT: memory usage]

After Alan's comment, I noticed that C uses a constant amount of memory, where as Haskell slowly grows in memory size. At first I thought this was related to recursion, but gspr explains below why this is happening and provides a solution. Will Ness provides an alternative solution that (e.g. gspr solution) also ensures that memory remains static.

[EDIT: Short Circulation Summary]

Maximum number tested: 200,000:

(54.498s / 41.739s) = Haskell 30.5% slower

Maximum number tested: 400,000:

3m31.372s / 2m45.076s = 211.37s / 165s = Haskell 28.1% slower

Maximum number of tests: 800,000:

14m3.266s / 11m6.024s = 843.27s / 666.02s = Haskell 26.6% slower

[EDIT: Code for Alan]

This was the code I wrote earlier that has no recursion and which I tested for 200,000:

 #include <stdio.h> bool divisibleRec(int i, int j){ while(j>0){ if(j==1){ return false; } else if(i%j==0){ return true; } else{ j -= 1;} } } bool divisible(int i){ return divisibleRec(i, i-1); } int main(void){ int i, count =0; for(i=2; i<8000000; ++i){ if(divisible(i)==false){ count = count+1; } } printf("number of primes = %d\n",count); return 0; } 

The results for C code with and without recursion are as follows (for 800,000):

With recursion: 11m6.024s

No recursion: 11m5.328s

Note that the executable seems to occupy 60kb (as seen on the system monitor), regardless of the maximum number, and therefore I suspect the compiler is detecting this recursion.

+7
source share
4 answers

This does not answer your question, but rather what you asked in the comment regarding the increase in memory usage when the number 200000 grows.

When this number grows, a list r also appears. To calculate its length requires only r at the very end. C code, on the other hand, simply increments the counter. You will also have to do something similar in Haskell if you want to use read-only memory. The code will still be very Haskelly, and overall this is a reasonable suggestion: you really don't need a list of numbers for which divisible is False , you just need to know how many there are.

You can try with

 main :: IO () main = print $ foldl' (\sx -> if divisible x then s else s+1) 0 [2..200000] 

( foldl' is a more stringent foldl from Data.List , which avoids the creation of tunnels).

+5
source

Well patches give you a small victory (like llvm, but you seemed to expect this):

 {-# LANUGAGE BangPatterns #-} divisibleRec !i !j | j == 1 = False 

And on my x86-64, I get a very big gain by switching to smaller representations like Word32:

 divisibleRec :: Word32 -> Word32 -> Bool ... divisible :: Word32 -> Bool 

My timings:

 $ time ./so -- Int 2262 real 0m2.332s $ time ./so -- Word32 2262 real 0m1.424s 

This is closer to your C program that uses only int . It still does not match the performance, I suspect that we will have to look at the kernel to understand why.

EDIT: and memory usage, as already noted, I see refers to the named list r . I just entered r , made a conclusion 1 for each value without division and took the sum:

 main = print $ sum $ [ 1 | x <- [2..800000], not (divisible x) ] 
+2
source

Another way to write your algorithm:

 main = print $ length [()|x<-[2..200000], and [rem x d>0|d<-[x-1,x-2..2]]] 

Unfortunately, it is slower. Using all ((>0).rem x) [x-1,x-2..2] as a test, it runs slower. But maybe you would test this on your installation.

Replacing the code with an explicit loop using beat patterns does not make any difference:

 {-# OPTIONS_GHC -XBangPatterns #-} r4::Int->Int r4 n = go 0 2 where go !ci | i>n = c | True = go (if not(divisible i) then (c+1) else c) (i+1) divisibleRec::Int->Int->Bool divisibleRec i !j | j == 1 = False | i `rem` j == 0 = True | otherwise = divisibleRec i (j-1) 
+1
source

When I started programming in Haskell, I was also impressed with its speed. You may be interested in reading paragraph 5 "Haskell Speed" in this article.

0
source

All Articles