Runhaskell Performance Anomaly

Question


I am trying to understand a performance anomaly that I observe when running a program under runhaskell.

This program:

 isFactor n = (0 ==) . (mod n)
 factors x = filter (isFactor x) [2..x]
 main = putStrLn $ show $ sum $ factors 10000000

When I run this, it takes 1.18 seconds.

However, if I redefine isFactor as:

 isFactor n f = (0 ==) (mod n f)

then the program takes 17.7 seconds.

This is a huge performance difference, and I would expect the programs to be equivalent. Does anyone know what I'm missing here?

Note: this does not happen when the program is compiled with GHC.

+8

haskell runhaskell

Steve Feb 17 '12 at 8:40


2 answers

As I understand it, runhaskell does practically no optimization. It is designed to load and run code quickly. If it performed more optimizations, it would take longer to start running your code. Of course, in this case the code itself would run faster with optimization.

As I understand it, if a compiled version of the code exists, runhaskell will use it. So if performance matters to you, just make sure you compile with optimizations enabled first. (I think you can even pass switches to runhaskell to turn on optimization; you will need to check the documentation.)

+5

MathematicalOrchid Feb 17 '12 at 9:23


John l · Accepted Answer · 2012-02-17T12:48:53+0000

Although the two functions should be equivalent, there is a difference in how they are applied. With the first definition, isFactor is fully applied at the call site isFactor x. With the second definition it is not, because isFactor now explicitly takes two arguments.

For GHC, even minimal optimization is enough to see this and generate identical code for both. However, if you compile with -O0 -ddump-simpl, you can see that without optimization this makes a difference (at least with ghc-7.2.1; YMMV with other versions).

With the first isFactor, GHC creates a single function that is passed as the predicate to GHC.List.filter, with the calls to mod 10000000 and (==) inlined. With the second definition, most of the calls inside isFactor are instead references to class methods, and these are not shared between multiple invocations of isFactor. So there is a lot of dictionary overhead that is completely unnecessary.
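To make the arity difference concrete, here is a minimal sketch that puts the two definitions side by side. The names isFactorA, isFactorB, factorsA and factorsB are made up for illustration, and the explicit signatures are the ones GHC would infer for the unannotated originals. A smaller upper bound is used so that it finishes quickly in the interpreter. It only checks that both versions compute the same result; it does not measure anything.

```haskell
-- Arity 1: takes n and returns a function built by composition.
-- The call `isFactorA x` is therefore a full application.
isFactorA :: Integral a => a -> a -> Bool
isFactorA n = (0 ==) . mod n

-- Arity 2: both arguments are bound explicitly.
-- The call `isFactorB x` is now only a partial application.
isFactorB :: Integral a => a -> a -> Bool
isFactorB n f = (0 ==) (mod n f)

factorsA, factorsB :: Integral a => a -> [a]
factorsA x = filter (isFactorA x) [2 .. x]
factorsB x = filter (isFactorB x) [2 .. x]

main :: IO ()
main = do
  let a = sum (factorsA 10000) :: Integer
      b = sum (factorsB 10000) :: Integer
  print a
  print b
  print (a == b)
```

To see the difference described above, one could run this with runhaskell and time each half separately with a larger bound, or compile it with -O0 -ddump-simpl and compare the Core for the two definitions.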

This is almost never a problem, because even the default compiler settings optimize it away; runhaskell, however, apparently does not do even that much. That said, I have occasionally structured code as someFun x y = \z -> because I knew that someFun would be partially applied, and that was the only way to preserve sharing between calls (i.e. GHC's optimizer was not smart enough).
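The someFun pattern can be sketched as follows. This is only an illustration under stated assumptions: expensive, someFunFlat and someFunShared are hypothetical names, and Debug.Trace is used purely to make it visible when the costly part is evaluated.

```haskell
import Debug.Trace (trace)

-- Hypothetical stand-in for a costly computation that depends
-- only on the first two arguments.
expensive :: Int -> Int -> Int
expensive x y = trace "evaluating expensive" (sum [x .. y])

-- Arity 3: t lives inside the three-argument function, so a
-- partial application `someFunFlat x y` cannot cache it.
someFunFlat :: Int -> Int -> Int -> Int
someFunFlat x y z = t + z
  where
    t = expensive x y

-- Arity 2 returning a lambda: t is bound outside the lambda, so
-- every call through one partial application shares the same t.
someFunShared :: Int -> Int -> Int -> Int
someFunShared x y = \z -> t + z
  where
    t = expensive x y

main :: IO ()
main = do
  let f = someFunShared 1 1000
      g = someFunFlat 1 1000
  print (map f [1, 2, 3])
  print (map g [1, 2, 3])
```

Without optimization, one would expect the trace message to appear once for f and once per element for g, although the exact behaviour depends on the GHC version and flags. Both functions return the same values either way.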
