What things should I take care of if I use an unboxed type (like Int #) in Haskell / GHC?

Question

What things should I take care of if I use an unboxed type (like Int #) in Haskell / GHC?

I am trying to write a small script that analyzes and executes Brainfuck code in order to understand the GHC optimization options, I am trying to optimize the code to be a little faster and understand what is happening.

Inside the parts there is an internal representation of the BF code, for this I use a special data type. Here, the source code included two functions that perform conversions:

data BFinstruction = AdjustValue Int | MovePointer Int | GetChar | PutChar | Loop BFcode deriving (Eq) type BFcode = [BFinstruction] unsafeCompileBrainfuck :: String -> BFcode unsafeCompileBrainfuck = fst . parse [] where -- arguments: input string, built code; output: output code, rest of input parse :: BFcode -> String -> (BFcode,String) parse c ('+':s) = parse (AdjustValue 1 :c) s parse c ('-':s) = parse (AdjustValue (-1):c) s parse c ('>':s) = parse (MovePointer 1 :c) s parse c ('<':s) = parse (MovePointer (-1):c) s parse c ('.':s) = parse (PutChar :c) s parse c (',':s) = parse (GetChar :c) s parse c (']':s) = (reverse c, s) parse c ('[':s) = parse (Loop l :c) s' where (l,s') = parse [] s parse c [] = (reverse c ,"") parse c ( _ :s) = parse cs simplifyBrainfuck :: BFcode -> BFcode simplifyBrainfuck ((AdjustValue x):(AdjustValue y):zs) = if x + y /= 0 then simplifyBrainfuck (AdjustValue (x + y):zs) else simplifyBrainfuck zs simplifyBrainfuck ((MovePointer x):(MovePointer y):zs) = if x + y /= 0 then simplifyBrainfuck (MovePointer (x + y):zs) else simplifyBrainfuck zs simplifyBrainfuck (x :zs) = x: simplifyBrainfuck zs simplifyBrainfuck [] = []

The idea is that the code will be read from some input (line), prepared and simplified by the above code, and then executed by some other functions. (It is assumed that the entry is valid).

To optimize this example, I tried to unpack the Int parameters of the MovePointer and AdjustValue by doing the following:

 data BFinstruction -- BangPatterns = AdjustValue {-# UNPACK #-} !Int | MovePointer {-# UNPACK #-} !Int | GetChar | PutChar | Loop BFcode deriving (Eq)

This will turn the box type Int into unboxed, raw Int# , which is a detail of the GHc implementation. As I read, this option is only good in a few cases, so I want to ask what things I should pay attention to if I want to perform such an optimization. My goal is to allow the execution of BF code, taking advantage of Haskell-laziness (I want to archive that code can only be stored as needed in memory) and simplicity.

+6

optimization haskell ghc brainfuck

fuz Sep 04 '10 at 9:05

source share

2 answers

Is it really necessary? Do you encounter performance issues with this code, which in your opinion is the result of boxed values? If not, do not worry.

If you think this is the case, then this page in the GHC manual seems to provide the necessary restrictions in a convenient list format.

The main points, apparently, are that any interaction between polymorphic functions or names and unoccupied types that are not rejected by the compiler can still lead to unpleasant space leaks. Also, without trying, I suspect that you will not get exceptions thrown in the event of an overflow, for example, so you probably should find it yourself. A simple test can check if this is true or not.

+3

Gian Sep 04 '10 at 15:34

source share

John l · Accepted Answer · 2010-09-04T16:44:43+0000

It really looks like premature optimization for me. UNPACK is mostly useful when you have a very large number of BFInstruction sitting around. I doubt you will have enough brainf ** k code to make it useful. I agree with Gene, it should be simple enough to check, so do it first.

In any case, what you need to know with UNPACK values is that you do not want the compiler to reinstall them. You should be fine with numerical operations, but other than that you will need to carefully examine the functions that you use to see if you are ever using unused values as a lax argument. The only way to make sure is to look at the kernel or interface files to see which parameters are strict and which are not. As always, be sure to compile at least "-O".

What things should I take care of if I use an unboxed type (like Int #) in Haskell / GHC?

More articles: