I think the time has come to answer this question.
What happened to your code with -O
Let me enlarge your main function and rewrite it a bit:
main :: IO () main = do [n, m] <- fmap (map read . words) getLine line <- getLine let nodes = listArray (0, n) . tonodes n . map (subtract 1) . map read . words $ line replicateM_ m $ query n nodes
Obviously, the intention here is that a NodeArray is created once and then used in each of the m invocations query .
Unfortunately, GHC efficiently converts this code,
main = do [n, m] <- fmap (map read . words) getLine line <- getLine replicateM_ m $ do let nodes = listArray (0, n) . tonodes n . map (subtract 1) . map read . words $ line query n nodes
and you can immediately see the problem here.
What is state hacking and why does it destroy the performance of my programs
The reason is a hacking state that says (approximately): "When something is of type IO a , suppose it is called only once." The official documentation is not much more complicated:
-fno-state-hack
Disable "state hacking", in which any lambda with the state marker # as an argument is considered single, therefore it is believed that there are built-in elements inside it. This can improve the performance of I / O and ST codes, but it reduces the risk of sharing.
Roughly speaking, the idea is this: if you define a function with type IO and a where clause, for example
foo x = do putStrLn y putStrLn y where y = ...x...
Something like IO a can be thought of as something like RealWord -> (a, RealWorld) . In this regard, the above becomes (approximately)
foo x = let y = ...x... in \world1 -> let (world2, ()) = putStrLn y world1 let (world3, ()) = putStrLn y world2 in (world3, ())
A call to foo will (usually) look like this: foo argument world . But the definition of foo takes only one argument, and the second only a local lambda expression! This will be a very slow foo call. It would be much faster if the code looked like this:
foo x world1 = let y = ...x... in let (world2, ()) = putStrLn y world1 let (world3, ()) = putStrLn y world2 in (world3, ())
This is called an eta extension and is performed for various reasons (for example, analyzing the definition of functions , checking what it is called , and - in this case, the type of directional heuristic).
Unfortunately, this is unreasonable if the call to foo actually has the form let fooArgument = foo argument , that is, with the argument, but no world has passed (for now). In the source code, if fooArgument used multiple times, y will be evaluated only once and shared. In the modified code, y will be recalculated every time - exactly what happened to your nodes .
Can things be fixed?
Maybe. See # 9388 for trying to do this. The problem with the fix is ββthat in many cases it will cost performance when the conversion happens fine, although the compiler cannot know for sure. And, probably, there are cases when technically this is not normal, i.e. Separation is lost, but it is still useful because the acceleration from a faster call outweighs the additional cost of recounting. Therefore, it is not clear where to go from here.