[Edit: the question mentions only Concurrent Haskell, but the linked document appears to be "Composable Memory Transactions", the paper in which Haskell STM was first described. Please correct me if I am wrong.]
STM now works fine on multiple cores. A parallel implementation first shipped in GHC 6.6 and uses a fine-grained two-phase locking strategy: to commit a transaction, the implementation first tries to lock every variable involved in the transaction, then writes the changes, and finally unlocks all the variables. Acquiring a lock never blocks: if the lock is already held, the transaction aborts and retries (this avoids the usual lock-order-reversal deadlock that would arise if lock acquisition were blocking).
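The commit protocol described above can be sketched in a few lines of ordinary Haskell. This is only an illustration, not GHC's actual code (the real implementation lives in the runtime system): `Var`, `newVar`, `tryLockAll`, `unlockAll`, and `tryCommit` are hypothetical names, each variable is modelled as an `MVar ()` lock plus an `IORef`, the write set is assumed to contain distinct variables, and everything else a real STM needs (such as validating the values a transaction read, or exception safety) is left out.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, tryTakeMVar)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical stand-in for a transactional variable: a lock plus a value.
data Var = Var { varLock :: MVar (), varValue :: IORef Int }

newVar :: Int -> IO Var
newVar x = Var <$> newMVar () <*> newIORef x

-- Phase 1: try to lock every variable without ever blocking.
-- On the first lock that is already held, release what we hold and give up.
tryLockAll :: [Var] -> IO Bool
tryLockAll = go []
  where
    go _ [] = pure True
    go held (v : vs) = do
      got <- tryTakeMVar (varLock v)
      case got of
        Just () -> go (v : held) vs
        Nothing -> unlockAll held >> pure False

-- Release the given locks (each must currently be held by the caller).
unlockAll :: [Var] -> IO ()
unlockAll = mapM_ (\v -> putMVar (varLock v) ())

-- Lock, write, unlock. Returns False when the caller should abort and retry.
-- Assumes the variables in the write set are distinct.
tryCommit :: [(Var, Int)] -> IO Bool
tryCommit writes = do
  locked <- tryLockAll (map fst writes)
  if not locked
    then pure False
    else do
      mapM_ (\(v, x) -> writeIORef (varValue v) x) writes
      unlockAll (map fst writes)
      pure True
```

Because `tryTakeMVar` returns immediately instead of waiting, two committers that take their locks in opposite orders cannot deadlock: at worst one of them sees a held lock, backs out, and retries.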
This STM implementation is certainly not the fastest: the literature describes many alternative techniques that would give better performance. However, GHC's implementation is relatively simple and uses no global locks, so transactions operating on distinct sets of variables can proceed in parallel without interference.
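As a small illustration of transactions on distinct sets of variables, the sketch below runs two threads that each move units between their own pair of `TVar`s using the `stm` package. The names `transfer` and `runDisjoint` are made up for this example; to actually use several cores, the program would need to be built with `-threaded` and run with `+RTS -N`.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
import Control.Monad (replicateM_)

-- Move n units from one TVar to another inside a single transaction.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to n = do
  x <- readTVar from
  y <- readTVar to
  writeTVar from (x - n)
  writeTVar to (y + n)

-- Two workers, each touching its own pair of TVars (a/b and c/d).
-- Returns the final contents of a, b, c, d.
runDisjoint :: Int -> IO (Int, Int, Int, Int)
runDisjoint rounds = do
  a <- newTVarIO 1000
  b <- newTVarIO 0
  c <- newTVarIO 1000
  d <- newTVarIO 0
  done1 <- newEmptyMVar
  done2 <- newEmptyMVar
  _ <- forkIO $ do
    replicateM_ rounds (atomically (transfer a b 1))
    putMVar done1 ()
  _ <- forkIO $ do
    replicateM_ rounds (atomically (transfer c d 1))
    putMVar done2 ()
  -- Wait for both workers before reading the results.
  takeMVar done1
  takeMVar done2
  (,,,) <$> readTVarIO a <*> readTVarIO b <*> readTVarIO c <*> readTVarIO d
```

The two workers never share a variable, so neither should ever find the other's locks held at commit time.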
Simon Marlow