Notes on Failing to Understand Software Transactional Memory
I am writing some library to easily implement event sourced services in Haskell based on previous experience at Capital Match, and while doing so I rewrote a simple file-based event store. This store communicates with core service using TBQueue, a bounded queue implemented over Haskell’s STM. It took me couple of hours on Friday to solve a BlockedIndefinitelyOnSTM bug I was facing while testing this simple store. So today I posted a question about STM on Stack Overflow, as I did not have a clear intuition on why my code was failing, hence why my fix was correct.
The code of the store boils down to the following simple model.
First some useful imports…
{-# LANGUAGE ScopedTypeVariables #-}
module Main where
import Control.Concurrent.Async
import Control.Concurrent.STM
import Control.Exception
import Control.Monad (forever)
import Hevents.Eff
import System.IO
The store dequeues some operation from a given queue, writes the operation’s string to stdout
then put back the length of the written string into a TMVar, which models communicating the result of the operation back to the caller.
data Op = Op String (TMVar Int)
storerun :: TBQueue Op -> IO ()
= do
storerun q <- openFile "store.test" ReadWriteMode
h NoBuffering
hSetBuffering h $ do
forever Op s v <- atomically $ readTBQueue q
hPutStrLn h s$ putTMVar v (length s) atomically
The main
function is responsible for creating the jobs queue, starting the “store” in a separate thread then reading lines from stdin
and feeding them as “operations” for the store.
main :: IO ()
= do
main <- newTBQueueIO 100
q <- async $ storerun q
_
storeInput qwhere
= forever $ do
storeInput q <- getLine
l <- newEmptyTMVarIO
v <- atomically $ do
r Op l v)
writeTBQueue q ( takeTMVar v
This code deadlocks because STM are actually - surprise! - transactions: They do all of their operations, or nothing, and they are serialized. Hence the following block:
<- atomically $ do
r Op l v)
writeTBQueue q ( takeTMVar v
…can succeeds if and only if it can atomically put an operation in the queue and read the result back from v
. Which of course is not possible because the result is put back after the operation is read from the queue in another transaction. The correct code is:
$ writeTBQueue q (Op l v)
atomically <- atomically $ takeTMVar v r
Pretty obvious, in retrospect. As the person who answered my question on SO, there is no way for two STM transactions to exchange information.