Last time I discussed how to work with the Haskell Stack and random number generators using a simple Ecosystem Simulation as an example. This week I’ll continue using the Ecosystem Simulation example to discuss how to work with the *State* monad.

# State in Haskell

Since Haskell is a (mostly) pure functional language and each function is referentially transparent by default, you can’t just keep the state of a simulation in a variable and update it as the simulation evolves over time. Each call to your simulation must take in the current state and produce a new state as an output.

The simplest way to do this is to simply pass the old state as an argument to each function, but this results in a lot of extra code and makes your code into a chain of nested function calls like this:

function3 (state0, state1, state2) | (recursionComplete state0 state1 state2) = (state0, state1, state2) | otherwise = function3 ( function2 ( function1 ( function0 (state0, state1, state2))))

This code calls *function0*, *function1*, *function2*, and *function3* until the *recursionComplete* function evaluates to *True*. As you can see, the functions are listed in reverse order compared to the order in which they execute. Also note that this function would increase in size as the size of the tuple of state objects increases.

Ideally, we want something that looks more like an imperative program with classes where the functions are listed in order, like this recursive example using Python:

def function3(stateRecord): if recursionComplete(stateRecord): return stateRecord else: function0(stateRecord) function1(stateRecord) function2(stateRecord) return function3(stateRecord)

Notice how *function0*, *function1*, and *function2* don’t return a value? Since Python is not a pure functional language, *function0*, *function1*, and *function2* can update the variable *stateRecord* (depending on the contents of *function0*, *function1*, and *function2*; more information about that here).

Haskell doesn’t allow mutation of any constant, so we have to deal with passing constant values between function calls somehow. Fortunately, Haskell provides a solution to this problem which results in code which is relatively compact and where function calls are written in the same order as they will evaluate.

# The State monad

For the purpose of this post, let’s use this definition of a *State* monad from the understanding monads article:

newtype State s a = State {runState :: s -> (a, s)}

In practice, the *State* monad is actually defined using a monad transformer. I’ll be covering monad transformers in a later post when I’ve actually learned how they work.

You can import the *State* monad using the command:

import Control.Monad.State

As you can see above, the *State* monad actually wraps a function called *runState* which takes a state, of type *s*, and evaluates to a tuple, *(a, s)*, where *a* is the type of the result of the computation. So the *State* monad doesn’t represent a state itself, but rather a state processor which performs a stateful computation. In order to make a state processor, you can use the *state* function with another function as an argument:

incrementIntFunction :: Int -> (Int, Int) incrementIntFunction i = (i, i + 1) incrementInt :: State Int Int incrementInt = state incrementIntFunction

If you call *runState* with a *State* and an initial value, you get the result of the state processor, as well as a new state. For example, consider applying *incrementInt* to the initial value 4:

runState incrementInt 4 --Evaluates to (4, 5)

I like to think of the second value in the result tuple as some side-car data which is carried around by the computation.

Of course, the *State* monad is an instance of the *Monad* typeclass (you can read more about monads in my post Modeling Generalized Behaviors and Imprisoning Side Effects):

instance Monad (State s) where return :: a -> State s a return x = state (\s -> (x, s)) >>= :: State s a -> (a -> State s b) -> State s b p >>= k = state (\s0 -> let (x, s1) = runState p s0 in runState (k x) s1) >> :: State s a -> State s b -> State s b p0 >> p1 = state (\s0 -> let (x, s1) = runState p s0 in runState p1 s1)

Let’s examine what this means starting with the *return* function. In the case of the *State* monad, *return x* evaluates a state processor that takes any state and produces the value *x*. For example:

runState (return 5) "InitialState" --Evaluates to (5, "InitialState")

*return 5* evaluated to a state processor which produces the value *5*.

The *State* monad’s bind operation takes a state processor and a function which takes the output of the state processor and produces another state processor. It evaluates to a new state processor.

The >>= bind operation creates a new state processor which takes a state, *s0*, applies the first state processor, *p*, to that state using *runState*, producing a new value, *x*, and state *s1*. It then passes the new value into the function *k*, producing a new state processor *(k x)* which is applied to the new state *s1*.

The effect of all this is that the state processor *p* is run and its output is fed into the state processor created by *k*. This allows us to chain state processors together in a manner similar to an imperative program. For example:

runState (incrementInt >>= (\x -> state (\s -> (show (x * x), s + 1)))) 5 --Evaluates to ("25", 7)

This statement first runs *incrementInt* with an initial state of 5, which produces a tuple *(5, 6)* and then the value, *5*, is passed into the state processor, *state (\s -> (show (x * x), s + 1)))*, which squares the value and produces a tuple *(“25”, 7)*.

The >> bind operator is easier to understand, it simply runs the first state processor and uses the state of the first processor as the input to the second processor. For example:

runState (incrementInt >> incrementInt) 5 --Evaluates to (6, 7) runState (incrementInt >> (state (\x -> (show x, x)))) 5 --Evaluates to ("6", 6)

Notice how >> allows us to specify state processors in the same order as they are evaluated. It’s also useful that we don’t have to explicitly pass the current state to each state processor.

Now we can reproduce the *function3* example in a way that’s sort of similar to the imperative implementation:

data StateRecord = StateRecord {state0 :: TypeA, state1 :: TypeB, state2 :: TypeC} function0 :: State StateRecord StateRecord ... function1 :: State StateRecord StateRecord ... function2 :: State StateRecord StateRecord ... function3 :: State StateRecord StateRecord function3 = state (\s -> if recursionComplete s then (s, s) else runState (function0 >> function1 >> function2 >> function3) s)

I admit that it’s not quite as nice looking as the Python example, but the functions are at least listed in the same order they are executed and the state is not explicitly passed between the functions. If no conditionals are necessary, a state processor can simply consist of a chain of state processers interspersed with >> bind functions. With the use of do notation, *function3* can look a lot like imperative code:

function3 :: State StateRecord StateRecord function3 = state (\s -> if recursionComplete s then (s, s) else runState (do function0 function1 function2 function3) s)

I’m not a big fan of do notation in general, but I will concede that it makes a lot of sense to use it with the *State* monad.

Bear in mind that although this looks like imperative code, it’s still completely **referentially transparent** and, other than the side-car state data, it’s still **side-effect free**.

There are a few helper functions which are convenient to use with the *State* monad:

put :: s -> State s () put newState = state (\_ -> ((), newState)) get :: State s a get = state (\s -> (s, s))

*put* is used to insert a state into a stateful computation, ignoring whatever the state was before.

*get* is used to transfer the state into the value of our state processing, so it can be used in further computations.

evalState :: State s a -> s -> a evalState p s = fst (runState p s) execState :: State s a -> s -> s execState p s = snd (runState p s)

*evalState* is used to get the value from a state processor.

*execState* is used to get the state from a state processor.

# The state of the world

So how does this all work in a larger program with more state? In order to find out, I used the *State *monad to simulate a little world populated with virtual animals.

The entire ecosystem is modeled using a *WorldState* record type in src/World.hs:

data WorldState g = WorldState {iteration :: Int, io :: IO (), generator :: g, grid :: Matrix Creature}

The *WorldState* consists of an iteration count, an IO stream, a random number generator, and a grid of *Creatures*.

The *WorldState* is created using the function *makeWorld*:

makeWorld :: RandomGen g => Int -> IO () -> g -> Matrix Creature -> Maybe (WorldState g) makeWorld thisIteration thisIO thisGenerator thisGrid = Just (WorldState {iteration = thisIteration, io = thisIO, generator = thisGenerator, grid = thisGrid})

I added a call to *makeWorld* using the grid which I populated previously in *runSimulation*:

runSimulation :: IO () runSimulation = let width = 30 height = 30 initialGrid = initGrid width height generator = mkStdGen 126590563 (initialCount, newGenerator) = randomR (10 :: Int, floor ((fromIntegral (width * height)) * 0.1)) generator initialCoordinates = take initialCount (shuffle' ((,) <$> [1..width] <*> [1..height]) (width * height) newGenerator) initialPopulation = unfoldr (generatePopulation 70 25) (initialCoordinates, newGenerator) iGrid = populateGrid initialPopulation initialGrid in putStrLn ("Population simulation with " ++ (show initialCount) ++ " creatures.\n") >> performIO (evalState simulate (iGrid >>= makeWorld 0 (return ()) newGenerator))

Note that I used the >>= bind to bind the result of the *populateGrid* function, which is a *Maybe (Matrix Creature),* to the *makeWorld* function. The result is a *Maybe (Matrix Creature)*, which is used as the initial state of the simulation.

I call *evalState* on the *simulate* state processor, which performs the simulation on the world. Then I call *performIO* on the *Maybe (WorldState g)* to print the state of the world to standard output.

Before we dive into the *simulate* state processor, let’s quickly look at the definition of *performIO*:

performIO :: RandomGen g => Maybe (WorldState g) -> IO () performIO Nothing = return () performIO (Just (WorldState {io = thisIO})) = thisIO

*performIO* simply evaluates to the *IO ()* field of the *WorldState* record.

This is how I defined the *simulate* state processor:

simulate :: RandomGen g => State (Maybe (WorldState g)) (Maybe (WorldState g)) simulate = maybeStep simulateWorld >> maybeStep printWorld >> maybeStep waitWorld >> maybeStep incrementWorld >>= (\worldState -> case worldState of Nothing -> get Just (WorldState {iteration = thisIteration}) -> if thisIteration > 1000 then get else simulate)

The simulator consists of a chain of sub-state processors, chained with >> binds. The simulation first simulates the world, then prints it, waits for user input, increments the iteration counter and checks if *thisIteration* is > 1000, in which case I call get to terminate the simulation and move the state into the value of the state processor. If *thisIteration* is <= 1000, the simulation continues recursively.

*maybeStep* converts a function which operates on a *Maybe (WorldState g)* into a state processor which acts on a *Maybe (WorldState g)*:

maybeStep :: RandomGen g => (WorldState g -> (Maybe (WorldState g))) -> State (Maybe (WorldState g)) (Maybe (WorldState g)) maybeStep updateFunction = state (\worldState -> let newWorldState = worldState >>= updateFunction --worldState has type Mabye (WorldState g) in (newWorldState, newWorldState))

*simulateWorld* updates the state of the world:

simulateWorld :: RandomGen g => WorldState g -> Maybe (WorldState g) simulateWorld worldState = Just (updateWorld worldState)

*printWorld* adds a *printGrid* call to the IO stream of the *WorldState*:

printWorld :: RandomGen g => WorldState g -> Maybe (WorldState g) printWorld worldState@(WorldState {io = thisIO, grid = thisGrid}) = Just (setIO (thisIO >> printGrid thisGrid) worldState)

*waitWorld* adds some print-outs and a *getChar* to the IO stream of the *WorldState*:

waitWorld :: RandomGen g => WorldState g -> Maybe (WorldState g) waitWorld worldState@(WorldState {io = thisIO}) = Just (setIO (thisIO >> putStrLn "-----" >> hFlush stdout >> getChar >> return ()) worldState)

*incrementWorld* increments the current iteration of the *WorldState*:

incrementWorld :: RandomGen g => WorldState g -> Maybe (WorldState g) incrementWorld worldState = Just (incrementIteration worldState)

Working with the *State* monad made it easy to define the stages of the simulation, and adjust the order in which operations are performed. It also made it much easier to write concise function definitions when a sequence of operations needs to be performed, like in the simulate function. Compared to my method of handling state in Making a Text Adventure in Haskell, this is much more readable and easier to debug.

We’ll cover the *updateWorld* function, which updates the creatures in the world, in my next post.

The code for this simulation is available at Ecosystem Simulation.

Resources:

Haskell/Understanding monads/State