Making an Ecosystem Simulation in Haskell (Part 2)

Last time I discussed how to work with the Haskell Stack and random number generators using a simple Ecosystem Simulation as an example. This week I’ll continue using the Ecosystem Simulation example to discuss how to work with the State monad.

State in Haskell

Since Haskell is a (mostly) pure functional language and each function is referentially transparent by default, you can’t just keep the state of a simulation in a variable and update it as the simulation evolves over time. Each call to your simulation must take in the current state and produce a new state as an output.

The most direct way to do this is to pass the old state as an argument to each function, but this results in a lot of boilerplate and turns your code into a chain of nested function calls like this:

function3 (state0, state1, state2)
    | (recursionComplete state0 state1 state2) = (state0, state1, state2)
    | otherwise = function3 (
                      function2 (
                          function1 (
                              function0 (state0, state1, state2))))

This code calls function0, function1, function2, and function3 until the recursionComplete function evaluates to True. As you can see, the functions are listed in reverse order compared to the order in which they execute. Also note that this function would increase in size as the size of the tuple of state objects increases.

Ideally, we want something that looks more like an imperative program, where the functions are listed in the order they execute, like this recursive example in Python:

def function3(stateRecord):
    if recursionComplete(stateRecord):
        return stateRecord
    else:
        function0(stateRecord)
        function1(stateRecord)
        function2(stateRecord)
        return function3(stateRecord)

Notice how function0, function1, and function2 don’t return a value? Since Python is not a pure functional language, function0, function1, and function2 can mutate stateRecord in place (provided stateRecord is a mutable object, such as a class instance or a dictionary).

Haskell doesn’t allow mutation of bound values, so we have to pass immutable state between function calls somehow. Fortunately, Haskell provides a solution to this problem which results in relatively compact code where function calls are written in the same order as they evaluate.

The State monad

For the purpose of this post, let’s use this definition of a State monad from the understanding monads article:

newtype State s a = State {runState :: s -> (a, s)}

In practice, the State monad is actually defined using a monad transformer. I’ll be covering monad transformers in a later post when I’ve actually learned how they work.

You can import the State monad with:

import Control.Monad.State

As you can see above, the State monad actually wraps a function called runState which takes a state, of type s, and evaluates to a tuple, (a, s), where a is the type of the result of the computation. So the State monad doesn’t represent a state itself, but rather a state processor which performs a stateful computation. In order to make a state processor, you can use the state function with another function as an argument:

incrementIntFunction :: Int -> (Int, Int)
incrementIntFunction i = (i, i + 1)

incrementInt :: State Int Int
incrementInt = state incrementIntFunction

If you call runState with a State and an initial value, you get the result of the state processor, as well as a new state. For example, consider applying incrementInt to the initial value 4:

runState incrementInt 4 --Evaluates to (4, 5)

I like to think of the second value in the result tuple as some side-car data which is carried around by the computation.

Of course, the State monad is an instance of the Monad typeclass (you can read more about monads in my post Modeling Generalized Behaviors and Imprisoning Side Effects):

instance Monad (State s) where
    return :: a -> State s a
    return x = state (\s -> (x, s))

    (>>=) :: State s a -> (a -> State s b) -> State s b
    p >>= k = state (\s0 ->
                     let (x, s1) = runState p s0
                     in runState (k x) s1)

    (>>) :: State s a -> State s b -> State s b
    p0 >> p1 = state (\s0 ->
                       let (_, s1) = runState p0 s0
                       in runState p1 s1)

Let’s examine what this means, starting with the return function. In the case of the State monad, return x evaluates to a state processor that takes any state and produces the value x, leaving the state unchanged. For example:

runState (return 5) "InitialState" --Evaluates to (5, "InitialState")

return 5 evaluated to a state processor which produces the value 5.

The State monad’s bind operation takes a state processor and a function which takes the output of the state processor and produces another state processor. It evaluates to a new state processor.

The >>= bind operation creates a new state processor which takes a state, s0, applies the first state processor, p, to that state using runState, producing a new value, x, and state s1. It then passes the new value into the function k, producing a new state processor (k x) which is applied to the new state s1.

The effect of all this is that the state processor p is run and its output is fed into the state processor created by k. This allows us to chain state processors together in a manner similar to an imperative program. For example:

runState (incrementInt >>= (\x -> state (\s -> (show (x * x), s + 1)))) 5 --Evaluates to ("25", 7)

This statement first runs incrementInt with an initial state of 5, producing the tuple (5, 6). The value, 5, is then passed into the state processor state (\s -> (show (x * x), s + 1)), which squares the value, converts it to a string, and increments the state, producing the tuple ("25", 7).

The >> bind operator is easier to understand: it simply runs the first state processor, discards its value, and uses the resulting state as the input to the second state processor. For example:

runState (incrementInt >>
          incrementInt) 5 --Evaluates to (6, 7)
runState (incrementInt >>
          (state (\x -> (show x, x)))) 5 --Evaluates to ("6", 6)

Notice how >> allows us to specify state processors in the same order as they are evaluated. It’s also useful that we don’t have to explicitly pass the current state to each state processor.

Now we can reproduce the function3 example in a way that’s sort of similar to the imperative implementation:

data StateRecord = StateRecord {state0 :: TypeA,
                                state1 :: TypeB,
                                state2 :: TypeC}

function0 :: State StateRecord StateRecord
...

function1 :: State StateRecord StateRecord
...

function2 :: State StateRecord StateRecord
...

function3 :: State StateRecord StateRecord
function3 = state (\s ->
                   if recursionComplete s
                   then (s, s)
                   else runState (function0 >>
                                  function1 >>
                                  function2 >>
                                  function3) s)

I admit that it’s not quite as nice looking as the Python example, but the functions are at least listed in the same order they are executed and the state is not explicitly passed between the functions. If no conditionals are necessary, a state processor can simply consist of a chain of state processors interspersed with >> bind functions. With the use of do notation, function3 can look a lot like imperative code:

function3 :: State StateRecord StateRecord
function3 = state (\s ->
                   if recursionComplete s
                   then (s, s)
                   else runState (do function0
                                     function1
                                     function2
                                     function3) s)

I’m not a big fan of do notation in general, but I will concede that it makes a lot of sense to use it with the State monad.

Bear in mind that although this looks like imperative code, it’s still completely referentially transparent and, other than the side-car state data, it’s still side-effect free.

There are a few helper functions which are convenient to use with the State monad:

put :: s -> State s ()
put newState = state (\_ -> ((), newState))

get :: State s s
get = state (\s -> (s, s))

put is used to insert a state into a stateful computation, ignoring whatever the state was before.

get is used to copy the current state into the value of our state processor, so it can be used in further computations.
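
For example, using a simple Int state (these evaluations follow directly from the definitions above):

runState (put 10) 5        --Evaluates to ((), 10)
runState get 5             --Evaluates to (5, 5)
runState (put 10 >> get) 5 --Evaluates to (10, 10)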

evalState :: State s a -> s -> a
evalState p s = fst (runState p s)

execState :: State s a -> s -> s
execState p s = snd (runState p s)

evalState is used to get the value from a state processor.

execState is used to get the state from a state processor.
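
Using the incrementInt processor from earlier:

evalState incrementInt 4 --Evaluates to 4
execState incrementInt 4 --Evaluates to 5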

The state of the world

So how does this all work in a larger program with more state? In order to find out, I used the State monad to simulate a little world populated with virtual animals.

The entire ecosystem is modeled using a WorldState record type in src/World.hs:

data WorldState g = WorldState {iteration :: Int,
                                io :: IO (),
                                generator :: g,
                                grid :: Matrix Creature}

The WorldState consists of an iteration count, an IO stream, a random number generator, and a grid of Creatures.

The WorldState is created using the function makeWorld:

makeWorld :: RandomGen g => Int -> IO () -> g -> Matrix Creature -> Maybe (WorldState g)
makeWorld thisIteration thisIO thisGenerator thisGrid
    = Just (WorldState {iteration = thisIteration,
                        io = thisIO,
                        generator = thisGenerator,
                        grid = thisGrid})

I added a call to makeWorld using the grid which I populated previously in runSimulation:

runSimulation :: IO ()
runSimulation = let width = 30
                    height = 30
                    initialGrid = initGrid width height
                    generator = mkStdGen 126590563
                    (initialCount, newGenerator) = randomR (10 :: Int, floor ((fromIntegral (width * height)) * 0.1)) generator
                    initialCoordinates = take initialCount (shuffle' ((,) <$> [1..width] <*> [1..height]) (width * height) newGenerator)
                    initialPopulation = unfoldr (generatePopulation 70 25) (initialCoordinates, newGenerator)
                    iGrid = populateGrid initialPopulation initialGrid
                in putStrLn ("Population simulation with " ++ (show initialCount) ++ " creatures.\n") >>
                   performIO (evalState simulate (iGrid >>= makeWorld 0 (return ()) newGenerator))

Note that I used the >>= bind to bind the result of the populateGrid function, which is a Maybe (Matrix Creature), to the makeWorld function. The result is a Maybe (WorldState g), which is used as the initial state of the simulation.

I call evalState on the simulate state processor, which performs the simulation on the world. Then I call performIO on the Maybe (WorldState g) to print the state of the world to standard output.

Before we dive into the simulate state processor, let’s quickly look at the definition of performIO:

performIO :: RandomGen g => Maybe (WorldState g) -> IO ()
performIO Nothing = return ()
performIO (Just (WorldState {io = thisIO})) = thisIO

performIO simply evaluates to the IO () field of the WorldState record.

This is how I defined the simulate state processor:

simulate :: RandomGen g => State (Maybe (WorldState g)) (Maybe (WorldState g))
simulate = maybeStep simulateWorld >>
           maybeStep printWorld >>
           maybeStep waitWorld >>
           maybeStep incrementWorld >>=
           (\worldState -> 
             case worldState of
             Nothing -> get
             Just (WorldState {iteration = thisIteration})
                -> if thisIteration > 1000
                   then get
                   else simulate)

The simulator is a chain of sub-state processors joined with >> binds. The simulation first simulates the world, then prints it, waits for user input, and increments the iteration counter. The result of the final step is bound with >>=: if the world has become Nothing, or if thisIteration is greater than 1000, I call get to terminate the simulation and move the final state into the value of the state processor; otherwise the simulation continues recursively.

maybeStep converts an update function of type WorldState g -> Maybe (WorldState g) into a state processor which acts on a Maybe (WorldState g):

maybeStep :: RandomGen g => (WorldState g -> (Maybe (WorldState g))) -> State (Maybe (WorldState g)) (Maybe (WorldState g))
maybeStep updateFunction = state (\worldState -> let newWorldState = worldState >>= updateFunction --worldState has type Maybe (WorldState g)
                                                 in (newWorldState, newWorldState))
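
To see the pattern in isolation, here is a minimal sketch using a plain Int state instead of a WorldState (safeIncrement and maybeStepInt are made-up names for this illustration, not part of the simulation code, and it assumes the Control.Monad.State import from earlier). The key point is that once the state becomes Nothing, every later step short-circuits, because Nothing >>= updateFunction is always Nothing:

safeIncrement :: Int -> Maybe Int --Fails when the counter is negative
safeIncrement n
    | n < 0     = Nothing
    | otherwise = Just (n + 1)

maybeStepInt :: (Int -> Maybe Int) -> State (Maybe Int) (Maybe Int) --Same shape as maybeStep
maybeStepInt updateFunction = state (\n -> let newN = n >>= updateFunction
                                           in (newN, newN))

runState (maybeStepInt safeIncrement >> maybeStepInt safeIncrement) (Just 0) --Evaluates to (Just 2, Just 2)
runState (maybeStepInt safeIncrement) Nothing --Evaluates to (Nothing, Nothing)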

simulateWorld updates the state of the world:

simulateWorld :: RandomGen g => WorldState g -> Maybe (WorldState g)
simulateWorld worldState = Just (updateWorld worldState)

printWorld adds a printGrid call to the IO stream of the WorldState:

printWorld :: RandomGen g => WorldState g -> Maybe (WorldState g)
printWorld worldState@(WorldState {io = thisIO, grid = thisGrid})
    = Just (setIO (thisIO >> printGrid thisGrid) worldState)

waitWorld adds some print-outs and a getChar to the IO stream of the WorldState:

waitWorld :: RandomGen g => WorldState g -> Maybe (WorldState g)
waitWorld worldState@(WorldState {io = thisIO})
    = Just (setIO (thisIO >> putStrLn "-----" >> hFlush stdout >> getChar >> return ()) worldState)

incrementWorld increments the current iteration of the WorldState:

incrementWorld :: RandomGen g => WorldState g -> Maybe (WorldState g)
incrementWorld worldState = Just (incrementIteration worldState)

Working with the State monad made it easy to define the stages of the simulation, and adjust the order in which operations are performed. It also made it much easier to write concise function definitions when a sequence of operations needs to be performed, like in the simulate function. Compared to my method of handling state in Making a Text Adventure in Haskell, this is much more readable and easier to debug.

We’ll cover the updateWorld function, which updates the creatures in the world, in my next post.

The code for this simulation is available at Ecosystem Simulation.

Resources:

State Monad

Haskell/Understanding monads/State
