Modeling Generalized Behaviors and Imprisoning Side Effects

Generalizing behaviors with type classes

Often when writing code, you’ll find yourself writing functions for several data structures which have the same name and general structure. This results in a lot of code duplication, so you start thinking of ways to reduce the amount of code you’ll end up having to maintain. You may have heard of this concept being referred to as Don’t Repeat Yourself (DRY).
If you’re an imperative programmer, you start thinking about how to use the abstractions you are familiar with to deal with this problem, things like class hierarchies and generics.
Haskell has a simple way of capturing these shared behaviors in one place, using type classes. Let’s look at a simple type class:

type Distance = Float
type Resource = Float

class Vehicle v where
    travel :: v -> Distance -> (Distance, v)
    refill :: v -> Resource -> v

The type class has a parameter v, which represents the specific type. This type class defines the behaviors that are common to all vehicles; they travel some distance and eventually need to be refilled with some resource, like fuel or electricity, which they require to keep moving.
travel evaluates to a tuple with the new state of the vehicle and the distance which was traveled. refill just evaluates to a vehicle in its new refilled state.

Let’s look at a type that we might want to define as an instance of Vehicle:

data Gasoline = Gasoline Float
data Efficiency = Efficiency Float

data Car = Car Gasoline Efficiency

Car represents a car which has a tank full of gasoline and an efficiency in miles per gallon.

We can declare that Car behaves like a Vehicle by declaring that it is an instance of the Vehicle type class using the instance keyword:

instance Vehicle Car where
    travel (Car (Gasoline gas) (Efficiency efficiency)) distanceToTravel =
        (distanceTraveled,
         Car (Gasoline gasRemaining) (Efficiency efficiency))
            where gasToSpend = distanceToTravel / efficiency
                  gasDifference = gas - gasToSpend
                  gasRemaining = max gasDifference 0.0
                  gasSpent = gasToSpend - (max (-gasDifference) 0.0)
                  distanceTraveled = gasSpent * efficiency
    refill (Car (Gasoline gas) (Efficiency efficiency)) refillGas = Car (Gasoline (gas + refillGas)) (Efficiency efficiency)

The where expression here allows us to define local constants to be used in the function body. Simply create a list of constant names, followed by = and the constant definition. You can only use constants defined by where in the function body where they are defined. Local constants can be useful to avoid re-computing values by storing the result of a function call in the constant.
In order to declare that Car behaves like a Vehicle, we need to define the travel and refill functions for Car.
In the travel function, we define that the distance traveled can’t be any more than the gas in the tank will permit it to. The function also evaluates to a Car with some gas spent, ensuring that the gas can’t fall below zero.
We define refill for Car to simply evaluate to a Car with an increased amount of gas.

Since Vehicle represents a set of behaviors that are shared by multiple types, let’s look at another instance of Vehicle:

data Hamburger = Hamburger Int

data Unicycle = Unicycle Hamburger

instance Vehicle Unicycle where
    travel (Unicycle (Hamburger hamburgers)) distanceToTravel =
        (distanceTraveled,
         Unicycle (Hamburger hamburgersRemaining))
            where hamburgersToSpend = round distanceToTravel
                  hamburgersDifference = hamburgers - hamburgersToSpend
                  hamburgersRemaining = max hamburgersDifference 0
                  hamburgersSpent = hamburgersToSpend - (max (-hamburgersDifference) 0)
                  distanceTraveled = fromIntegral hamburgersSpent
    refill (Unicycle (Hamburger hamburgers)) hamburgersRefill = Unicycle (Hamburger (hamburgers + (round hamburgersRefill)))

It’s necessary to round down with round because we defined that the function should take a Float and evaluate to a Float in the type class.
Here we see a Unicycle which is powered by Hamburgers (the person riding the unicycle needs lots of calories to cover the distance). travel consumes one Hamburger per mile and refill only replenishes Hamburgers up to the nearest whole number.
This example demonstrates that the type instances of a type class can be completely different. They can have different value constructor type signatures and even different data, as long as they implement the interface defined by the type class.

Let’s look at how we can write a function which operates on a type class:

journey :: Vehicle v => (Distance, v) -> [Distance] -> (Distance, v)
journey (distanceTraveled, vehicle) [] = (distanceTraveled, vehicle)
journey (distanceTraveled, vehicle) (d : ds) =
    journey ((distanceTraveled + distanceTraveledNow), vehicleNow) ds
        where (distanceTraveledNow, vehicleNow) = travel vehicle d

You can see that there is a generic type v in the parameter list. Because we want to define that v is a Vehicle, we declare that v is an instance of the Vehicle type class in the section following the :: and before the => symbols in the function signature.
This function will work with any type that is an instance of Vehicle, so both of the following function evaluations are valid:

journey 0.0 (Car (Gasoline 50.0) (Efficiency 1.4)) [1.0, 2.0, 5.0, 10.0]
journey 0.0 (Unicycle (Hamburger 10)) [2.0, 3.0]

You can also define a function with multiple type classes. For example, suppose we have another type class Passenger, we could define a function which used both. The function signature might look something like this:

rideVehicle :: (Vehicle v, Passenger p) => v -> p -> Distance

Meta-types and type-classes

It’s possible for a meta-type to be an instance of a type class. For example, suppose we have defined a Hybrid meta-type that can take a type of fuel as a parameter:

data Hybrid fuel = Hybrid fuel

For now we’ll assume that it’s possible to get the efficiency of a fuel by evaluating:

efficiency fuel

We’ll also assume that there are toFloat and fromFloat functions for the fuel type.
We can make this an instance of the Vehicle type class in the same way that we did for Car and Unicycle above:

instance Vehicle (Hybrid f) where
    travel (Hybrid fuel) distanceToTravel =
        (distanceTraveled,
         Hybrid (fromFloat fuelRemaining))
            where fuelToSpend = distanceToTravel / (efficiency fuel)
                  fuelDifference = (toFloat fuel) - fuelToSpend
                  fuelRemaining = max fuelDifference 0.0
                  fuelSpent = fuelToSpend - (max (-fuelDifference) 0.0)
                  distanceTraveled = fuelSpent * (efficiency fuel)
    refill (Hybrid fuel) refillFuel = Hybrid (refuel fuel (fromFloat refillFuel))

There is a small problem with this Hybrid class, it is an instance of Vehicle which can take anything as fuel. It doesn’t make much sense to have a Hybrid contain Int or String as fuel. We should constrain the Hybrid so that it can only contain valid fuel.

The first step in constraining the fuel type is to make a type class for types which behave like fuel:

class Fuel f where
    fromFloat :: Float -> f
    toFloat :: f -> Float
    efficiency :: f -> Float
    refuel :: f -> f -> f

We’ve made it so that fuel efficiency can be measured with efficiency and that fuel can be refueled with refuel.

Let’s make an instance of this Fuel type class:

data Biofuel = Biofuel Float

instance Fuel Biofuel where
    fromFloat f = Biofuel f
    toFloat (Biofuel f) = f
    efficiency (Biofuel f) = 45.0
    refuel (Biofuel a) (Biofuel b) = (Biofuel (a + b))

Now we can use Biofuel as fuel for our Hybrid.

Let’s ensure that the fuel type parameter which we used in the instance declaration for Hybrid is a valid instance of Fuel. All we need to do is change the first line of the instance statement for Hybrid which we wrote before:

instance (Fuel f) => Vehicle (Hybrid f) where
    travel (Hybrid fuel) distanceToTravel =
        (distanceTraveled,
         Hybrid (fromFloat fuelRemaining))
            where fuelToSpend = distanceToTravel / (efficiency fuel)
                  fuelDifference = (toFloat fuel) - fuelToSpend
                  fuelRemaining = max fuelDifference 0.0
                  fuelSpent = fuelToSpend - (max (-fuelDifference) 0.0)
                  distanceTraveled = fuelSpent * (efficiency fuel)
    refill (Hybrid fuel) refillFuel = Hybrid (refuel fuel (fromFloat refillFuel))

The difference here is that we declare the type class of f in the section between the instance keyword and the => symbol.
Now we have a compile time constraint that f must be an instance of the type class Fuel if we evaluate travel or refill.

Subclassing

Sometimes, we want to create a type class which has all of the behaviors of an existing type class but adds more. For example, suppose we want to have a type class PassengerVehicle which adds two more behaviors to board and disembark passengers, we could make the following type class:

class (Vehicle v) => PassengerVehicle v where
    board :: v -> Int -> v
    disembark :: v -> Int -> v

As you can see, we declare the superclass between class and the => symbol.
In addition to board and disembark, instances of the PassengerVehicle type class will have to define all of the behaviors of its superclass, Vehicle.

Common type classes

Let’s look at some common type classes in Haskell.

Eq

Eq represents types which can be compared to one another to determine if they are equal. Eq defines two behaviors, == and /=, which we’ve seen before.

class  Eq a  where
        (==), (/=)  ::  a -> a -> Bool
        x /= y  = not (x == y)
        x == y  = not (x /= y)

Eq demonstrates another type of constraint that can be applied to a type class. You can write Boolean constraints defining the relationship between behaviors in a type class. Here, the Haskell standard defines that == and /= are Boolean complements of one another for any pair of instances of the Eq class. If you define an instance of Eq, you should make sure that this is true for your instance.
Some instances of Eq include: Bool, Int, Integer, Float, Double, Char, List, and Tuples.

Ord

Ord represents types which can be greater or less than one another. In other words, it is possible to order instances of this type class from least to greatest with a total ordering. Ord is a subclass of Eq, as you can see, a is both an Ord and an Eq in the class declaration.
This type class depends on a type called Ordering, which has five value constructors, EQ, LT, GT, LE, GE, representing equal to, less than, greater than, less than or equal to, and greater than or equal to.

class  (Eq a) => Ord a  where
    compare              :: a -> a -> Ordering
    () :: a -> a -> Bool
    max, min             :: a -> a -> a
    compare x y | x == y    = EQ
                | x <= y    = LT
                | otherwise = GT
...

We can see some new syntax here in the compare function. The | symbols are used to define guards, which we will cover in the next post.

Some instances of Ord include: Bool, Int, Integer, Float, Double, Char, List, and Tuples.

Read and Show

Read and Show represent types which can be converted from and to String respectively.

class  Read a  where
...

class  Show a  where
    show      :: a -> String 
...

You can convert an instance of Show to a String using the show function:

show 5 --This evaluates to "5", or equivalently ['5']

You can convert an instance of Read from a String into the instance type using the read function:

read "5" + 50 --This evaluates to the integral value 55

Some instances of Read and Show include: Bool, Int, Integer, Float, and Double.

Enum

Enum represents types which can be ordered into a sequence.

class  Enum a  where
    succ, pred     :: a -> a
    toEnum         :: Int -> a
    fromEnum       :: a -> Int
    enumFrom       :: a -> [a]            -- [n..]
    enumFromThen   :: a -> a -> [a]       -- [n,n'..]
    enumFromTo     :: a -> a -> [a]       -- [n..m]
    enumFromThenTo :: a -> a -> a -> [a]  -- [n,n'..m]

You can get the predecessor of a value with the pred function:

pred 2 --This evaluates to 1

You can get the successor of a value with the succ function:

succ 'a' --This evaluates to 'b'

Some instances of Enum include: Int, Integer, and Char.

Bounded

class  Bounded a  where
    minBound, maxBound :: a

Bounded represents types which have a minimum and maximum value.
Instances of Bounded include: Int, Integer, Char, Float, and Double.

Making instances of standard type classes

It’s very easy to make your own type into an instance of a standard type class. For example, let’s make our Unicycle and Hamburger types into instances of standard types:

data Hamburger = Hamburger Int deriving (Show, Eq, Ord, Bounded)

data Unicycle = Unicycle Hamburger deriving (Show)

This is way easier than using the instance keyword!
While it is possible to create your own type classes in such a way that someone can use deriving to make instances of them, learning how to do so is out of the scope of this post. (Check out the Data.Derive package if you really want to know how).

Don’t fire the missiles by accident

Up to this point, all of the functions we’ve written have been referentially transparent and side-effect free. As I said before, writing useful software requires side-effects, but we don’t want to have to sacrifice the guarantees we’ve had up to this point about our code not having unexpected behaviors.

Suppose that, hypothetically, the designers of Haskell allowed side effects in some ordinary functions. Many other functional languages permit side-effects this way. Let’s imagine that an alternate reality Haskell had been designed with a rule like “All functions with side effects must start with an _”. This seems like a reasonable idea, but consider the following hypothetical example where a careless programmer copied a function from a missile silo into a theme park control system:

_fireTheMissiles :: Bool -> ()
_fireTheMissiles = ... --It's very bad if this happens by accident

--Compute how much money the theme park made today
computeTicketPriceTotal :: [Float] -> Float
computeTicketPriceTotal [] = ...
computeTicketPriceTotal t : ts =
    ...
    (_fireTheMissiles (thisTicketPrice < 5.0)) --This has a side effect!
    ...

Although the programmer defined the missile firing function using our _ rule to mark that the function has some scary side effects, the function can still be called inside another function which is supposed to be side-effect free. If the function computeTicketPriceTotal is large, the programmer might miss the fact that the _fireTheMissiles function may be evaluated, which would be bad. Even if the programmer searches all of their code for _fireTheMissiles and tries to figure out when it should and shouldn’t be called, the fact that the function can be passed as a parameter into another function makes this problem very difficult to solve.

It would be much better if the type system could prevent this from ever happening. This is exactly what the designers of Haskell built into the language.

There is a way we can have side effects and still be able to tell at a glance whether a function is pure or whether you need to worry about side effects, we can use monads.

Finally, monads

Here’s the idea behind monads. We want to be able to tell at a glance what types of side effects or unexpected behaviors might be present in a function. Furthermore, we want to ensure that whenever a side effect could have happened, it can’t get lost by being passed to another function or data.
Think of side effects as polluting your program. Anything which has been touched by a side effect or receives an input from a function which had a side effect could potentially have unexpected behavior.
Monads are prisons for side-effects; when the result of a function with a side-effect enters a monad, it can never leave. The way this is achieved in Haskell is by using a type class, called Monad.

Let’s look at Monad:

class Monad m  where
    (>>=)  :: m a -> (a -> m b) -> m b
    (>>)   :: m a -> m b -> m b
    return :: a -> m a
    fail   :: String -> m a

    m >> k =  m >>= \_ -> k
    fail s = error s

This looks a little confusing at first, so let’s break it down.

return takes a value which isn’t in the monad and puts it in the monad. For example, the following function puts the String “Elephant” into the IO monad (IO is an instance of Monad which is used to represent Input and Output side effects):

returnElephant :: IO String
returnElephant = return "Elephant"

The type of the expression return “Elephant” here is IO String.

The >>= operator, called the “bind” operator, takes a value which is in the monad, m a, and function (a -> m b) as a parameter. The function (a -> m b) takes a value which is not in the monad as an input and returns a value that is in the monad. For example, the function putStr :: String -> IO () takes a String which is pure and returns IO (). As a side effect, putStr prints the String to the standard output.

putStr "Elephant" --This prints the string "Elephant" to standard out
return "Elephant" >>= putStr --This also prints the string "Elephant" to standard out

>>= allows us to pass a value which is in the monad to a function which takes a value that isn’t in the monad as an input. Because of >>=, most functions which produce monadic values are written as if they only take pure values, significantly reducing unnecessary code duplication.
We’re using >>= as an infix operator here. >>= here takes the output of return “Elephant”, which has type IO String, and the function putStr, which has type String -> IO () as an input. The first thing that happens is that “Elephant” is put into IO and then it is bound to the input of putStr which prints it to screen.

The >> operator, which I like to call a “pass-through bind”, takes two monadic values and evaluates to a monadic value. This function is useful when you want to compute several monadic values in order, but the first monadic value is not consumed by the function which produces the second monadic value. For example, putStr and return both evaluate to IO values. If we want to call them one after another, we can use >> like this:

putStr "Elephant" >> return 10 --This evaluates to IO Integer

This prints the string “Elephant” to the standard output and then puts the integer 10 into the IO monad. The function evaluates to IO Integer because that’s the type of return 10.

fail is only used during pattern matching in a do-notation block, so we’ll cover it later.

Here’s an example of a function which takes a name and an age and prints it to the standard output:

printHappyBirthday :: String -> Int -> IO ()
printHappyBirthday name age = putStr "Happy birthday " >>
                              putStr name >>
                              putStr "! You're " >>
                              return (show age) >>=
                              putStr >>
                              putStr " now.\n"

\n is an escape character that causes the standard output to print a new line.
When we call:

printHappyBirthday "Shaun" 18

the program prints the following message to the console:

Happy birthday Shaun! You're 18 now.

Notice that, in general, there is no way to extract a value out of a monad. Also, any function which operates on a value which has been raised into a monad or computed using a monad has to include the monad meta-type’s name in its function signature. Consider this function signature:

printStrings :: [String] -> IO ()

Just by looking at it, we can tell that it produces I/O side effects.

Typically Haskell programs are split into two parts: functions with side-effects, and pure functions. It’s a good idea to limit the number of functions which deal with side-effects and test them thoroughly.

Hello world

Let’s look at an application of the IO monad:
The function putStrLn :: String -> IO () takes a String and evaluates to an IO () and as a side-effect, it prints the String to standard output.
Now we can finally make our first compiled application. Write the following into a file called hello.hs:

main = putStrLn "Hello World!"

main :: IO () is the standard entry point for a Haskell executable.
Compile the file using this command:

ghc hello.hs

Congratulations! You just made a program using Haskell called hello which prints “Hello World!” to the command line.

Continue reading Everyday Haskell.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s