# Generalizing behaviors with type classes

Often when writing code, you’ll find yourself writing functions for several data structures which have the same name and general structure. This results in a lot of code duplication, so you start thinking of ways to reduce the amount of code you’ll end up having to maintain. You may have heard of this concept being referred to as Don’t Repeat Yourself (DRY).

If you’re an imperative programmer, you start thinking about how to use the abstractions you are familiar with to deal with this problem, things like class hierarchies and generics.

Haskell has a simple way of capturing these shared behaviors in one place, using **type classes**. Let’s look at a simple type class:

type Distance = Float type Resource = Float class Vehicle v where travel :: v -> Distance -> (Distance, v) refill :: v -> Resource -> v

The type class has a parameter *v*, which represents the specific type. This type class defines the behaviors that are common to all vehicles; they travel some distance and eventually need to be refilled with some resource, like fuel or electricity, which they require to keep moving.

*travel* evaluates to a tuple with the new state of the vehicle and the distance which was traveled. *refill* just evaluates to a vehicle in its new refilled state.

Let’s look at a type that we might want to define as an instance of *Vehicle*:

data Gasoline = Gasoline Float data Efficiency = Efficiency Float data Car = Car Gasoline Efficiency

*Car* represents a car which has a tank full of gasoline and an efficiency in miles per gallon.

We can declare that *Car* behaves like a *Vehicle* by declaring that it is an instance of the *Vehicle* type class using the *instance* keyword:

instance Vehicle Car where travel (Car (Gasoline gas) (Efficiency efficiency)) distanceToTravel = (distanceTraveled, Car (Gasoline gasRemaining) (Efficiency efficiency)) where gasToSpend = distanceToTravel / efficiency gasDifference = gas - gasToSpend gasRemaining = max gasDifference 0.0 gasSpent = gasToSpend - (max (-gasDifference) 0.0) distanceTraveled = gasSpent * efficiency refill (Car (Gasoline gas) (Efficiency efficiency)) refillGas = Car (Gasoline (gas + refillGas)) (Efficiency efficiency)

The *where* expression here allows us to define local constants to be used in the function body. Simply create a list of constant names, followed by = and the constant definition. You can only use constants defined by *where* in the function body where they are defined. Local constants can be useful to avoid re-computing values by storing the result of a function call in the constant.

In order to declare that *Car* behaves like a *Vehicle*, we need to define the *travel* and *refill* functions for *Car*.

In the *travel* function, we define that the distance traveled can’t be any more than the gas in the tank will permit it to. The function also evaluates to a *Car* with some gas spent, ensuring that the gas can’t fall below zero.

We define *refill* for *Car* to simply evaluate to a *Car* with an increased amount of gas.

Since *Vehicle* represents a set of behaviors that are shared by multiple types, let’s look at another instance of *Vehicle*:

data Hamburger = Hamburger Int data Unicycle = Unicycle Hamburger instance Vehicle Unicycle where travel (Unicycle (Hamburger hamburgers)) distanceToTravel = (distanceTraveled, Unicycle (Hamburger hamburgersRemaining)) where hamburgersToSpend = round distanceToTravel hamburgersDifference = hamburgers - hamburgersToSpend hamburgersRemaining = max hamburgersDifference 0 hamburgersSpent = hamburgersToSpend - (max (-hamburgersDifference) 0) distanceTraveled = fromIntegral hamburgersSpent refill (Unicycle (Hamburger hamburgers)) hamburgersRefill = Unicycle (Hamburger (hamburgers + (round hamburgersRefill)))

It’s necessary to round down with *round* because we defined that the function should take a *Float* and evaluate to a *Float* in the type class.

Here we see a *Unicycle* which is powered by Hamburgers (the person riding the unicycle needs lots of calories to cover the distance). *travel* consumes one Hamburger per mile and *refill* only replenishes Hamburgers up to the nearest whole number.

This example demonstrates that the type instances of a type class can be completely different. They can have different value constructor type signatures and even different data, as long as they implement the interface defined by the type class.

Let’s look at how we can write a function which operates on a type class:

journey :: Vehicle v => (Distance, v) -> [Distance] -> (Distance, v) journey (distanceTraveled, vehicle) [] = (distanceTraveled, vehicle) journey (distanceTraveled, vehicle) (d : ds) = journey ((distanceTraveled + distanceTraveledNow), vehicleNow) ds where (distanceTraveledNow, vehicleNow) = travel vehicle d

You can see that there is a generic type *v* in the parameter list. Because we want to define that v is a *Vehicle*, we declare that *v* is an instance of the *Vehicle* type class in the section following the :: and before the => symbols in the function signature.

This function will work with any type that is an instance of *Vehicle*, so both of the following function evaluations are valid:

journey 0.0 (Car (Gasoline 50.0) (Efficiency 1.4)) [1.0, 2.0, 5.0, 10.0] journey 0.0 (Unicycle (Hamburger 10)) [2.0, 3.0]

You can also define a function with multiple type classes. For example, suppose we have another type class *Passenger*, we could define a function which used both. The function signature might look something like this:

rideVehicle :: (Vehicle v, Passenger p) => v -> p -> Distance

# Meta-types and type-classes

It’s possible for a meta-type to be an instance of a type class. For example, suppose we have defined a *Hybrid* meta-type that can take a type of fuel as a parameter:

data Hybrid fuel = Hybrid fuel

For now we’ll assume that it’s possible to get the efficiency of a fuel by evaluating:

efficiency fuel

We’ll also assume that there are toFloat and fromFloat functions for the fuel type.

We can make this an instance of the *Vehicle* type class in the same way that we did for *Car* and *Unicycle* above:

instance Vehicle (Hybrid f) where travel (Hybrid fuel) distanceToTravel = (distanceTraveled, Hybrid (fromFloat fuelRemaining)) where fuelToSpend = distanceToTravel / (efficiency fuel) fuelDifference = (toFloat fuel) - fuelToSpend fuelRemaining = max fuelDifference 0.0 fuelSpent = fuelToSpend - (max (-fuelDifference) 0.0) distanceTraveled = fuelSpent * (efficiency fuel) refill (Hybrid fuel) refillFuel = Hybrid (refuel fuel (fromFloat refillFuel))

There is a small problem with this *Hybrid* class, it is an instance of *Vehicle* which can take **anything** as fuel. It doesn’t make much sense to have a *Hybrid* contain *Int* or *String* as fuel. We should constrain the *Hybrid* so that it can only contain valid fuel.

The first step in constraining the fuel type is to make a type class for types which behave like fuel:

class Fuel f where fromFloat :: Float -> f toFloat :: f -> Float efficiency :: f -> Float refuel :: f -> f -> f

We’ve made it so that fuel efficiency can be measured with *efficiency* and that fuel can be refueled with *refuel*.

Let’s make an instance of this Fuel type class:

data Biofuel = Biofuel Float instance Fuel Biofuel where fromFloat f = Biofuel f toFloat (Biofuel f) = f efficiency (Biofuel f) = 45.0 refuel (Biofuel a) (Biofuel b) = (Biofuel (a + b))

Now we can use *Biofuel* as fuel for our *Hybrid*.

Let’s ensure that the fuel type parameter which we used in the instance declaration for *Hybrid* is a valid instance of *Fuel*. All we need to do is change the first line of the instance statement for *Hybrid* which we wrote before:

instance (Fuel f) => Vehicle (Hybrid f) where travel (Hybrid fuel) distanceToTravel = (distanceTraveled, Hybrid (fromFloat fuelRemaining)) where fuelToSpend = distanceToTravel / (efficiency fuel) fuelDifference = (toFloat fuel) - fuelToSpend fuelRemaining = max fuelDifference 0.0 fuelSpent = fuelToSpend - (max (-fuelDifference) 0.0) distanceTraveled = fuelSpent * (efficiency fuel) refill (Hybrid fuel) refillFuel = Hybrid (refuel fuel (fromFloat refillFuel))

The difference here is that we declare the type class of f in the section between the *instance* keyword and the => symbol.

Now we have a compile time constraint that *f* must be an instance of the type class *Fuel* if we evaluate *travel* or *refill*.

# Subclassing

Sometimes, we want to create a type class which has all of the behaviors of an existing type class but adds more. For example, suppose we want to have a type class *PassengerVehicle* which adds two more behaviors to *board* and *disembark* passengers, we could make the following type class:

class (Vehicle v) => PassengerVehicle v where board :: v -> Int -> v disembark :: v -> Int -> v

As you can see, we declare the superclass between *class* and the => symbol.

In addition to *board* and *disembark*, instances of the *PassengerVehicle* type class will have to define all of the behaviors of its superclass, *Vehicle*.

# Common type classes

Let’s look at some common type classes in Haskell.

## Eq

*Eq* represents types which can be compared to one another to determine if they are equal. *Eq* defines two behaviors, == and /=, which we’ve seen before.

class Eq a where (==), (/=) :: a -> a -> Bool x /= y = not (x == y) x == y = not (x /= y)

*Eq* demonstrates another type of constraint that can be applied to a type class. You can write Boolean constraints defining the relationship between behaviors in a type class. Here, the Haskell standard defines that == and /= are Boolean complements of one another for any pair of instances of the *Eq* class. If you define an instance of Eq, you should make sure that this is true for your instance.

Some instances of *Eq* include: *Bool*, *Int*, *Integer*, *Float*, *Double*, *Char*, *List*, and Tuples.

## Ord

*Ord* represents types which can be greater or less than one another. In other words, it is possible to order instances of this type class from least to greatest with a total ordering. Ord is a subclass of Eq, as you can see, *a* is both an Ord and an Eq in the class declaration.

This type class depends on a type called Ordering, which has five value constructors, EQ, LT, GT, LE, GE, representing equal to, less than, greater than, less than or equal to, and greater than or equal to.

class (Eq a) => Ord a where compare :: a -> a -> Ordering () :: a -> a -> Bool max, min :: a -> a -> a compare x y | x == y = EQ | x <= y = LT | otherwise = GT ...

We can see some new syntax here in the compare function. The | symbols are used to define *guards*, which we will cover in the next post.

Some instances of *Ord* include: *Bool*, *Int*, *Integer*, *Float*, *Double*, *Char*, *List*, and Tuples.

## Read and Show

*Read* and *Show* represent types which can be converted from and to *String* respectively.

class Read a where ... class Show a where show :: a -> String ...

You can convert an instance of *Show* to a *String* using the *show* function:

show 5 --This evaluates to "5", or equivalently ['5']

You can convert an instance of *Read* from a *String* into the instance type using the *read* function:

read "5" + 50 --This evaluates to the integral value 55

Some instances of *Read* and *Show* include: *Bool*, *Int*, *Integer*, *Float*, and *Double*.

## Enum

*Enum* represents types which can be ordered into a sequence.

class Enum a where succ, pred :: a -> a toEnum :: Int -> a fromEnum :: a -> Int enumFrom :: a -> [a] -- [n..] enumFromThen :: a -> a -> [a] -- [n,n'..] enumFromTo :: a -> a -> [a] -- [n..m] enumFromThenTo :: a -> a -> a -> [a] -- [n,n'..m]

You can get the predecessor of a value with the *pred* function:

pred 2 --This evaluates to 1

You can get the successor of a value with the *succ* function:

succ 'a' --This evaluates to 'b'

Some instances of *Enum* include: *Int*, *Integer*, and *Char*.

## Bounded

class Bounded a where minBound, maxBound :: a

*Bounded* represents types which have a minimum and maximum value.

Instances of *Bounded* include: *Int*, *Integer*, *Char*, *Float*, and *Double*.

## Making instances of standard type classes

It’s very easy to make your own type into an instance of a standard type class. For example, let’s make our *Unicycle* and *Hamburger* types into instances of standard types:

data Hamburger = Hamburger Int deriving (Show, Eq, Ord, Bounded) data Unicycle = Unicycle Hamburger deriving (Show)

This is way easier than using the *instance* keyword!

While it is possible to create your own type classes in such a way that someone can use *deriving* to make instances of them, learning how to do so is out of the scope of this post. (Check out the Data.Derive package if you really want to know how).

# Don’t fire the missiles by accident

Up to this point, all of the functions we’ve written have been referentially transparent and side-effect free. As I said before, writing useful software requires side-effects, but we don’t want to have to sacrifice the guarantees we’ve had up to this point about our code not having unexpected behaviors.

Suppose that, hypothetically, the designers of Haskell allowed side effects in some ordinary functions. Many other functional languages permit side-effects this way. Let’s imagine that an alternate reality Haskell had been designed with a rule like “All functions with side effects must start with an _”. This seems like a reasonable idea, but consider the following hypothetical example where a careless programmer copied a function from a missile silo into a theme park control system:

_fireTheMissiles :: Bool -> () _fireTheMissiles = ... --It's very bad if this happens by accident --Compute how much money the theme park made today computeTicketPriceTotal :: [Float] -> Float computeTicketPriceTotal [] = ... computeTicketPriceTotal t : ts = ... (_fireTheMissiles (thisTicketPrice < 5.0)) --This has a side effect! ...

Although the programmer defined the missile firing function using our _ rule to mark that the function has some scary side effects, the function can still be called inside another function which is supposed to be side-effect free. If the function computeTicketPriceTotal is large, the programmer might miss the fact that the _fireTheMissiles function may be evaluated, which would be bad. Even if the programmer searches all of their code for _fireTheMissiles and tries to figure out when it should and shouldn’t be called, the fact that the function can be passed as a parameter into another function makes this problem very difficult to solve.

It would be much better if the type system could prevent this from ever happening. This is exactly what the designers of Haskell built into the language.

There is a way we can have side effects and still be able to tell at a glance whether a function is pure or whether you need to worry about side effects, we can use **monads**.

# Finally, monads

Here’s the idea behind monads. We want to be able to tell at a glance what types of side effects or unexpected behaviors might be present in a function. Furthermore, we want to ensure that whenever a side effect could have happened, it can’t get lost by being passed to another function or data.

Think of side effects as polluting your program. Anything which has been touched by a side effect or receives an input from a function which had a side effect could potentially have unexpected behavior.

Monads are prisons for side-effects; when the result of a function with a side-effect enters a monad, it can never leave. The way this is achieved in Haskell is by using a type class, called *Monad*.

Let’s look at *Monad*:

class Monad m where (>>=) :: m a -> (a -> m b) -> m b (>>) :: m a -> m b -> m b return :: a -> m a fail :: String -> m a m >> k = m >>= \_ -> k fail s = error s

This looks a little confusing at first, so let’s break it down.

*return* takes a value which isn’t in the monad and puts it in the monad. For example, the following function puts the *String* “Elephant” into the *IO* monad (*IO* is an instance of *Monad* which is used to represent Input and Output side effects):

returnElephant :: IO String returnElephant = return "Elephant"

The type of the expression *return “Elephant”* here is *IO String*.

The >>= operator, called the “bind” operator, takes a value which is in the monad, m a, and function (a -> m b) as a parameter. The function (a -> m b) takes a value which is not in the monad as an input and returns a value that is in the monad. For example, the function *putStr :: String -> IO ()* takes a *String* which is pure and returns *IO ()*. As a side effect, *putStr* prints the *String* to the standard output.

putStr "Elephant" --This prints the string "Elephant" to standard out return "Elephant" >>= putStr --This also prints the string "Elephant" to standard out

>>= allows us to pass a value which is in the monad to a function which takes a value that isn’t in the monad as an input. Because of >>=, most functions which produce monadic values are written as if they only take pure values, significantly reducing unnecessary code duplication.

We’re using >>= as an infix operator here. >>= here takes the output of *return “Elephant”*, which has type *IO String*, and the function putStr, which has type *String -> IO ()* as an input. The first thing that happens is that “Elephant” is put into IO and then it is bound to the input of putStr which prints it to screen.

The >> operator, which I like to call a “pass-through bind”, takes two monadic values and evaluates to a monadic value. This function is useful when you want to compute several monadic values in order, but the first monadic value is not consumed by the function which produces the second monadic value. For example, *putStr* and *return* both evaluate to *IO* values. If we want to call them one after another, we can use >> like this:

putStr "Elephant" >> return 10 --This evaluates to IO Integer

This prints the string “Elephant” to the standard output and then puts the integer 10 into the *IO* monad. The function evaluates to *IO Integer* because that’s the type of *return 10*.

*fail* is only used during pattern matching in a do-notation block, so we’ll cover it later.

Here’s an example of a function which takes a name and an age and prints it to the standard output:

printHappyBirthday :: String -> Int -> IO () printHappyBirthday name age = putStr "Happy birthday " >> putStr name >> putStr "! You're " >> return (show age) >>= putStr >> putStr " now.\n"

*\n* is an escape character that causes the standard output to print a new line.

When we call:

printHappyBirthday "Shaun" 18

the program prints the following message to the console:

Happy birthday Shaun! You're 18 now.

Notice that, in general, there is no way to extract a value out of a monad. Also, any function which operates on a value which has been raised into a monad or computed using a monad has to include the monad meta-type’s name in its function signature. Consider this function signature:

printStrings :: [String] -> IO ()

Just by looking at it, we can tell that it produces I/O side effects.

Typically Haskell programs are split into two parts: functions with side-effects, and pure functions. It’s a good idea to limit the number of functions which deal with side-effects and test them thoroughly.

# Hello world

Let’s look at an application of the *IO* monad:

The function *putStrLn :: String -> IO ()* takes a *String* and evaluates to an *IO ()* and as a side-effect, it prints the *String* to standard output.

Now we can finally make our first compiled application. Write the following into a file called **hello.hs**:

main = putStrLn "Hello World!"

*main :: IO ()* is the standard entry point for a Haskell executable.

Compile the file using this command:

ghc hello.hs

Congratulations! You just made a program using Haskell called hello which prints “Hello World!” to the command line.

Continue reading Everyday Haskell.