Getting Started with Clojure on Windows

I’ve been taking a break from writing blog posts because I’ve been feeling pretty burned out the past few months, but I’m trying to get back to functional programming following my experiences at Strange Loop 2018.

I heard a lot of good things about Clojure at Strange Loop, so I’m going to start learning the language. In this post I’ll be discussing how to get Clojure working on Windows because developing for the language is only officially supported on Linux and Mac.

I’m also going to discuss how to get Clojure working on Windows Subsystem for Linux (WSL).

Clojure on Windows

If you don’t have Windows Powershell, I recommend getting that first. It’s generally a better than the Windows Command Line.

Also, you need to install Java with a version greater than 1.6 because Clojure runs on the JVM. You can check your Java version with the following command:

java -version

Getting Leiningen

In order to install Clojure on Windows, you must first install Leiningen. I tried using the lein script on the Leiningen website, but it didn’t work for me at the time of writing. Instead, I used Chocolatey to perform the installation. (More information can be found about this at https://chocolatey.org/packages/lein)

First, run Powershell as administrator by right clicking on the Powershell icon and run the following command to install Chocolatey:

Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))

Then run the Leiningen installation using Chocolatey:

choco install lein

It’s possible that the Leiningen installation may fail, in which case you can try running:

lein self-install

When I tried this on my machine, the self installer failed to download the Leiningen jar file. If this happens to you, you can download the latest jar file here: https://github.com/technomancy/leiningen/releases

Rename the standalone .zip file extension to .jar and place it in your %HOME%/.lein/self-installs/ directory.

After restarting the Powershell, you should be able to run the following command to create a Clojure project:

lein new app my-app

You can run the app by changing to the app directory and calling:

lein run

You can run a REPL using the following command:

lein repl

You can find more information about working with Leiningen here: https://www.braveclojure.com/getting-started/

Clojure on WSL

Getting Windows Subsystem for Linux (WSL)

WSL is a convenient way to cross compile for Linux if you have a Windows 10 machine.

You can find instructions for installing WSL here: https://docs.microsoft.com/en-us/windows/wsl/install-win10

Once you’ve restarted your computer, you can install Ubuntu 18.04 LTS by searching in Windows Store for Ubuntu:

UbuntuWSL

Getting Clojure on WSL

Check your Java version with:

java -version

You should have a Java version greater than 1.6 to run Clojure.

Once you’ve installed Ubuntu 18.04 LTS on WSL, you can run the following command to install Leiningen:

sudo apt-get install leiningen

and you’re ready to get started with Clojure development!

References

https://clojure.org/

https://leiningen.org/

https://chocolatey.org/

https://www.braveclojure.com/

Advertisements

Making a Text Adventure in Haskell (Part 4)

Making a change in the world

Last time I discussed how I filtered interactions the player typed in to decide if the interactions were valid given the state of the world. In this final post about my text adventure engine, I’ll describe how I updated the state of the game world and how to use the HaskellAdventure game engine to create a full text adventure game.

Performing conditional actions

The filterInteraction function calls performConditionalAction with a valid conditional action. performConditionalAction takes a list of delimiters and column width, the current scene index, end scene list, inventory, flags, a Maybe Interaction for the current scene, and a Maybe Interaction for the default scene. It evaluates to the next state of the game.

performConditionalActions :: [Char] -> Int -> SceneIndex -> [SceneIndex] -> Inventory -> Flags -> Maybe Interaction -> Maybe Interaction -> IO (Maybe (SceneIndex, Inventory, Flags))

There are five different patterns to handle based on the interactions in the input list.

The first pattern is matched when no valid interaction was found in either the current or default scene:

performConditionalActions _ _ currentSceneIndex _ inventory flags Nothing Nothing
    = putStrLn "That does nothing.\n" >>
      hFlush stdout >>
      return (Just (currentSceneIndex, inventory, flags))

If there are no valid interactions actions but the sentence was valid, I just return to the current state.

The second and third patterns form a mutual recursion, which iterates over all of the conditional actions in the current scene. They are matched whenever the current scene has at least one valid ConditionalAction. The second pattern is the terminal case in the recursion.

performConditionalActions delimiters
                          columnWidth
                          currentSceneIndex
                          endScenes
                          inventory
                          flags
                          (Just (Interaction {sentences = _,
                                              conditionalActions = []}))
                          defaultSceneInteractions
    = performConditionalActions delimiters columnWidth currentSceneIndex endScenes inventory flags Nothing defaultSceneInteractions

performConditionalActions delimiters
                          columnWidth
                          currentSceneIndex
                          endScenes
                          inventory
                          flags
                          (Just (Interaction {sentences = thisSentences,
                                              conditionalActions = (conditionalAction@(ConditionalAction {condition = thisCondition}) : remainingConditionalActions)}))
                          defaultSceneInteractions -- Ignore default scene interactions if there are still current scene interactions
    | evaluateCondition thisCondition inventory flags = updateGameState delimiters columnWidth endScenes currentSceneIndex inventory flags conditionalAction --The condition for the action passed, update the game state
    | otherwise = performConditionalActions delimiters columnWidth currentSceneIndex endScenes inventory flags
                  (Just (Interaction {sentences = thisSentences,
                                      conditionalActions = remainingConditionalActions}))

In the third pattern, I call evaluateCondition with the current ConditionalAction‘s condition and if it evaluates to True, I call updateGameState with the conditionalAction as an input. If all of the current scene’s ConditionalActions evaluate to False, the second pattern is matched and it calls the fourth and fifth patterns, which handle all of the default scene’s ConditionalActions.

The fourth and fifth patterns of performConditionalActions form an iteration over the default scene’s ConditionalActions. The fourth pattern is the terminal case of the iteration. If a ConditionalAction evaluates to True in this iteration, I call updateGameState with the ConditionalAction as an input. If no ConditionalActions evaluate to true in this iteration, all possible conditional actions have failed and the first pattern of performConditionalActions is called.

performConditionalActions delimiters
                          columnWidth
                          currentSceneIndex
                          endScenes
                          inventory
                          flags
                          Nothing
                          (Just (Interaction {sentences = _,
                                              conditionalActions = []}))
    = performConditionalActions delimiters columnWidth currentSceneIndex endScenes inventory flags Nothing Nothing

performConditionalActions delimiters
                          columnWidth
                          currentSceneIndex
                          endScenes
                          inventory
                          flags
                          Nothing --The current scene failed to have any interactions
                          (Just (Interaction {sentences = thisSentences,
                                              conditionalActions = (conditionalAction@(ConditionalAction {condition = thisCondition}) : remainingConditionalActions)}))
    | evaluateCondition thisCondition inventory flags = updateGameState delimiters columnWidth endScenes currentSceneIndex inventory flags conditionalAction
    | otherwise = performConditionalActions delimiters columnWidth currentSceneIndex endScenes inventory flags Nothing
                  (Just (Interaction {sentences = thisSentences,
                                      conditionalActions = remainingConditionalActions}))

The function which actually updates the game state is called updateGameState:

updateGameState :: [Char] -> Int -> [SceneIndex] -> SceneIndex -> Inventory -> Flags -> ConditionalAction -> IO (Maybe (SceneIndex, Inventory, Flags))
updateGameState delimiters
                columnWidth
                endScenes
                currentSceneIndex
                inventory
                flags
                conditionalAction@(ConditionalAction {conditionalDescription = thisConditionalDescription,
                                                      stateChanges = thisStateChanges})
 = printConditionalDescription delimiters columnWidth endScenes thisConditionalDescription [] (Just (currentSceneIndex, inventory, flags)) >>=
   stateChange (Data.List.find (\x -> case x of
                                      (SceneChange _) -> True
                                      otherwise -> False) thisStateChanges)
               endScenes
               thisStateChanges

updateGameState first prints the ConditionalAction‘s ConditionalDescription, then it searches for a SceneChange value constructor in the list of stateChanges. If there is a SceneChange value constructor, it passes the scene change with a Just StateChange as the first parameter of stateChange and Nothing otherwise.

stateChange takes the Maybe StateChange for scene changes, the end scenes list, the list of all other state changes, and the current state of the game. It evaluates to the next state of the game:

stateChange :: Maybe StateChange -> [SceneIndex] -> [StateChange] -> Maybe (SceneIndex, Inventory, Flags) -> IO (Maybe (SceneIndex, Inventory, Flags))

The first and second pattern of stateChange match when the game is in an end game state (the current state is Nothing). In these cases, sceneChange just evaluates to Nothing:

stateChange Nothing _ stateChanges Nothing
    = return Nothing
stateChange _ endScenes stateChanges Nothing 
    = return Nothing

The third pattern matches when the game currently is in a valid state but no SceneChange was necessary:

stateChange Nothing _ stateChanges (Just (sceneIndex, inventory, flags))
    = return (Just (sceneIndex,
                    updateInventory inventory stateChanges,
                    updateFlags flags stateChanges))

In this case, updateInventory and updateFlags are called to update the player’s inventory and flags as necessary.

The fourth pattern matches when the game is in a valid state and a SceneChange was necessary as part of the current ConditionalAction:

stateChange (Just (SceneChange nextScene)) endScenes stateChanges (Just (sceneIndex, inventory, flags)) 
    = if nextScene `elem` endScenes
      then return Nothing
      else return (Just (nextScene,
                         updateInventory inventory stateChanges,
                         updateFlags flags stateChanges))

In this case, if the nextScene is an end scene, the function evaluates to Nothing, which is the end game state. Otherwise, the function evaluates to the next scene and calls updateInventory and updateFlags to update the player’s inventory and flags.

updateInventory matches the AddToInventory and RemoveFromInventory value constructors and inserts the object string into the inventory or filters it out of the inventory as necessary:

updateInventory :: Inventory -> [StateChange] -> Inventory
updateInventory (Inventory inventory) [] = Inventory inventory
updateInventory (Inventory inventory) ((RemoveFromInventory object) : remainingChanges) = updateInventory (Inventory (filter (\x -> x /= object) inventory)) remainingChanges
updateInventory (Inventory inventory) ((AddToInventory object) : remainingChanges)
    | object `elem` inventory = updateInventory (Inventory inventory) remainingChanges
    | otherwise = updateInventory (Inventory (object : inventory)) remainingChanges
updateInventory (Inventory inventory) (_ : remainingChanges) = updateInventory (Inventory inventory) remainingChanges

updateFlags matches the SetFlag and RemoveFlag value constructors and inserts the flag string into the flags or filters it out of the flags as necessary:

updateFlags :: Flags -> [StateChange] -> Flags
updateFlags (Flags flags) [] = Flags flags
updateFlags (Flags flags) ((RemoveFlag flag) : remainingChanges) = updateFlags (Flags (filter (\x -> x /= flag) flags)) remainingChanges
updateFlags (Flags flags) ((SetFlag flag) : remainingChanges)
    | flag `elem` flags = updateFlags (Flags flags) remainingChanges
    | otherwise = updateFlags (Flags (flag : flags)) remainingChanges
updateFlags (Flags flags) (_ : remainingChanges) = updateFlags (Flags flags) remainingChanges

Since the only changes the player can make to the world are adding and removing from the inventory and flags and changing the current scene, these are the only functions necessary to update the game world.

Building an adventure

All that’s left is to cover how to actually make a game which uses the HaskellAdventure game engine. There is an example adventure in my HaskellAdventure GitHub repository called DummyAdventure.hs. Let’s walk through it one piece at a time.

The first thing which you need to do to create a game with the HaskellAdventure engine is create a module for your adventure:

module DummyAdventure (gameIntro,
                       allVerbs,
                       allNouns,
                       allPrepositions,
                       allTokens,
                       startInventory,
                       startFlags,
                       defaultScene,
                       allScenes) where

As you can see, quite a lot of functions are required to define a text adventure game.

Next you need to import the game engine and Data.List:

import qualified Data.List

import NaturalLanguageLexer
import NaturalLanguageParser
import NarrativeGraph

gameIntro simply defines a string to print at the start of your game as an introduction:

gameIntro :: String
gameIntro = "Dummy Adventure by Laurence Emms\n"

allVerbs evaluates to a list of all valid TokenVerbs in your game:

allVerbs :: [Token]
allVerbs =
 [
 TokenVerb "get" ["get", "take", "pick up"],
 TokenVerb "put" ["put", "place", "put down"],
 TokenVerb "throw" ["throw", "pitch"],

...

 TokenVerb "leave" ["leave", "exit"],
 TokenVerb "eat" ["eat", "consume"],
 TokenVerb "drink" ["drink", "consume"]
 ]

allNouns evaluates to a list of all valid TokenNouns in your game:

allNouns :: [Token]
allNouns =
 [
 TokenNoun "north" ["north"],
 TokenNoun "south" ["south"],
 TokenNoun "west" ["west"],

...

 TokenNoun "Steve" ["Steve"],
 TokenNoun "juice" ["juice"],
 TokenNoun "cake" ["cake"]
 ]

and allPrepositions evaluates to a list of all valid TokenPrepositions in your game:

allPrepositions :: [Token]
allPrepositions =
 [
 TokenPreposition "in" ["in", "inside", "within"],
 TokenPreposition "into" ["into"],
 TokenPreposition "out" ["out", "outside"],

...

 TokenPreposition "until" ["until"],
 TokenPreposition "with" ["with"],
 TokenPreposition "together with" ["together with"]
 ]

It’s useful to define the following function to create unambiguous sentences, so you don’t have to call unambiguousSentence in every Interaction:

uSentence :: [String] -> Sentence
uSentence words = unambiguousSentence allVerbs allNouns allPrepositions words

allTokens collects all tokens into a single list:

allTokens :: [Token]
allTokens = allNouns ++ allVerbs ++ allPrepositions

startInventory contains all of the items the player starts with in their inventory:

startInventory :: Inventory
startInventory = Inventory ["fork"]

startFlags contains all of the flags which are set at the start of the game:

startFlags :: Flags
startFlags = Flags ["started game"]

Next you can define all of the scenes in your game. DummyAdventure.hs contains only one scene, called scene0:

scene0 :: Scene
scene0 =
    Scene
    {
        sceneDescription =
            ConditionalDescription [(CTrue, "You're standing in a green room. The room has a <white door>.", []),
                                    (CNot (FlagSet "opened white door"), "The <white door> is closed.", []),
                                    (FlagSet "opened white door", "The <white door> is open.", []),
                                    (CNot (InInventory "key"), "There is a <key> on the floor.", [])],
        interactions =
            [
                Interaction
                {
                    sentences = [uSentence ["get", "key"]],
                    conditionalActions =
                        [
                            ConditionalAction
                            {
                                condition = CNot (InInventory "key"), --The player does not have the key
                                conditionalDescription = ConditionalDescription [(CTrue, "You pick up the <key>.", [])],
                                stateChanges = [AddToInventory "key"]
                            },
                            ConditionalAction
                            {
                                ...
                            }
                        ]
                },
                Interaction
                {
                    ...
                },
            ]
    }

Each scene has a sceneDescription, which is a ConditionalDescription which is printed when the player is in the scene. The scene also has a list of Interactions, which the player can perform in the scene. Each Interaction has a list of sentences which trigger the Interaction and a list of ConditionalActions which contain NarrativeConditions, ConditionalDescriptions, and StateChanges. By defining these for each scene, you can create a whole text adventure.

The game will need at least one end scene. In this case, scene1 is the end scene for the game:

scene1 :: Scene
scene1 =
    Scene
    {
        sceneDescription = ConditionalDescription [],
        interactions = []
    }

In this game, the end scene does nothing but end the game.

The last scene you need to define to make a game in the HaskellAdventure engine is the default scene. All Interactions in the default scene are valid in every other scene.

defaultScene :: Scene
defaultScene =
    Scene
    {
        sceneDescription = ConditionalDescription [],
        interactions =
        [
            Interaction
            {
                sentences = [uSentence ["jump"]],
                conditionalActions =
                [
                    ConditionalAction
                    {
                        condition = CTrue, --Always do this
                        conditionalDescription = ConditionalDescription [(CTrue, "You jump up and down in place.", [])],
                        stateChanges = []
                    }
                ]
            },
            Interaction
            {
                ...
            },
            ...
   }

The very last thing you need to define to make a game with the engine is the allScenes function which evaluates to a list of all scenes in the game and a list of end scene indices:

allScenes :: ([Scene], [SceneIndex])
allScenes = ([scene0, scene1], --List of scenes
             [1]) --End scenes

It has been an interesting experience developing a text adventure engine in Haskell. I learned a lot about the language and what it’s capable of during this project. I hope that also you learned something about Haskell by reading about my experiences.

Let me know if you try to use my engine to make a text adventure game or if you’ve made a text adventure in Haskell yourself.

The code for the text adventure and engine is available with an MIT open source license at https://github.com/WhatTheFunctional/HaskellAdventure.

Working with Lists

As a Haskell programmer, lists will be your bread and butter. Lists make life easier because they’re incredibly simple to process recursively. All you need to do is specify a single terminal case for the empty list [] and a recursive case by pattern matching (x : xs) and your function is guaranteed to process all of the elements of the list in order and terminate.

Because lists are so useful in Haskell, there are many helper functions built into Haskell for working with lists. We’ll cover some of these helper functions in this post.

Lists as collections

Since lists appear so frequently, it helps to know a few functions which act on lists as collections:

null :: [a] -> Bool --Evaluates to True if the list is empty
elem :: Eq a => a -> [a] -> Bool --Evaluates to True if the first parameter is an element of the second parameter
length :: [a] -> Int --Evaluates to the length of the list
reverse :: [a] -> [a] --Evaluates to the list with all of its elements in reverse order

Let’s look at some worked examples:

null [] --Evaluates to True
null [5, 6, 3] --Evaluates to False

5 `elem` [4, 2, 6] --Evaluates to False
5 `elem` [] --Evaluates to False

length [] --Evaluates to 0
length [4, 5] --Evaluates to 2

reverse [1, 2, 3] --Evaluates to [3, 2, 1]

Here are some useful functions for splitting and extracting parts of a list:

head :: [a] -> a --Evaluates to the first element of the list
tail :: [a] -> [a] --Evaluates to the list with its head removed

last :: [a] -> a --Evaluates to the last element of the list
init :: [a] -> [a] --Evaluates to the list with its last element removed

take :: Int -> [a] -> [a] --Evaluates to a list of the first N elements of the list
drop :: Int -> [a] -> [a] --Evaluates to the list with its first N elements removed
splitAt :: Int -> [a] -> ([a], [a]) --Evaluates to a tuple containing two parts of a list split at the specified location

Here are some examples of these splitting functions:

head ['b', 'r', 'x'] --Evaluates to 'b'
head [] --Throws an exception

tail [8, 3, 1, 1] --Evaluates to [3, 1, 1]
tail [] --Throws an exception

last ['3', 'd', 'p'] --Evaluates to 'p'
last [] --Throws an exception

init [6, 9, 3, 2] --Evaluates to [6, 9, 3]
init [] --Throws an exception

take 2 [5, 8, 9, 9] --Evaluates to [5, 8]
take 5 [] --Evaluates to []

drop 3 ['4', 'm', ';', 'q'] --Evaluates to ['q']
drop 7 [] --Evaluates to []

splitAt 2 [8, 8, 9, 4, 3] --Evaluates to ([8, 8], [9, 4, 3])
splitAt 4 [] --Evaluates to ([], [])

You have to import Data.list to use the following list splitting function:

import Data.List
uncons :: [a] -> Maybe (a, [a]) --Split a list into a Maybe of a tuple containing the head and tail of the list

Here’s an example of uncons:

uncons [2, 9, 4] --Evaluates to Just (2, [9, 4])
uncons [] --Evaluates to Nothing

Zip and unzip

Sometimes you want to interleave one list with another, so you can access them together at the same time. For example, we might have a list of strings and a list of floating point numbers:

["lambda", "blog", "functional"]
[8.0, 11.99, 2.50]

We could combine the two lists into a single list of tuples using the zip function:

zip ["lambda", "blog", "functional"] [8.0, 11.99, 2.50]
--Evaluates to [("lambda", 8.0), ("blog", 11.99), ("functional", 2.50)]

It’s possible to reverse this transformation using unzip:

unzip [("lambda", 8.0), ("blog", 11.99), ("functional", 2.50)]
--Evaluates to (["lambda", "blog", "functional"], [8.0, 11.99, 2.50])

There are variants of zip and unzip for lists with 3 or more elements called zip3, unzip3, zip4, unzip4, and so on.

Lists of numbers

There are several specialized functions for operating on lists of numbers:

sum :: Num a => [a] -> a --Evaluates to the sum of a list of numbers
product :: Num a => [a] -> a --Evaluates to the product of a list of numbers
minimum :: Num a => [a] -> a --Evaluates to the minimum of a list of numbers
maximum :: Num a => [a] -> a --Evaluates to the maximum of a list of numbers

For example:

sum [5, 4, 9] --Evaluates to 18
product [2, 1, 3, 8] --Evaluates to 48
minimum [8, 2, 3, 6] --Evaluates to 2
maximum [4, 2, 0, 9] --Evaluates to 9

List replication

Since lists are used frequently, it’s useful to have a way to define a large list without having to write out each element. Haskell provides several methods for defining large lists with a small amount of code:

Repeat, replicate and cycle

Sometimes you need to create a list which has the same pattern of elements repeated over and over. Haskell provides several functions to handle lists of repeated elements:

replicate creates a list containing a single element repeated a specified number of times:

replicate 5 'g' --Evaluates to ['g', 'g', 'g', 'g', 'g']

repeat creates an infinite list containing a single element repeated over and over:

repeat 'q' --Evaluates to ['q', 'q', 'q', ...]

That’s right, Haskell supports infinite lists! This is possible in Haskell because it is lazy, which means that it stores references to constants and functions without actually evaluating them until absolutely necessary. As long as we don’t do something like trying to print  or find the last element of the infinite list, Haskell will handle infinite lists the same way it handles any other list. We’ll cover Haskell’s laziness in more detail in a later post.

It’s pretty common to use take with infinite lists to evaluate a certain number of elements.

cycle takes a list and replicates it over and over infinitely:

cycle ['a', 'b', 'c'] --Evaluates to ['a', 'b', 'c', 'a', 'b', 'c', ...]

Ranges

Ranges are a convenient shorthand for declaring lists of increasing and decreasing elements. You declare a range by using .. in your list declaration. Here are some examples:

[1..5] --Evaluates to [1, 2, 3, 4, 5]
['b'..'e'] --Evaluates to ['b', 'c', 'd', 'e'] or, equivalently, "bcde"

If you specify the first two elements of a range, you can adjust the step taken between each element and create lists of decreasing elements:

[6, 5..1] --Evaluates to [6, 5, 4, 3, 2, 1]
[2, 4..8] --Evaluates to [2, 4, 6, 8]

You can also have infinite ranges. For example:

[2, 4..] --Evaluates to the list of all positive multiples of 2

Comprehensions

List comprehensions allow you to declare a list using a set of predicates. This is a very powerful technique for creating lists with interesting properties.

List comprehensions contain three parts, an expression, followed by a |, and a set of predicates:

[v | v <- [1..10], v < 5, odd v] --Evaluates to [1, 3]

The set above is the set of all elements, labeled v, where v is an element of the set [1..10] and v is less than 5 and v is odd.

The expression on the left can be any valid Haskell expression:

[if (odd v) then "odd" else "even" | v <- [1..5]]
--Evaluates to ["odd", "even", "odd", "even", "odd"]

You can have multiple variables on the right of the | too:

[v * w | v <- [1, 3..9], w <- [2, 4..10]]

The way this works is that the right-most variable varies the most quickly, when evaluating the expression. The list of elements will start with v being set to 1 and then w will be set to 2, 4, 6, 8, and 10 before v is set to 3; then the evaluation will continue with w iterating over 2, 4, 6, 8, 10 again. The end result looks like this:

[1 * 2, 1 * 4, 1 * 6, 1 * 8, 1 * 10, 3 * 2, 3 * 4, 3 * 6, 3 * 8, 3 * 10, 5 * 2, 5 * 4, 5 * 6, ...]
[2, 4, 6, 8, 10, 6, 12, 18, 24, 30, 10, 20, 30, 40, 50, 14, 28, 42, 56, 70, 18, 36, 54, 72, 90]

Functional programming with lists

While it’s pretty easy to write list processing functions in Haskell using pattern matching, you can often write more concise and readable code using functional programming with lists. Each of the following functions takes a function which operates on a single element of the list and applies it to the list as a whole to evaluate a result.

Any and all

any and all are functions which you can use to make logical statements about a list.

any takes a function which returns a Bool and evaluates to True if the function evaluates to True for any of the elements of the list and False otherwise:

any :: (a -> Bool) -> [a] -> Bool

Here are some examples of any:

any (> 10) [2, 5, 1, 8] --Evaluates to False
any even [7, 3, 1] --Evaluates to False

all takes a function which returns a Bool and evaluates to True if the function evaluates to True for all of the elements of the list and False otherwise:

all  :: (a -> Bool) -> [a] -> Bool

Here are some examples of all:

all odd [5, 5, 2] --Evaluates to False
all (/= 'b') ['q', 'y', 'w', 'b', 'm', 'm'] --Evaluates to False

Map

If you’ve ever worked with the map/reduce algorithm before, map should be very familiar to you. map takes a function which acts on an element and produces a value and applies that function to all of the elements to produce a list of resulting values:

map :: (a -> b) -> [a] -> [b]

For example:

map (+ 2) [9, 3, 5] --Evaluates to [11, 5, 7]
map (\x -> 'e' `elem` x) ["Hello", "World", "!"] --Evaluates to [True, False, False]

Fold

Haskell’s fold functions accumulate the result of a function over all of the elements of a list. This can be useful if you want to perform a reduce operation on a list. There are two variants of fold, foldl and foldr:

foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z [] = z
foldl f z (x : xs) = foldl f (f z x) xs
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z [] = z
foldr f z (x : xs) = f x (foldr f z xs)

foldl and foldr both take a function, an initial value for the accumulator, and the list and they evaluate to an accumulated value. They apply the function to an accumulator and each element of the list to produce an accumulated value. For example:

foldl (+) 0 [2, 4, 1] --Evaluates to 7. This is equivalent to the sum function.
foldl (\accumulator x -> (accumulator && (x > 5))) True [6, 8, 7] --Evaluates to True. This is equivalent to all (> 5) [6, 8, 7].
foldr (\x accumulator -> (accumulator && (x > 5))) True [6, 8, 7] --Evaluates to True. This is equivalent to all (> 5) [6, 8, 7].

The difference between foldl and foldr is the order in which the accumulation happens. Notice how the order of the accumulator and element in the lambdas above are different for foldl and foldr. In the case of foldl, the accumulator is on the left, and in foldr, the accumulator is on the right.

Let’s see what actually happens with foldl and foldr by folding (-) on the list [1, 2, 3].

If we recursively evaluate foldl we get something like this:

foldl (-) 0 [1, 2, 3]
foldl (-) (0 - 1) [2, 3]
foldl (-) ((0 - 1) - 2) [3]
foldl (-) (((0 - 1) - 2) - 3) []
((0 - 1) - 2) - 3
((-1) - 2) - 3
(-3) - 3
-6

If we recursively evaluate foldr we get something like this:

foldr (-) 0 [1, 2, 3]
1 - (foldr (-) 0 [2, 3])
1 - (2 - (foldr (-) 0 [3]))
1 - (2 - (3 - (foldr (-) 0 [])))
1 - (2 - (3 - 0))
1 - (2 - 3)
1 - (-1)
2

As you can see, we get completely different results from foldl and foldr! You will always get different results if the accumulator function is not associative.

The fact that the accumulation is performed recursively on the right of the accumulator function, f, in foldr means that we can use foldr on infinite lists as long as the accumulator function, f, function is short-circuiting.

In other words, if the function doesn’t need to evaluate the next element in order to find the final value of the accumulator, foldr will terminate.

For example, suppose we want to check whether all values in an infinite list [1..] will be less than 3, foldr will terminate:

foldr (\x acc -> (x < 3) && acc) True [1..]
(1 < 3) && (foldr (\x acc -> (x < 3) && acc) True [2..] True && (foldr (\x acc -> (x < 3) && acc) True [2..]) --Since (True && x) == x for all x, this statement can be simplified foldr (\x acc -> (x < 3) && acc) True [2..]

(2 < 3) && (foldr (\x acc -> (x < 3) && acc) True [3..]) True && (foldr (\x acc -> (x < 3) && acc) True [3..]) foldr (\x acc -> (x < 3) && acc) True [3..]

(3 < 3) && (foldr (\x acc -> (x < 3) && acc) True [4..]) False && (foldr (\x acc -> (x < 3) && acc) True [4..])
--Since (False && x) == False for all x, this statement can short-circuit the recursion
False

Whereas foldl will continue forever because in each iteration foldl still needs to be evaluated:

foldl (\acc x -> acc && (x < 3)) True [1..] foldl (\acc x -> acc && (x < 3)) (True && (1 < 3)) True [2..] foldl (\acc x -> acc && (x < 3)) (True && (1 < 3) && (2 < 3)) True [3..] foldl (\acc x -> acc && (x < 3)) (True && (1 < 3) && (2 < 3) && (3 < 3)) True [4..]
...

There are variants of fold called foldl1 and foldr1 which use the first value of the list as the initial value of the accumulator.

foldl1 (+) [7, 2, 0] --Evaluates to 9
foldr1 (+) [7, 2, 0] --Evaluates to 9

Unfold

unfoldr builds a list from a seed value:

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
unfoldr f z = case (f z) of
              Nothing -> []
              Just (x, y) -> x : (unfoldr f y)

You need to import Data.List to use unfoldr.

unfoldr function takes a builder function, f, a seed value, z, and evaluates to a list which is built by repeatedly applying the builder function to the last element of the tuple it produces. The recursion terminates if the builder function ever evaluates to Nothing.

For example, here is the chain of evaluations for a call to unfoldr which produces the list of numbers [0..10]:

unfoldr (\x -> if x > 10 then Nothing else Just (x, x + 1)) 0

case (\x -> if x > 10 then Nothing else Just (x, x + 1)) 0 of
Nothing -> []
Just (x, y) -> x : (unfoldr (\x -> if x > 10 then Nothing else Just (x, x + 1)) y)

case (if 0 > 10 then Nothing else Just (0, 0 + 1)) of
Nothing -> []
Just (x, y) -> x : (unfoldr (\x -> if x > 10 then Nothing else Just (x, x + 1)) y)

case Just (0, 1) of
Nothing -> []
Just (x, y) -> x : (unfoldr (\x -> if x > 10 then Nothing else Just (x, x + 1)) y)

0 : (unfoldr (\x -> if x > 10 then Nothing else Just (x, x + 1)) 1)

0 : case (\x -> if x > 10 then Nothing else Just (x, x + 1)) 1 of
    Nothing -> []
    Just (x, y) -> x : (unfoldr (\x -> if x > 10 then Nothing else Just (x, x + 1)) y) 

0 : case Just (1, 2) of
    Nothing -> []
    Just (x, y) -> x : (unfoldr (\x -> if x > 10 then Nothing else Just (x, x + 1)) y)

0 : 1 : unfoldr (\x -> if x > 10 then Nothing else Just (x, x + 1)) 2

and so on…

unfoldr is called the dual of foldr because you can apply unfoldr to undo some foldr transformations.

Scan

scanl and scanr compute a similar result to foldl and foldr. Scan takes a function, an accumulator, and a list and performs a reduction but instead of just returning the final result of the reduction, it returns a list of all incremental calls to the accumulator function.

scanl :: (a -> b -> a) -> a -> [b] -> [a]
scanl f z [] = z : []
scanl f z (x : xs) = z : (scanl f (f z x) xs)

scanr :: (a -> b -> b) -> b -> [a] -> [b]
scanr f z [] = z : []
scanr f z (x : xs) = (f x (foldr f z xs)) : (scanr f z xs)

It’s important to note that the following equalities hold in general:

last (scanl f z x) == foldl f z x
head (scanr f z x) == foldr f z x

Let’s look at an example of scanl:

scanl (-) 0 [1, 2, 3]
0 : (scanl (-) (0 - 1) [2, 3])
0 : (0 - 1) : (scanl (-) ((0 - 1) - 2) [3])
0 : (0 - 1) : ((0 - 1) - 2) : (scanl (-) (((0 - 1) - 2) - 3) [])
0 : (0 - 1) : ((0 - 1) - 2) : (((0 - 1) - 2) - 3) : []
0 : (-1) : (-3) : (-6) : []
[0, -1, -3, -6]

and an example of scanr:

scanr (-) 0 [1, 2, 3]
(1 - (foldr (-) 0 [2, 3])) : (scanr (-) 0 [2, 3])
(1 - (2 - (3 - 0))) : (scanr (-) 0 [2, 3])
(1 - (2 - (3 - 0))) : (2 - (foldr (-) 0 [3])) : (scanr (-) 0 [3])
(1 - (2 - (3 - 0))) : (2 - (3 - 0)) : (scanr (-) 0 [3])
(1 - (2 - (3 - 0))) : (2 - (3 - 0)) : (3 - (foldr (-) 0 [])) : (scanr (-) 0 [])
(1 - (2 - (3 - 0))) : (2 - (3 - 0)) : (3 - 0) : (scanr (-) 0 [])
(1 - (2 - (3 - 0))) : (2 - (3 - 0)) : (3 - 0) : 0 : []
2 : (-1) : 3 : 0 : []
[2, -1, 3, 0]

Just like with foldl and foldr, there are variants of scan called scanl1 and scanr1 which use the first value of the list as the initial value of the accumulator.

Find

find takes a predicate function, f, and a list and it evaluates to Just x, where x is the first element in the list for which the predicate evaluates to True, or Nothing if no element matches the predicate:

find :: (a -> Bool) -> [a] -> Maybe a
find f [] = Nothing
find f (x : xs)
    | f x = Just x
    | otherwise = find f xs

Here’s an example of find:

find (== 5) [3, 5, 1]
find (== 5) [5, 1]
Just 5

Take-while and drop-while

takeWhile and dropWhile are functional programming versions of take and drop which take a predicate function, f, instead of a number of elements as an argument.

takeWhile :: (a -> Bool) -> [a] -> [a]
takeWhile f [] = []
takeWhile f (x : xs)
    | f x = x : (takeWhile f xs)
    | otherwise = []

dropWhile :: (a -> Bool) -> [a] -> [a]
dropWhile f [] = []
dropWhile f (x : xs)
    | not (f x) = x : dropWhile (\_ -> False) xs
    | otherwise = dropWhile f xs

takeWhile adds elements from an input list to an output list until its predicate function evaluates to False. For example:

takeWhile (< 3) [1, 2, 3, 4, 5]
1 : takeWhile (< 3) [2, 3, 4, 5]
1 : 2 : takeWhile (< 3) [3, 4, 5]
1 : 2 : []

dropWhile drops elements from an input list until its predicate function evaluates to False, after which it starts adding elements from the input list to the output list. For example:

dropWhile (< 3) [1, 2, 3, 4, 5]
dropWhile (< 3) [2, 3, 4, 5]
dropWhile (< 3) [3, 4, 5] 3 : dropWhile (\_ -> False) [4, 5]
3 : 4 : dropWhile (\_ -> False) [5]
3 : 4 : 5 : dropWhile (\_ -> False) []
3 : 4 : 5 : []
[3, 4, 5]

Filter

filter takes a predicate function and a list. It applies the predicate function to all of the elements in the input list and evaluates to a list of all of the elements which match the predicate.

filter :: (a -> Bool) -> [a] -> [a]
filter p (x : xs)
    | p x = x : filter p xs
    | otherwise = filter p xs
filter _ [] = []

filter enables you to discard a subset of the input list. Let’s look at a few examples:

filter (\x -> x == 5) [1, 5, 2, 5] --Evaluates to [5, 5]
filter even [2, 5, 1, 4, 3, 2] --Evaluates to [2, 4, 2]

Everyday Haskell

Haskell for the working programmer

Up to this point, we’ve been covering a subset of the Haskell language which would look unusual to an experienced Haskell programmer. In our quest to make side-effects, we’ve skipped over a lot of useful functionality which makes programming in Haskell more pleasant. Now that you’re able to write useful programs in Haskell, we can explore what day-to-day Haskell code looks like.

We’ll also go over some common side effects we can use in the IO monad.

Throughout this post, I’ll be including examples from a small command line application which manages the rewards program for a fictional company called Haskell’s Curry House.

Standard input and output

In order to interact with the Haskell’s Curry House Rewards Program Manager, the user will have to type commands into the command line terminal. Since reading from and writing to the command line are side-effects, we’ll need to work with a monad to perform these operations. Haskell provides the IO monad to allow us to write programs which interact with the standard input and output generated by the command line.

The following functions are used for getting standard input:

getChar :: IO Char --Get a single character
getLine :: IO String --Get a line
getContents :: IO String --Get all user input lazily
interact :: (String -> String) -> IO () --All input is passed to the function which is the first parameter of interact
readIO :: Read a => String -> IO a --Converts the String read in to a value of type a
readLn :: Read a => IO a --Converts the line read in to a value of type a

If you are using getChar or getLine in a loop, you can detect the end of input using isEOF:

isEOF :: IO Bool

An end of file (EOF) signal is typically sent with a shortcut like Ctrl-D on Linux or Ctrl-Z on Windows.

The following functions are used for printing to standard output:

putChar :: Char -> IO () --Prints a character to standard output
putStr :: String -> IO () --Prints a string to standard output
putStrLn :: String -> IO () --Prints a string to standard output followed by a newline character
print :: Show a => a -> IO () --Prints a value of type a to standard output

Notice that each of these functions takes an input which is not in the IO monad and evaluates to a value which is in the IO monad. For all output functions, they each evaluate to the null tuple () value because they don’t compute any values, they only produce side effects.

You can flush the buffered standard output with the following command. This is required to force the output to display on screen:

import System.IO --Required import for hFlush

hFlush stdout --Flush standard output
hFlush stderr --Flush standard error

Let’s combine these together into a function which will take a customer name, an menu item that the person purchased, and the reward points that they earned and print out a new record for that person.

Here’s a function which will generate a record containing the customer name, menu item ordered and points earned:

generateRecord :: IO (String, String, Int)
generateRecord = putStr "Enter customer name:\n" >>
                 hFlush stdout >>
                 getLine >>= (\name ->
                 putStr "Enter menu item:\n" >>
                 hFlush stdout >>
                 getLine >>= (\menuItem ->
                 putStr "Enter number of reward points:\n" >>
                 hFlush stdout >>
                 readLn >>= (\rewardPoints -> return (name, menuItem, rewardPoints))))

“\n” here represents a newline character.

You should recognize the >> and >>= bind operators from the previous post on monads. In this case, >> and >>= are used to compose I/O functions together.

We use >>= bind when we want to take the result of an IO operation and use it in a later function evaluation as a parameter. For example, getLine >>= (\name -> …) takes the value of IO String produced by evaluating getLine and binds it to the name parameter in the lambda which follows. The name parameter is available to all subsequently nested lambdas, as you can see in the final line of the function, which returns a tuple with each field of the record we are generating.

We use >> bind when the function we’re composing doesn’t produce any useful value. For example, putStr doesn’t produce a value which we want to use, so we discard it’s value with the pass-through bind (>>). We could also have discarded the result of putStr using regular bind by pattern matching _ in the subsequent lambda like this: putStr >>= (\_ -> …). This is an uncommon pattern however, so you should probably use >> instead of doing this.

Now let’s look at a function which will print the record to standard output:

prettyPrintRecord :: (String, String, Int) -> IO ()
prettyPrintRecord (name, menuItem, rewardPoints) = putStr "Customer: " >>
                                                   putStr name >>
                                                   putStr ", " >>
                                                   putStr menuItem >>
                                                   putStr ", " >>
                                                   print rewardPoints >>
                                                   putStr "\n"

Since this function is just a chain of putStr commands, we can just use >> bind for each printout.

We can call these two functions together in main like this using >>= bind to take the result of generateRecord, which has type IO (String, String, Int), and pass it to prettyPrintRecord, which expects a type of (String, String, Int) as an input:

main = generateRecord >>=
       prettyPrintRecord

Command line arguments

You can get the command line arguments using getArgs:

import System.Environment --Required for getArgs
getArgs :: IO [String]
getProgName :: IO String

getArgs returns a list of strings entered into the command line after the program name. getProgName returns the name of the program.

There are several libraries for parsing command line arguments in Hackage. At the time of writing, the two most popular are cmdargs and optparse-applicative.

Let’s look at an example where we print the name of the Curry House records file which is entered using a command line argument:

printRecordFileName :: [String] -> IO ()
printRecordFileName args = putStr "Record file: " >>
                           putStrLn (head args)

main = getArgs >>= (\args ->
       printRecordFileName args >>
       generateRecord >>=
       prettyPrintRecord)

head args gets the first element of the args list. Note that there is a slight problem with this code, if the user did not enter any arguments in the command line, head args will fail with an exception. We will fix this problem soon.

File I/O

Basic file I/O in Haskell is done using the withFile:

import System.IO --Required for withFile
withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r

withFile takes a FilePath, which is just a type synonym for String, an IOMode, which controls whether a file is read from, written to, or appended to, and a function from Handle to IO r, where r is a type variable which you choose. withFile evaluates to the value of the function.

Let’s have a quick look at IOMode:

data IOMode = ReadMode | WriteMode | AppendMode | ReadWriteMode

Using IOMode, you can specify what kind of access the program should have to the file.

withFile will open a file using the IOMode, execute the function provided as an argument on the file Handle, and close the file when it finishes executing the function.

You can read the size of a file in bytes using hFileSize and truncate or extend a file to a particular size using hSetFileSize:

hFileSize :: Handle -> IO Integer --Get the file size in bytes
hSetFileSize :: Handle -> Integer -> IO () --Set the file size in bytes

You can read from a file Handle using the following functions:

hGetChar :: Handle -> IO Char --Get one character from file
hGetLine :: Handle -> IO String --Get one line from file
hGetContents :: Handle -> IO String --Get the entire contents of a file into a string

You can also peek at the next character in the file without advancing the handle using hLookAhead:

hLookAhead :: Handle -> IO Char

If you are reading using hGetChar or hGetLine, you can detect when to stop by querying whether the handle is at the end of the file using hIsEOF:

hIsEOF :: Handle -> IO Bool

You can write to a file using the following functions:

hPutChar :: Handle -> Char -> IO () --Put a character into a file
hPutStr :: Handle -> String -> IO () --Put a string into a file
hPutStrLn :: Handle -> String -> IO () --Put a string into a file followed by a newline
hPrint :: Show a => Handle -> a -> IO () --Put a value of type a into a file

Let’s take a look at an example which just prints the contents of a file:

printRecordFileContents :: String -> IO ()
printRecordFileContents filePath
    = withFile filePath ReadMode (\handle ->
          hGetContents handle >>= (\contents ->
          putStrLn contents))

It is also possible to manually open and close a file using openFile and hClose:

openFile :: FilePath -> IOMode -> IO Handle --Open a file and get its handle
hClose :: Handle -> IO () --Close a file handle

I do not recommend using this method because it is possible that the program may hold on to the file handle, locking the file, after it is done with it.

There is also support for seeking in a file and using binary file I/O in Haskell. Look here for more details: System.IO.

The Maybe and Either monads

In Haskell, every function must evaluate to a value, which means that failing to evaluate to a value is a kind of side-effect. As a result, all error handling and invalid value handling must be handled using monads. In order to deal with this side-effect, Haskell provides the Maybe and Either monads:

data Maybe a = Just a | Nothing
    deriving (Eq, Ord)

data Either a b = Left a | Right b

A value of Maybe a can have a value constructor of Just a, indicating success, or Nothing, indicating failure. Either a b, can have a value constructor of Left a or Right b, with the Left a constructor representing a failed computation.

Let’s look at an example of the Either monad by fixing the problem we had earlier with head args:

printRecordFileName :: Either String String -> IO ()
printRecordFileName (Left err) = putStrLn err
printRecordFileName (Right fileName) = putStr "Record file: " >>
                                       putStrLn fileName

getFileName :: [String] -> Either String String
getFileName [] = Left "Failed to read file name."
getFileName (fileName : args) = Right fileName

main = getArgs >>= (\args ->
       printRecordFileName (getFileName args) >>
       generateRecord >>=
       prettyPrintRecord)

If the args list is empty, it matches the first pattern of getFileName, which evaluates to a Left value constructor with an error message. If the args list is not empty, it evaluates to a Right value constructor with the file name. Then printRecordFileName can handle the invalid input by matching against Left err and printing the error message instead of throwing an exception.

Let’s have a quick look at what this might look like if it were implemented with the Maybe monad:

printRecordFileName :: Maybe String -> IO ()
printRecordFileName Nothing = putStrLn "Error: File was not found."
printRecordFileName (Just fileName) = putStr "Record file: " >>
                                      putStrLn fileName

getFileName :: [String] -> IO (Maybe String)
getFileName [] = return Nothing
getFileName (fileName : args) = return (Just fileName)

main = getArgs >>=
       getFileName >>= --Evaluates to an IO (Maybe String)
       printRecordFileName >> --Takes a Maybe String as input, evaluates to IO ()
       generateRecord >>=
       prettyPrintRecord

One interesting feature of Maybe is that you can bind a value of Maybe to a lambda or function which evaluates to a Maybe and doesn’t match a Nothing pattern. Suppose we have a function:

divide5By :: Float -> Maybe Float
divide5By 0.0 = Nothing
divide5By x = Just (5.0 / x)

divide5By 0.0 >>= (\x -> Just (x * 10.0))

Notice how there is no error handling in the lambda? This is because the >>= bind operator for the Maybe monad just discards the function it binds to if the input to >>= is Nothing. As usual with bind, this only works for functions which evaluate to a Maybe value.

In other words, if divide5By evaluates to Nothing, the lambda never evaluates.

If any one of a >>= chain of functions in the Maybe monad evaluates to Nothing, then all subsequent operations are skipped and no invalid computations are performed.

You can do the exact same thing with the Either monad but you will get more information about the cause of the error:

divide5By :: Float -> Either String Float
divide5By 0.0 = Left "Divide by Zero"
divide5By x = Right (5.0 / x)

divide5By 0.0 >>= (\x -> Right (x * 10.0))

The Maybe and Either monads allow us to handle invalid computations which could evaluate to Nothing or Left with very little error handling code.

Conditional evaluation with if-then-else and guards

In day to day Haskell, you’ll have two basic options for working with conditionals, guards and if-then-else statements.

You’ve already seen an example of conditionals using guards in Modeling Generalized Behaviors and Imprisoning Side Effects. Guards allow us to define several conditional statements which are checked from the top down and if the condition evaluates to True, the statement is executed as the function body. Let’s look at an example of a function which uses guards to print “Dessert ordered” if a customer ordered a dessert item or “Starter ordered” if they ordered a starter:

orderType :: String -> IO ()
orderType menuItem
    | menuItem == "Ice cream" || menuItem == "Gulab Jamun" = putStr "Dessert ordered"
    | menuItem == "Pakora" || menuItem == "Mini samosas" = putStr "Starter ordered"
    | otherwise = return ()

Otherwise catches all cases which fall through the guards above.

If-then-else statements are the second way you can create a conditional expression in Haskell. You can insert an if-then-else anywhere you would put another statement or value.

You can not specify an if-then without the else statement in Haskell because all statements must evaluate to a value. If there was no else branch, then there would not be a valid value for the if statement to evaluate to when that branch is taken, so an else branch must always be specified.

Let’s look at an example of if-then-else in a function which prints “VIP member” if the customer has more than 50 reward points:

printIsVIP :: Int -> IO ()
printIsVIP rewardPoints
    = if rewardPoints > 50
      then putStr "VIP member"
      else return ()

Making local constants with let and where statements

Working with only recursive function calls in a function body can become very confusing, creating extremely long lines of code. In order to make your code more readable, you can define local constants using let and where statements.

We’ve seen where statements before in Modeling Generalized Behaviors and Imprisoning Side Effects. Where statements allow us to declare constants which can be referred to anywhere within a function. For example, we might want to have a function to compute a customer’s discounted price at Haskell’s Curry House:

computeCustomerDiscount :: Int -> Int -> Float
computeCustomerDiscount price rewardPoints
    | rewardPoints > 5 = (fromIntegral price) * discountRate * tax
    | otherwise = (fromIntegral price) * tax
        where discountRate = 0.15
              tax = 0.2

We can also use a let statement to define a set of local constants in a single sub-expression. This is an example of using a let expression in a function to print a raw tuple of the customer’s record to standard output:

printCustomerRecord :: (String, String, Int) -> IO ()
printCustomerRecord (name, menuItem, rewardPoints)
    = let stringRecord = show (name, menuItem, rewardPoints) in
          putStr stringRecord

Show is used here to convert the record into a string.

Pattern matching with case statements

Sometimes it’s useful to pattern match the type of a value inside a sub-expression. This is especially common when you’re processing Maybe values. You can use a case expression to handle this kind of pattern matching without having to write an extra function. As an example, here are a couple of functions which compute the total cost of the items a customer has purchased:

findItemCost :: String -> [(String, Float)] -> Maybe Float
findItemCost _ [] = Nothing
findItemCost itemName ((item, cost) : menuItems)
    | itemName == item = (let tax = 0.2 in Just (cost * tax))
    | otherwise = findItemCost itemName menuItems

getTotalCustomerSpent :: [(String, String, Int)] -> [(String, Float)] -> Float
getTotalCustomerSpent [] _ = 0.0
getTotalCustomerSpent ((name, menuItem, rewardPoints) : records) menuItems
    = let menuItemCost = (findItemCost menuItem menuItems) in
          case menuItemCost of
          Nothing -> getTotalCustomerSpent records menuItems
          Just cost -> cost + (getTotalCustomerSpent records menuItems)

Since findItemCost evaluates to a Maybe, we have to handle the case where the menuItemCost is Nothing. We could have created a separate function to perform this pattern matching, but the case expression lets us express the pattern-matching inline, making the code much more concise.

Do notation

Up to this point, all of our monadic code has used the >> and >>= bind operators to compose operations which use monads together. There is a kind of syntactic sugar called “do notation” which you can put on monadic operations to make them look kind-of like imperative code.

I find do notation confusing. I think it obscures what’s going on behind the scenes with your monads. At least one person agrees with me, as can be seen here: Do notation considered harmful.

I do not recommend using do notation in your code; however, you will inevitably encounter it in your work. Let’s take a look at what the generateRecord function would look like using do notation:

generateRecord :: IO (String, String, Int)
generateRecord = do putStr "Enter customer name:\n"
                    hFlush stdout
                    name <- getLine
                    putStr "Enter menu item:\n"
                    hFlush stdout
                    menuItem <- getLine
                    putStr "Enter number of reward points:\n"
                    hFlush stdout
                    rewardPoints <- readLn
                    return (name, menuItem, rewardPoints)

This should look very familiar to you if you’re an imperative programmer. It looks like we are doing assignments with the <- operations. But these aren’t assignments at all! We’re actually using >>= bind operations to create constants in a nested monadic context.

If you saw this code before knowing how monads work, you would be completely unable to debug what happens when you get a type mismatch, or when you have to deal with an IO Maybe nested monad.

This do notation code is literally run through a “desugarer” in GHC to turn it back into the code we saw in the first example of generateRecord.

Even though I’m not fond of do notation, it is very common to see it used in Haskell, so you should at least know about it.

The example code used in this post is available in this repository: https://github.com/WhatTheFunctional/EverydayHaskell.

Continue reading Working with Lists.

Modeling Generalized Behaviors and Imprisoning Side Effects

Generalizing behaviors with type classes

Often when writing code, you’ll find yourself writing functions for several data structures which have the same name and general structure. This results in a lot of code duplication, so you start thinking of ways to reduce the amount of code you’ll end up having to maintain. You may have heard of this concept being referred to as Don’t Repeat Yourself (DRY).
If you’re an imperative programmer, you start thinking about how to use the abstractions you are familiar with to deal with this problem, things like class hierarchies and generics.
Haskell has a simple way of capturing these shared behaviors in one place, using type classes. Let’s look at a simple type class:

type Distance = Float
type Resource = Float

class Vehicle v where
    travel :: v -> Distance -> (Distance, v)
    refill :: v -> Resource -> v

The type class has a parameter v, which represents the specific type. This type class defines the behaviors that are common to all vehicles; they travel some distance and eventually need to be refilled with some resource, like fuel or electricity, which they require to keep moving.
travel evaluates to a tuple with the new state of the vehicle and the distance which was traveled. refill just evaluates to a vehicle in its new refilled state.

Let’s look at a type that we might want to define as an instance of Vehicle:

data Gasoline = Gasoline Float
data Efficiency = Efficiency Float

data Car = Car Gasoline Efficiency

Car represents a car which has a tank full of gasoline and an efficiency in miles per gallon.

We can declare that Car behaves like a Vehicle by declaring that it is an instance of the Vehicle type class using the instance keyword:

instance Vehicle Car where
    travel (Car (Gasoline gas) (Efficiency efficiency)) distanceToTravel =
        (distanceTraveled,
         Car (Gasoline gasRemaining) (Efficiency efficiency))
            where gasToSpend = distanceToTravel / efficiency
                  gasDifference = gas - gasToSpend
                  gasRemaining = max gasDifference 0.0
                  gasSpent = gasToSpend - (max (-gasDifference) 0.0)
                  distanceTraveled = gasSpent * efficiency
    refill (Car (Gasoline gas) (Efficiency efficiency)) refillGas = Car (Gasoline (gas + refillGas)) (Efficiency efficiency)

The where expression here allows us to define local constants to be used in the function body. Simply create a list of constant names, followed by = and the constant definition. You can only use constants defined by where in the function body where they are defined. Local constants can be useful to avoid re-computing values by storing the result of a function call in the constant.
In order to declare that Car behaves like a Vehicle, we need to define the travel and refill functions for Car.
In the travel function, we define that the distance traveled can’t be any more than the gas in the tank will permit it to. The function also evaluates to a Car with some gas spent, ensuring that the gas can’t fall below zero.
We define refill for Car to simply evaluate to a Car with an increased amount of gas.

Since Vehicle represents a set of behaviors that are shared by multiple types, let’s look at another instance of Vehicle:

data Hamburger = Hamburger Int

data Unicycle = Unicycle Hamburger

instance Vehicle Unicycle where
    travel (Unicycle (Hamburger hamburgers)) distanceToTravel =
        (distanceTraveled,
         Unicycle (Hamburger hamburgersRemaining))
            where hamburgersToSpend = round distanceToTravel
                  hamburgersDifference = hamburgers - hamburgersToSpend
                  hamburgersRemaining = max hamburgersDifference 0
                  hamburgersSpent = hamburgersToSpend - (max (-hamburgersDifference) 0)
                  distanceTraveled = fromIntegral hamburgersSpent
    refill (Unicycle (Hamburger hamburgers)) hamburgersRefill = Unicycle (Hamburger (hamburgers + (round hamburgersRefill)))

It’s necessary to round down with round because we defined that the function should take a Float and evaluate to a Float in the type class.
Here we see a Unicycle which is powered by Hamburgers (the person riding the unicycle needs lots of calories to cover the distance). travel consumes one Hamburger per mile and refill only replenishes Hamburgers up to the nearest whole number.
This example demonstrates that the type instances of a type class can be completely different. They can have different value constructor type signatures and even different data, as long as they implement the interface defined by the type class.

Let’s look at how we can write a function which operates on a type class:

journey :: Vehicle v => (Distance, v) -> [Distance] -> (Distance, v)
journey (distanceTraveled, vehicle) [] = (distanceTraveled, vehicle)
journey (distanceTraveled, vehicle) (d : ds) =
    journey ((distanceTraveled + distanceTraveledNow), vehicleNow) ds
        where (distanceTraveledNow, vehicleNow) = travel vehicle d

You can see that there is a generic type v in the parameter list. Because we want to define that v is a Vehicle, we declare that v is an instance of the Vehicle type class in the section following the :: and before the => symbols in the function signature.
This function will work with any type that is an instance of Vehicle, so both of the following function evaluations are valid:

journey 0.0 (Car (Gasoline 50.0) (Efficiency 1.4)) [1.0, 2.0, 5.0, 10.0]
journey 0.0 (Unicycle (Hamburger 10)) [2.0, 3.0]

You can also define a function with multiple type classes. For example, suppose we have another type class Passenger, we could define a function which used both. The function signature might look something like this:

rideVehicle :: (Vehicle v, Passenger p) => v -> p -> Distance

Meta-types and type-classes

It’s possible for a meta-type to be an instance of a type class. For example, suppose we have defined a Hybrid meta-type that can take a type of fuel as a parameter:

data Hybrid fuel = Hybrid fuel

For now we’ll assume that it’s possible to get the efficiency of a fuel by evaluating:

efficiency fuel

We’ll also assume that there are toFloat and fromFloat functions for the fuel type.
We can make this an instance of the Vehicle type class in the same way that we did for Car and Unicycle above:

instance Vehicle (Hybrid f) where
    travel (Hybrid fuel) distanceToTravel =
        (distanceTraveled,
         Hybrid (fromFloat fuelRemaining))
            where fuelToSpend = distanceToTravel / (efficiency fuel)
                  fuelDifference = (toFloat fuel) - fuelToSpend
                  fuelRemaining = max fuelDifference 0.0
                  fuelSpent = fuelToSpend - (max (-fuelDifference) 0.0)
                  distanceTraveled = fuelSpent * (efficiency fuel)
    refill (Hybrid fuel) refillFuel = Hybrid (refuel fuel (fromFloat refillFuel))

There is a small problem with this Hybrid class, it is an instance of Vehicle which can take anything as fuel. It doesn’t make much sense to have a Hybrid contain Int or String as fuel. We should constrain the Hybrid so that it can only contain valid fuel.

The first step in constraining the fuel type is to make a type class for types which behave like fuel:

class Fuel f where
    fromFloat :: Float -> f
    toFloat :: f -> Float
    efficiency :: f -> Float
    refuel :: f -> f -> f

We’ve made it so that fuel efficiency can be measured with efficiency and that fuel can be refueled with refuel.

Let’s make an instance of this Fuel type class:

data Biofuel = Biofuel Float

instance Fuel Biofuel where
    fromFloat f = Biofuel f
    toFloat (Biofuel f) = f
    efficiency (Biofuel f) = 45.0
    refuel (Biofuel a) (Biofuel b) = (Biofuel (a + b))

Now we can use Biofuel as fuel for our Hybrid.

Let’s ensure that the fuel type parameter which we used in the instance declaration for Hybrid is a valid instance of Fuel. All we need to do is change the first line of the instance statement for Hybrid which we wrote before:

instance (Fuel f) => Vehicle (Hybrid f) where
    travel (Hybrid fuel) distanceToTravel =
        (distanceTraveled,
         Hybrid (fromFloat fuelRemaining))
            where fuelToSpend = distanceToTravel / (efficiency fuel)
                  fuelDifference = (toFloat fuel) - fuelToSpend
                  fuelRemaining = max fuelDifference 0.0
                  fuelSpent = fuelToSpend - (max (-fuelDifference) 0.0)
                  distanceTraveled = fuelSpent * (efficiency fuel)
    refill (Hybrid fuel) refillFuel = Hybrid (refuel fuel (fromFloat refillFuel))

The difference here is that we declare the type class of f in the section between the instance keyword and the => symbol.
Now we have a compile time constraint that f must be an instance of the type class Fuel if we evaluate travel or refill.

Subclassing

Sometimes, we want to create a type class which has all of the behaviors of an existing type class but adds more. For example, suppose we want to have a type class PassengerVehicle which adds two more behaviors to board and disembark passengers, we could make the following type class:

class (Vehicle v) => PassengerVehicle v where
    board :: v -> Int -> v
    disembark :: v -> Int -> v

As you can see, we declare the superclass between class and the => symbol.
In addition to board and disembark, instances of the PassengerVehicle type class will have to define all of the behaviors of its superclass, Vehicle.

Common type classes

Let’s look at some common type classes in Haskell.

Eq

Eq represents types which can be compared to one another to determine if they are equal. Eq defines two behaviors, == and /=, which we’ve seen before.

class  Eq a  where
        (==), (/=)  ::  a -> a -> Bool
        x /= y  = not (x == y)
        x == y  = not (x /= y)

Eq demonstrates another type of constraint that can be applied to a type class. You can write Boolean constraints defining the relationship between behaviors in a type class. Here, the Haskell standard defines that == and /= are Boolean complements of one another for any pair of instances of the Eq class. If you define an instance of Eq, you should make sure that this is true for your instance.
Some instances of Eq include: Bool, Int, Integer, Float, Double, Char, List, and Tuples.

Ord

Ord represents types which can be greater or less than one another. In other words, it is possible to order instances of this type class from least to greatest with a total ordering. Ord is a subclass of Eq, as you can see, a is both an Ord and an Eq in the class declaration.
This type class depends on a type called Ordering, which has five value constructors, EQ, LT, GT, LE, GE, representing equal to, less than, greater than, less than or equal to, and greater than or equal to.

class  (Eq a) => Ord a  where
    compare              :: a -> a -> Ordering
    () :: a -> a -> Bool
    max, min             :: a -> a -> a
    compare x y | x == y    = EQ
                | x <= y    = LT
                | otherwise = GT
...

We can see some new syntax here in the compare function. The | symbols are used to define guards, which we will cover in the next post.

Some instances of Ord include: Bool, Int, Integer, Float, Double, Char, List, and Tuples.

Read and Show

Read and Show represent types which can be converted from and to String respectively.

class  Read a  where
...

class  Show a  where
    show      :: a -> String 
...

You can convert an instance of Show to a String using the show function:

show 5 --This evaluates to "5", or equivalently ['5']

You can convert an instance of Read from a String into the instance type using the read function:

read "5" + 50 --This evaluates to the integral value 55

Some instances of Read and Show include: Bool, Int, Integer, Float, and Double.

Enum

Enum represents types which can be ordered into a sequence.

class  Enum a  where
    succ, pred     :: a -> a
    toEnum         :: Int -> a
    fromEnum       :: a -> Int
    enumFrom       :: a -> [a]            -- [n..]
    enumFromThen   :: a -> a -> [a]       -- [n,n'..]
    enumFromTo     :: a -> a -> [a]       -- [n..m]
    enumFromThenTo :: a -> a -> a -> [a]  -- [n,n'..m]

You can get the predecessor of a value with the pred function:

pred 2 --This evaluates to 1

You can get the successor of a value with the succ function:

succ 'a' --This evaluates to 'b'

Some instances of Enum include: Int, Integer, and Char.

Bounded

class  Bounded a  where
    minBound, maxBound :: a

Bounded represents types which have a minimum and maximum value.
Instances of Bounded include: Int, Integer, Char, Float, and Double.

Making instances of standard type classes

It’s very easy to make your own type into an instance of a standard type class. For example, let’s make our Unicycle and Hamburger types into instances of standard types:

data Hamburger = Hamburger Int deriving (Show, Eq, Ord, Bounded)

data Unicycle = Unicycle Hamburger deriving (Show)

This is way easier than using the instance keyword!
While it is possible to create your own type classes in such a way that someone can use deriving to make instances of them, learning how to do so is out of the scope of this post. (Check out the Data.Derive package if you really want to know how).

Don’t fire the missiles by accident

Up to this point, all of the functions we’ve written have been referentially transparent and side-effect free. As I said before, writing useful software requires side-effects, but we don’t want to have to sacrifice the guarantees we’ve had up to this point about our code not having unexpected behaviors.

Suppose that, hypothetically, the designers of Haskell allowed side effects in some ordinary functions. Many other functional languages permit side-effects this way. Let’s imagine that an alternate reality Haskell had been designed with a rule like “All functions with side effects must start with an _”. This seems like a reasonable idea, but consider the following hypothetical example where a careless programmer copied a function from a missile silo into a theme park control system:

_fireTheMissiles :: Bool -> ()
_fireTheMissiles = ... --It's very bad if this happens by accident

--Compute how much money the theme park made today
computeTicketPriceTotal :: [Float] -> Float
computeTicketPriceTotal [] = ...
computeTicketPriceTotal t : ts =
    ...
    (_fireTheMissiles (thisTicketPrice < 5.0)) --This has a side effect!
    ...

Although the programmer defined the missile firing function using our _ rule to mark that the function has some scary side effects, the function can still be called inside another function which is supposed to be side-effect free. If the function computeTicketPriceTotal is large, the programmer might miss the fact that the _fireTheMissiles function may be evaluated, which would be bad. Even if the programmer searches all of their code for _fireTheMissiles and tries to figure out when it should and shouldn’t be called, the fact that the function can be passed as a parameter into another function makes this problem very difficult to solve.

It would be much better if the type system could prevent this from ever happening. This is exactly what the designers of Haskell built into the language.

There is a way we can have side effects and still be able to tell at a glance whether a function is pure or whether you need to worry about side effects, we can use monads.

Finally, monads

Here’s the idea behind monads. We want to be able to tell at a glance what types of side effects or unexpected behaviors might be present in a function. Furthermore, we want to ensure that whenever a side effect could have happened, it can’t get lost by being passed to another function or data.
Think of side effects as polluting your program. Anything which has been touched by a side effect or receives an input from a function which had a side effect could potentially have unexpected behavior.
Monads are prisons for side-effects; when the result of a function with a side-effect enters a monad, it can never leave. The way this is achieved in Haskell is by using a type class, called Monad.

Let’s look at Monad:

class Monad m  where
    (>>=)  :: m a -> (a -> m b) -> m b
    (>>)   :: m a -> m b -> m b
    return :: a -> m a
    fail   :: String -> m a

    m >> k =  m >>= \_ -> k
    fail s = error s

This looks a little confusing at first, so let’s break it down.

return takes a value which isn’t in the monad and puts it in the monad. For example, the following function puts the String “Elephant” into the IO monad (IO is an instance of Monad which is used to represent Input and Output side effects):

returnElephant :: IO String
returnElephant = return "Elephant"

The type of the expression return “Elephant” here is IO String.

The >>= operator, called the “bind” operator, takes a value which is in the monad, m a, and function (a -> m b) as a parameter. The function (a -> m b) takes a value which is not in the monad as an input and returns a value that is in the monad. For example, the function putStr :: String -> IO () takes a String which is pure and returns IO (). As a side effect, putStr prints the String to the standard output.

putStr "Elephant" --This prints the string "Elephant" to standard out
return "Elephant" >>= putStr --This also prints the string "Elephant" to standard out

>>= allows us to pass a value which is in the monad to a function which takes a value that isn’t in the monad as an input. Because of >>=, most functions which produce monadic values are written as if they only take pure values, significantly reducing unnecessary code duplication.
We’re using >>= as an infix operator here. >>= here takes the output of return “Elephant”, which has type IO String, and the function putStr, which has type String -> IO () as an input. The first thing that happens is that “Elephant” is put into IO and then it is bound to the input of putStr which prints it to screen.

The >> operator, which I like to call a “pass-through bind”, takes two monadic values and evaluates to a monadic value. This function is useful when you want to compute several monadic values in order, but the first monadic value is not consumed by the function which produces the second monadic value. For example, putStr and return both evaluate to IO values. If we want to call them one after another, we can use >> like this:

putStr "Elephant" >> return 10 --This evaluates to IO Integer

This prints the string “Elephant” to the standard output and then puts the integer 10 into the IO monad. The function evaluates to IO Integer because that’s the type of return 10.

fail is only used during pattern matching in a do-notation block, so we’ll cover it later.

Here’s an example of a function which takes a name and an age and prints it to the standard output:

printHappyBirthday :: String -> Int -> IO ()
printHappyBirthday name age = putStr "Happy birthday " >>
                              putStr name >>
                              putStr "! You're " >>
                              return (show age) >>=
                              putStr >>
                              putStr " now.\n"

\n is an escape character that causes the standard output to print a new line.
When we call:

printHappyBirthday "Shaun" 18

the program prints the following message to the console:

Happy birthday Shaun! You're 18 now.

Notice that, in general, there is no way to extract a value out of a monad. Also, any function which operates on a value which has been raised into a monad or computed using a monad has to include the monad meta-type’s name in its function signature. Consider this function signature:

printStrings :: [String] -> IO ()

Just by looking at it, we can tell that it produces I/O side effects.

Typically Haskell programs are split into two parts: functions with side-effects, and pure functions. It’s a good idea to limit the number of functions which deal with side-effects and test them thoroughly.

Hello world

Let’s look at an application of the IO monad:
The function putStrLn :: String -> IO () takes a String and evaluates to an IO () and as a side-effect, it prints the String to standard output.
Now we can finally make our first compiled application. Write the following into a file called hello.hs:

main = putStrLn "Hello World!"

main :: IO () is the standard entry point for a Haskell executable.
Compile the file using this command:

ghc hello.hs

Congratulations! You just made a program using Haskell called hello which prints “Hello World!” to the command line.

Continue reading Everyday Haskell.

The Type Language

Types are different in Haskell

Types are the foundation upon which Haskell is built. Haskell uses a type system which is based on Algebraic Data Types (ADTs) instead of classes and templates. ADTs are so important in Haskell that almost none of its other features could exist without them.
Unlike type systems in most imperative languages, you don’t build a hierarchy of types in Haskell. Instead you declare specific types and then declare them as instances of type classes (We’ll cover type classes in my next post).

A simple type

The best way to learn how types in Haskell work is to dive right in and make some. Let’s create a type called PizzaTopping to represent all of the kinds of Pizza topping you can order at a Pizza place:

data PizzaTopping = Mushroom | BellPepper | Salami | Anchovy | Pepperoni

The list of words separated by | symbols after the = sign are called value constructors and represent the values that an instance of PizzaTopping can take. Type names and value constructors in Haskell must start with a capital letter. These value constructors are actually functions which evaluate to a PizzaTopping (similar to the factory idiom in OOP). For example, here is the type signature of the Mushroom value constructor:

Mushroom :: PizzaTopping

Mushroom takes no parameters and evaluates to a PizzaTopping. Let’s write a function that takes a PizzaTopping as a parameter to see how pattern matching works with ADTs. This function evaluates to True if the Pizza topping is vegan and False if it isn’t:

isVegan :: PizzaTopping -> Bool
isVegan Mushroom = True
isVegan BellPepper = True
isVegan Salami = False
isVegan Anchovy = False
isVegan Pepperoni = False

Then we can evaluate the function like this:

isVegan BellPepper --This will evaluate to True

As you can see, you can use pattern matching to determine which value constructor was used to create the PizzaTopping.
As a side note, since we didn’t match _ here, your program will throw an exception if you don’t evaluate the function with one of the valid value constructors.

Compound types

Since value constructors are functions, they can have parameters! So we can build a type from a set of other sub-types by adding parameters to the type’s value construtors. For example, let’s define a type for representing the tire pressure of a vehicle:

data TirePressure = MotorcyclePressure Float Float | CarPressure Float Float Float Float

Notice that we are able to store pressure readings for both motorcycles and cars using a single type. You can extract the parameters from a TirePressure value by pattern matching. Let’s create a function which takes a TirePressure value and an amount of pressure to add to all of the tires on a vehicle and then evaluates to the new tire pressures:

fillTires :: TirePressure -> Float -> TirePressure
fillTires (MotorcyclePressure a b) p = MotorcyclePressure (a + p) (b + p)
fillTires (CarPressure a b c d) p = CarPressure (a + p) (b + p) (c + p) (d + p)

and we can evaluate this function like this:

fillTires (MotorcyclePressure 10.0 9.0) 2.0 --This evaluates to (MotorcyclePressure 12.0 11.0)

Recursive types

Yes, you read that right. Haskell lets you define data types recursively by referring to themselves in their value constructors. This turns out to be very useful for creating complex data structures. Consider the following type, which represents a Pizza:

data Pizza = PizzaBaseAndSauce | WithTopping PizzaTopping Pizza

You can see that we can create a plain pizza using the PizzaBaseAndSauce value constructor:

PizzaBaseAndSauce --This is a valid Pizza

But we can also create a Pizza with a topping using the WithTopping value constructor. In order to do this, we have to pass a valid Pizza into the second parameter recursively. Let’s make a Pizza with Salami:

WithTopping Salami PizzaBaseAndSauce

Since PizzaBaseAndSauce is the only non-recursive value constructor, we have to use it as the base type of the recursion.
Let’s make one more Pizza with Mushroom and Anchovy:

WithTopping Mushroom (WithTopping Anchovy PizzaBaseAndSauce) --This still has a type Pizza

You can operate on recursive data types using recursive functions and pattern matching. Let’s make a function which will count the number of toppings on a Pizza:

countToppings :: Pizza -> Int
countToppings PizzaBaseAndSauce = 0
countToppings (WithTopping _ ts) = 1 + (countToppings ts)

Remember how _ matches with everything? In this case the _ symbol is used to match a parameter that we are not interested in to discard it. We use “ts” to pattern match the rest of the pizza toppings to be processed. It’s common in Haskell to add an “s” to your parameter for the rest of a recursive data structure to indicate that it’s a plural. For example, if you have a parameter “x” then the rest of your recursive data structure would be pattern matched as “xs”.
Here’s how you can evaluate the function on a plain pizza:

countToppings PizzaBaseAndSauce --This evaluates to 0

and here’s how you can count the number of toppings on a pizza with bell peppers and mushrooms:

countToppings (WithTopping BellPepper (WithTopping Mushroom PizzaBaseAndSauce)) --This evaluates to 2

Recursive data types are very commonly used for collections in Haskell.

Meta-types

Haskell lets you define types that are derived from other types by adding parameters to your type declarations. For example, let’s make a type called Box, which could be empty or could contain something:

data Box a = EmptyBox | BoxWith a

You can create an empty box like this:

EmptyBox --This has the type Box a, where a is undefined

and a box with an Int in it like this:

BoxWith 'a' --This has the type Box Char

We can write functions that take meta-types as parameters:

boxIsEmpty :: Box a -> Bool
boxIsEmpty EmptyBox = True
boxIsEmpty (BoxWith _) = False

Note that it’s impossible to write a function which evaluates to whatever is in the box, because you can’t define a result type which is valid in both the case where the box is empty and where it’s full.
A function in Haskell must always evaluate to a value of its return type. When programming in Haskell, we consider a failure to evaluate to a value of the type in the function signature to be a side-effect. I’ll explain how to get around this problem in my next post on monads.
The concept of meta-types (also known as parameterized types) may take some time to get used to if you haven’t worked with generic types in another language, but it’s a very powerful feature of Haskell.

Some common ADTs

Let’s look at some built in ADTs that you can use in Haskell.

Primitive types

The following primitive types are built in to Haskell. They aren’t all ADTs, but let’s look at what their definitions would look like if they were.

data Bool = False | True
data Char = 'a' | 'b' | ...

There are two types of integers: Int, which represents a fixed precision integer, and Integer which represents an arbitrary precision integer.

data Int = -2^29 | ... | -2 | -1 | 0 | 1 | 2 | ... | 2^29-1
data Integer = ... | -3 | -2 | -1 | 0 | 1 | 2 | 3 | ... 

There are two types of floating point numbers: Float, which represents a single precision IEEE 754 floating point number, and Double, which represents a double precision IEEE 754 floating point number.

data Float = -maxValue | ... | -2.0 | ... | -1.0 | ... | 0.0 | ... | 1.0 | ... | 2.0 | ... | maxValue 
data Double = -maxValue | ... | -2.0 | ... | -1.0 | ... | 0.0 | ... | 1.0 | ... | 2.0 | ... | maxValue 

List, the ADT poster child

Lists are a built-in type in Haskell, and they’re a great example of a recursive ADT. Lists in Haskell are defined as follows:

data  [a]  =  [] | a : [a]

Lists are defined in a similar way to Pizza above, with a base value constructor called [] and a recursive definition with elements separated by : characters. All elements of a list must have the same type. The following are all valid lists:

[]
'a' : ('b' : [])
1 : (2 : [])

Since Lists are a special ADT, you can leave out the nested parentheses when defining them:

1 : 2 : 3 : [] == 1 : (2 : (3 : [])) --This evaluates to True

Lists also have a special syntax to make it easier to define a list. You can write a list as a comma separated list of elements surrounded by square brackets. For example:

[1, 2] == 1 : 2 : [] --This evaluates to True

You can append an element to a list using the : operator:

1 : [2, 3] == [1, 2, 3] --This evaluates to True

Lists can be concatenated using the ++ operator, although this is a linear time operation:

[1, 2] ++ [3, 4] == [1, 2, 3, 4] --This evaluates to True

You can pattern match against lists in functions like any other recursive ADT. The following function computes the sum of the elements in a list of Ints:

sumInts :: [Int] -> Int
sumInts [] = 0
sumInts (x : xs) = x + (sumInts xs)

sumInts [1, 2, 3] --This evaluates to 6

You can also get the name of the whole list in pattern matching like this:

intFunction xs@(x : xt) = ...

Here, xs matches the whole list, including the first element, x, and the tail of the list, xt.

Some other ADTs

Let’s consider some other useful ADTs you can use which are not built-in to Haskell.

Enumerations

We’ve already seen an example of a useful ADT pattern when we created PizzaTopping. We can create a type which enumerates a set of potential symbols by specifying a data type which doesn’t have any parameterized value constructors:

data Colors = Red | Green | Blue

Trees

It’s pretty simple to define trees by using a recursive ADT. Let’s look at a type definition for a binary tree:

data BinaryTree a = EmptyNode | TreeNode a (BinaryTree a) (BinaryTree a)

We can create a binary tree like this:

TreeNode 10 (TreeNode 5 EmptyNode EmptyNode) (TreeNode 15 EmptyNode EmptyNode)

Maps

We can create a simple map using a list of key/value pairs:

data ListMap k v = ListMapEnd | ListMapJoin k v (ListMap k v)

or alternatively, a binary search tree of key/value pairs:

data TreeMap k v = TreeMapEmpty | TreeMapNode k v (TreeMap k v) (TreeMap k v)

Then we can implement search for a key and return a value (we’re using a list here as a result type so that the function can evaluate to an empty result, later we’ll learn about how to use a special monad for this):

--Function to test if the key matches the current ListMap node's key and return the value if it does
matchKey :: (Eq k) => Bool -> (ListMap k v) -> k -> [v]
matchKey True (ListMapJoin key value map) _ = [value]
matchKey False ListMapEnd _ = []
matchKey False (ListMapJoin key value map) searchKey = findKey map searchKey

--Function to find the value associated with a key in the ListMap
findKey :: (Eq k) => (ListMap k v) -> k -> [v]
findKey ListMapEnd _ = []
findKey (ListMapJoin key value map) searchKey = matchKey (key == searchKey) (ListMapJoin key value map) searchKey

The (Eq key) => syntax here is necessary because we are using the == operator with a generic parameter. We’ll cover this when we cover type classes in the next post.
If the TreeMap is constructed as a binary search tree, it can be searched in a similar way.
Implementing a hash-table based map in Haskell is more complicated and requires mutable data structures and side-effects, which we’ll cover soon.

Non-empty lists

We can define a list which must have at least one element like this:

data NEList a = NERoot a | NEJoin a (NEList a)

Then we can create a single element non-empty list as follows:

NERoot 5

or a multi-element non-empty list like this:

NEJoin 15 (NEJoin 10 (NERoot 5))

Because of how we defined our non-empty list type, it’s actually impossible to even define an empty NEList.
Notice how we are able to use Haskell’s powerful type system to enforce a constraint, that the list must be non-empty, at compile time? This is one of my favorite examples of how useful Haskell’s type system can be.

Mutually recursive types

Say we want to have a special list that alternates between storing Ints and Chars, we can define one using mutual recursion:

data IntCharList = NilInt | IntCharJoin Int CharIntList
data CharIntList = NilChar | CharIntJoin Char IntCharList

The following expression is valid:

IntCharJoin 5 (CharIntJoin 'a' NilInt)

but the following expression will result in a compilation error:

IntCharJoin 5 (IntCharJoin 10 NilInt)

Graphs?

The next obvious ADT to cover is graphs. It turns out that representing graphs requires a more in-depth understanding of Haskell’s laziness. We’ll cover graphs in a later post after we cover lazy evaluation. Instead, let’s keep our eyes on the prize and focus on getting to the point where we can make side-effects.

Miscellaneous types

There are a couple of other types in Haskell which don’t fit the standard ADT definition. I’ll mention them here because you’re sure to encounter them when you code.

Tuples

Tuples are special data types in Haskell. They are declared as a comma separated list of types in parentheses. Tuples can contain multiple different sub-types. For example, here is a tuple containing an Int and two Chars:

(1, 'a', '2')

Each tuple is its own value constructor. The above example has the type:

(1, 'a', '2') :: (Int, Char, Char)

The empty tuple (), also called the Unit data type, is used almost everywhere to represent nil, or an empty result. It’s defined as follows:

data () = ()

A 2-tuple is called a Pair.
You can access the first element of a Pair by calling the fst function:

fst (1, 'a') --This evaluates to 1

You can access the second element of a Pair by calling the snd function:

snd (1, 'a') --This evaluates to 'a'

Type synonyms

We can define a synonym for a type by doing the following:

type Miles = Float

This is useful for quickly defining new types but you need to be careful with types matching unexpectedly. For example, consider the following:

type Fahrenheit = Float
type Celsius = Float

freezingPointImperial :: Fahrenheit 
freezingPointImperial = 32.0

freezingPointSI :: Celsius
freezingPointSI = 0.0

itIsFreezing :: Fahrenheit -> Bool
itIsFreezing f = (f <= 32.0)

Because type synonyms are just references to other types, you can make an erroneous evaluation like this by passing a Celsius value into a function which expects Fahrenheit as a parameter:

itIsFreezing freezingPointSI

If we had used the data keyword to define these types, it would have resulted in a compile time error, preventing this mistake.
You can also declare a synonym for a meta type constructor, like this:

type IntBox = Box Integer

or declare parameterized type synonyms:

type Container a = Box a

Since no type synonyms have constructors, these are really only useful for making function parameters more readable. For example instead of writing:

incrementBox :: Box Integer -> Box Integer

we can write:

incrementBox :: BoxInt -> BoxInt

The String synonym

Strings in Haskell are defined as follows:

type String = [Char]

String literals can be written in double quotes for convenience.

"Hi there!" --This is equivalent to ['H', 'i', ' ', 't', 'h', 'e', 'r', 'e', '!']

Continue reading Modeling Generalized Behaviors and Imprisoning Side Effects.

Getting Started with Haskell

The journey to functional

I’m going to take a different approach from most other beginner resources to teaching you how to learn Haskell. I’m going to take you on a journey from imperative to pure functional as quickly as I can and then I’ll cover how to make side-effects. Eventually we’ll work our way to using the interpreter and the infamous do-notation.

If you haven’t read the introduction to this blog series yet, you can read it at Haskell for the Imperative.

The basics

All of the basic arithmetic operations work the same way in Haskell as they do in many imperative languages. The following are all valid Haskell expressions:

2 + 5
x * y / 5.0
(2 + foo) * 3

The letters and words here represent named constants, which always start with a lower case letter. There are no variables in Haskell, only constants. If Haskell had variables, they would be mutable, which could break the rule of referential transparency.

For an imperative programmer, the lack of mutable variables seems very strange. You may wonder how it is possible to compute something without a mutable variable. I will explain how in the next section.

One quirk of Haskell’s syntax is that the negation operator has to be surrounded by parentheses:

2 * (-1)

This is because – negation is the only prefix operator in Haskell. All other operators are infix or suffix operators.

Boolean values are represented using True and False.

Everything following a -- on a line is a comment.

The comparison operators include equality (==), inequality (/=), less than (<), less than or equal to (<=), greater than (>), and greater than or equal to (>=):

a < 3 --If a is less than 3, this evaluates to True, otherwise it evaluates to False
5 /= 10 --This evaluates to True

The Boolean operators are &&, ||, and not:

True && (not False) --This evaluates to True
5 < (-1) --This evaluates to False

Infix operators are surrounded by back quotes, for example the modulo operator is an infix operator:

a `mod` 5 == 0

Integer division is performed using `div` and `quot`. `div` is truncated towards negative infinity and `quot` is truncated towards zero.

You can get the remainder from division using the `rem` operator.

Computation by evaluation and substitution

Haskell programs perform computations in a different way from imperative programs. The model for how Haskell performs computations is that evaluation happens by repeatedly substituting a sub-expression with its value. This way, no mutable variables are necessary. For example, consider the following expression:

((10 + 15) `mod` 2) * (-1)

The first step involved in computing the expression is to evaluate the sub-expression (10 + 15) and substitute its value, 25, in place:

(25 `mod` 2) * (-1)

Next, the sub-expression (25 `mod` 2) is evaluated and substituted:

1 * (-1)

Finally, the sub-expression 1 * (-1) is evaluated and substituted to give the result:

-1

All computation in Haskell, even function evaluation, recursion, and conditionals work by substituting sub-expressions. Notice that this is quite different from an imperative program, in which the program goes through a number of steps one after another to perform the computation. Every expression in Haskell can be replaced by the (one-and-only) constant value that it evaluates to.

How to make your own functions

In Haskell, functions are written with at least two statements, the type-signature of the function, followed by the definition of the function. The type-signature of a function is very important in Haskell because much of the power of Haskell comes from its impressive type system. A type-signature in Haskell consists of a function name (which must start with a lower-case letter) followed by a :: operator, followed by a set of parameter types separated by -> operators. The final type in the list of types separated by -> is the type that the function will evaluate to. For example:

addTwoMulTwo :: Int -> Int

is a function called addTwoMulTwo which takes a single integer as a parameter and evaluates to an Int.

A function definition in Haskell consists of the same function name and a list of argument names followed by an = and an expression that defines what the function evaluates to:

addTwoMulTwo a = (a + 2) * 2

The type of the function body must match the last type in the -> list because that is the type that the function will evaluate to.

Because Haskell is referentially transparent and has no side-effects by default, this function can not do anything other than using function substitution to transform its input into its output value.

In order to apply this function to a value, you simply write the function name followed by its argument list:

addTwoMulTwo 5

Let’s evaluate this function using substitution. First we look up the definition of the function and substitute the function definition for the function sub-expression, including substituting the arguments in wherever they are used in the function body:

(5 + 2) * 2

and then we continue substitution until a final value is found:

7 * 2
14

Notice that there is no return statement in addTwoMulTwo, this is because functions in Haskell don’t execute and return, they evaluate. There is no concept of a program counter in Haskell, or executing one instruction after another. There is also no call stack in the Haskell execution model because functions are not called, they are substituted.

It is possible to have nested function sub-expressions by using the value of one function as a parameter for another:

addTwoMulTwo (addTwoMulTwo 5)

Here, we must surround the function name and its parameters in parentheses to make it clear which function the parameter 5 belongs to (the inner-most function).

Pattern matching for specialization

In Haskell, it is possible to provide several function bodies which specialize a function to its parameters. In order to specialize a function using pattern matching, you put in a specific pattern of inputs, which can contain constants or other patterns (which I will explain in a later post). Here is an example function which uses pattern matching to accept a commit (evaluate to True) in a version control system if it receives a character ‘y’ or ‘Y’ as a parameter but reject it (evaluate to False) if the function receives a ‘n’ or ‘N’:

acceptCommit :: Char -> Bool
acceptCommit 'y' = True
acceptCommit 'Y' = True
acceptCommit 'n' = False
acceptCommit 'N' = False
acceptCommit _ = False

When evaluating acceptCommit, Haskell attempts to match the pattern of the input parameter from the top of the list down and when it finds a matching pattern, it performs a substitution. The _ character matches anything and acts as an “otherwise” option. Since the evaluation happens from the top down, you should make the top pattern the most specific and the bottom pattern the most general.

This may be very unusual for you if you’ve spent most of your time programming in an imperative language, but it makes it possible to write very compact functions when dealing with many possible inputs.

Functions with generic parameters

Even more strange is that Haskell lets you write functions which match generic parameters by using names which start with lower case letters in place of parameter types. For example:

doNothing :: aNothing -> aNothing
doNothing x = x

In this function, aNothing is a type variable and will pattern match against any type. Note that x is not dynamically typed, it is always possible for the compiler to determine the type of x at compile time.

The function can be used with any type, for example:

doNothing 10 --Evaluates to 10
doNothing 'a' --Evaluates to 'a'

Since all of Haskell’s built in functions impose constraints on parameters (which we’ll talk about in my next post), it is very unusual to see a function which takes only generic parameters.

Built in types

In the section above, you encountered some of the built-in types which Haskell provides.

Bool represents boolean values, which can take a value of either True or False.

Char represents a single unicode character. Char literals are written in single quotes, like ‘a’. We’ll cover String, which is a list of Chars in the next post.

To see the type of a value in the interpreter, you can type :t followed by the value:

:t 'a'
'a' :: Char

This shows us that the character literal ‘a’ has the type Char.

Haskell supports various numeric types:

Int represents a fixed precision integer, usually a 32 or 64 bit integer.
Integer represents an arbitrary precision integer.
Float represents an IEEE 754 single precision floating point number.
Double represents an IEEE 754 double precision floating point number.
Ratio Int and Ratio Integer represent fixed precision and arbitrary precision rational numbers respectively.
Complex Float and Complex Double represent single and double precision complex numbers respectively.

All Numeric types are in a type class called Num.
We’ll cover what exactly type classes are in the third tutorial post. For now, you can think of a type class as a group of types which share a property.
For example, the literal 5 has a type of Num:

:t 5
5 :: Num p => p

The p here is a place-holder for a specific type in the type class Num. Since 5 could be either an Int or an Integer, the interpreter uses p as a placeholder for these types.

Depending on what function 5 is used in later, the interpreter may decide that 5 has a type Int or Integer.

Once a type is fixed for the literal, it has that type for the rest of the program. For example, if 5 is passed to a function which expects a fixed precision Int, then it has type Int for the rest of the program after that point.

You can specify what particular type a number should have by using :: followed by the type. For example:

:t (7 :: Int)
(7 :: Int) :: Int

This specifies that 7 is a fixed precision Int, and not an arbitrary precision Integer.

All integral types are in a type class called Integral, all non-integral types are in a type class called Fractional, all rational numbers are in a type class called Rational, and all numbers which have floating point precision are in a type class called Floating.

Types which are both Real and Fractional are also in the type class RealFrac and types which are both Real and Floating are also in the type class RealFloat.

It’s important that you know that these type classes exist when you’re trying to debug your code because certain functions and operators will expect their inputs to be in a certain type class.

In practice, this means that certain arithmetic functions can take two values of a particular type and return a completely different type. For example, although 3 and 5 have Integral types, (3 / 5) has a Fractional type:

:t (3 / 5)
(3 / 5) :: Fractional a => a

If you use the + operator following this with another Integral value 6, you will get a compile error:

(3 / 5) + (6 :: Integer)

:1:2: error:
    * No instance for (Fractional Integer) arising from a use of `/'
    * In the first argument of `(+)', namely `(3 / 5)'
      In the expression: (3 / 5) + (6 :: Integer)

This error message is complaining that you are using the / operator with a value which is not a Fractional as an input. / expects two Fractional inputs and there is no (Fractional Integer) type because Integer is not in the Fractional type class.

This whole chain of type mismatches is caused because (6 :: Integer) forced the + operator to be a sum of two Integer values and then (3 / 5) would have to be an Integer.

In order to deal with arithmetic functions returning different types, you need to use type conversions.

Type conversions

Haskell includes several functions to handle type conversions:

fromIntegral converts from any Integral to any Num type.

There are special cases for converting to and from arbitrary precision Integers:
fromInteger converts from an Integer to any Num type.
toInteger converts from any Integral to an Integer.

realToFrac converts from any Real type to any Fractional type.

There are special cases for converting to and from Rational types:
fromRational converts from a Rational to any Fractional type.
toRational converts from any Real type to a Rational.

float2Double converts from a Float to a Double.
double2Float converts from a Double to a Float.

Since conversions from Real Fractional numbers to Integral numbers are inherently lossy, you have to specify how you want the rounding errors to be expressed by calling ceiling, floor, truncate, or round.

For example:

round 3.7 --This evaluates to 4

Now we can fix our previous example where we added 6 :: Integer to (3 / 5):

floor (fromIntegral 3 / fromIntegral 5) + (6 :: Integer)

Getting things done with recursion

Up to this point, it has only been possible to operate on small amounts of data and computations. In order to deal with large amounts of data or computations in Haskell you need to use recursion. Because iteration requires mutable state in the form of a counter, it is not available in Haskell.

As an example of a recursive function in Haskell, let’s consider the Ackermann function:

ackermann :: Integer -> Integer -> Integer
ackermann x 0 = ackermann (x - 1) 1
ackermann 0 y = y + 1
ackermann x y = ackermann (x - 1) (ackermann x (y - 1))

This function is recursive because the definition of ackermann refers to the ackermann function itself. Let’s consider the first few recursions of (ackermann 2 3). The results of the substitutions are listed in order next:

ackermann 2 3
ackermann (2 - 1) (ackermann 2 (3 - 1))
ackermann (2 - 1) (ackermann 2 2)

Note that here we could have chosen to evaluate (2 – 1) first and the result would still be the same because of referential transparency. This fact will become very important when we consider concurrency later. Let’s continue:

ackermann 1 (ackermann 2 2)
ackermann 1 (ackermann (2 - 1) (ackermann 2 (2 - 1))
ackermann 1 (ackermann 1 (ackermann 2 (2 - 1)))
ackermann 1 (ackermann 1 (ackermann 2 1))
ackermann 1 (ackermann 1 (ackermann (2 - 1) (ackermann 2 (1 - 1))))
ackermann 1 (ackermann 1 (ackermann 1 (ackermann 2 (1 - 1))))
ackermann 1 (ackermann 1 (ackermann 1 (ackermann 2 0)))

The inner-most sub-expression now matches the first pattern, so we substitute the first pattern next:

ackermann 1 (ackermann 1 (ackermann 1 (ackermann (2 - 1) 1))))

and so on…

What about conditionals?

We already have all of the mechanisms for performing conditional branching using pattern matching. In the Ackermann function above, we choose the appropriate body to substitute based on the value of the inputs to the function.
You can implement a simple branch on a Boolean expression by matching against True and False. The following function will only evaluate to a + 2 if the first parameter evaluates to True:

addTwoConditional :: Bool -> Int -> Int
addTwoConditional True a = a + 2
addTwoConditional False a = a

We can evaluate this function as follows to make an expression that only evaluates to a + 2 if a < 3:

addTwoConditional (a < 3) a

There is some other syntax in Haskell for performing conditional branching, but since we want to get to making side-effects as quickly as we can, we’ll cover that in a later post.

A taste of functional programming

Haskell is a functional programming language, which means that functions are types too. You can pass a function as an argument to another function just like you would pass a variable.

Let’s write a function to compute the shipping and handling cost for a package from the USA. The function will take another function as a parameter called rate. rate is used to compute what the shipping rate is for a particular country.

Here’s what the function looks like:

totalShippingAndHandlingCost :: (Float -> Float) -> Float -> Float -> Float
totalShippingAndHandlingCost rate weight handlingRate = rate (weight * handlingRate)

As you can see, the first parameter of totalShippingAndHandlingCost has the type (Float -> Float), which is the type of a function which takes a Float as a parameter and evaluates to a Float.

Now let’s create functions to use to define the shipping rate for different countries:

usaRate :: Float -> Float
usaRate weight = weight * 15.0

australiaRate :: Float -> Float
australiaRate weight = weight * 50.0

We can evaluate totalShippingAndHandlingCost with either usaRate or australiaRate as a parameter, depending on which country we’re shipping to:

totalShippingAndHandlingCost usaRate 10.0 1.10 --This evaluates to 165.0

I’ll be covering more about functional programming in a later post.

Lambdas

Finally, let’s cover lambda functions, which are also known as anonymous functions. Lambdas are functions which aren’t named and are declared on the line where we call them. They are typically used for one-off functions which are passed to other functions as parameters. For example, here’s a lambda representing a shipping rate function, similar to the functions usaRate and australiaRate which we defined previously:

\weight -> weight * 25.0

As you can see, you define a lambda by starting with a \ character, then a parameter list, a -> symbol, and finally a function body.
Although lambdas are commonly used in Haskell for functional programming, they don’t have an explicit type signature, so you should avoid using them in places where type checking is important. Since this function has no name, we can’t call it after this statement, so the code snippet above would be useless on its own.
Here’s an example of how we can use a lambda as a parameter to totalShippingAndHandlingCost:

totalShippingAndHandlingCost (\weight -> weight * 25.0) 10.0 1.10

This has the advantage that we didn’t have to write a full function definition, but it has the disadvantage that we can’t tell what the purpose of the lambda is just by looking at it.
At this point, you should be able to write any program you want that evaluates without side-effects.

Continue reading The Type Language.