Making a Text Adventure in Haskell (Part 2)

Consuming Tokens

Last time, in Making a Text Adventure in Haskell (Part 1), I explained how to generate a list of tokens from a sentence by using look-ahead to extract the important words we’re interested in. This time, I’ll cover how I made sense of sentences by consuming the list of tokens which the lexer produced.

Pattern Matching Sentences

Most parsers are based on a data structure called an Abstract Syntax Tree (AST). In the case of this text adventure, the grammar that I decided on is so simple that I don’t even need to build one. I can simply pattern match the kinds of sentences that I want to accept. Although this is somewhat limiting, it can handle most of the kinds of sentences that a player will use to interact with the game.
Here’s the data structure I used to store sentences:

data Sentence = Phrase Token |
                SimpleSentence Token Token |
                SimplePrepositionSentence Token Token Token |
                ComplexSentence Token Token Token Token |
                ComplexPrepositionSentence Token Token Token Token Token deriving (Show, Eq)
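
To give a rough sense of what each constructor is for, here are some hypothetical player commands (illustrative only; the actual vocabulary comes from the lexer in Part 1) and the shape of Sentence they would produce:

-- "jump"                     -> Phrase verb
-- "take lantern"             -> SimpleSentence verb noun
-- "look at lantern"          -> SimplePrepositionSentence verb preposition noun
-- "put lantern on table"     -> ComplexSentence verb noun preposition noun
-- "look at lantern on table" -> ComplexPrepositionSentence verb preposition noun preposition noun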

As you can see, there are five kinds of sentences I want to support. Let’s look at the parsing function and then we’ll drill down to how each type of sentence is created.

Here is the parseSentence function, which takes the TokenMatch list as an input and evaluates to a list of Sentences:

parseSentence :: [TokenMatch] -> [Sentence]
parseSentence [(TokenMatch _ t0), (TokenMatch _ t1), (TokenMatch _ t2), (TokenMatch _ t3), (TokenMatch _ t4)]
    = makeSentence [(verbsInTokenList t0),
                    (prepositionsInTokenList t1),
                    (nounsInTokenList t2),
                    (prepositionsInTokenList t3),
                    (nounsInTokenList t4)]
parseSentence [(TokenMatch _ t0), (TokenMatch _ t1), (TokenMatch _ t2), (TokenMatch _ t3)]
    = makeSentence [(verbsInTokenList t0),
                    (nounsInTokenList t1),
                    (prepositionsInTokenList t2),
                    (nounsInTokenList t3)]
parseSentence [(TokenMatch _ t0), (TokenMatch _ t1), (TokenMatch _ t2)]
    = makeSentence [(verbsInTokenList t0),
                    (prepositionsInTokenList t1),
                    (nounsInTokenList t2)]
parseSentence [(TokenMatch _ t0), (TokenMatch _ t1)]
    = makeSentence [(verbsInTokenList t0),
                    (nounsInTokenList t1)]
parseSentence [(TokenMatch _ t0)]
    = makeSentence [(verbsInTokenList t0)]
parseSentence _ = []

The parseSentence function is arranged to match the more complex sentence structures first, with pattern matching proceeding down to simpler sentences. Let’s look at the second-to-last non-trivial case, which parses the simplest kind of full sentence, a verb followed by a noun:

parseSentence [(TokenMatch _ t0), (TokenMatch _ t1)]
    = makeSentence [(verbsInTokenList t0),
                    (nounsInTokenList t1)]

The pattern takes a list containing exactly two TokenMatches and evaluates to a list of SimpleSentences. Remember that t0 and t1 here are lists of Tokens. Since I assume that the subject of the sentence is always the player, the only things we need to match for a simple sentence are the verb and the object, which are found using (verbsInTokenList t0) and (nounsInTokenList t1). verbsInTokenList and nounsInTokenList can match multiple tokens, so they also return lists of Tokens. These are then passed into makeSentence, which builds the list of SimpleSentences.
All other parseSentence patterns are structured in the same way.
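
For instance, assuming (as in Part 1) that a TokenMatch pairs the matched word with the list of Tokens it matched, a one-word command with a hypothetical verb token like this:

parseSentence [TokenMatch "look" [TokenVerb ["look", "see"]]]

evaluates to:

[Phrase (TokenVerb ["look", "see"])]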

Let’s look at verbsInTokenList which tries to find a list of verbs in the Token list:

verbsInTokenList :: [Token] -> [Token]
verbsInTokenList [] = []
verbsInTokenList ((TokenVerb synonyms) : ts) = (TokenVerb synonyms) : verbsInTokenList ts
verbsInTokenList (_ : ts) = verbsInTokenList ts

verbsInTokenList takes a list of Tokens and evaluates to a list of Tokens. 

The function recursively processes the list: when it pattern matches a TokenVerb, it prepends that (TokenVerb synonyms) to the result and continues the recursion to search for more verbs to add to the list. Any other token type is skipped.

verbsInTokenList outputs a list because more than one verb Token may have matched a single word in the input string.

nounsInTokenList and prepositionsInTokenList work the same way.
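
For example, given a mixed list of tokens (reusing tokens that appear later in this post), verbsInTokenList keeps only the verbs:

verbsInTokenList [TokenVerb ["look"], TokenPreposition ["over", "across"], TokenVerb ["pass out", "faint"]]

evaluates to:

[TokenVerb ["look"], TokenVerb ["pass out", "faint"]]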

So we have a list of verbs and a list of nouns. These are then passed to makeSentence:

makeSentence :: [[Token]] -> [Sentence]
makeSentence []
    = []
makeSentence [verbs]
    = fmap (\verb -> Phrase verb) verbs
makeSentence [verbs, nouns]
    = fmap (\[verb, noun] -> SimpleSentence verb noun) 
      ((:) <$> verbs <*>
           ((:) <$> nouns <*> [[]]))
makeSentence [verbs, prepositions, nouns]
    = fmap (\[verb, preposition, noun] -> SimplePrepositionSentence verb preposition noun)
      ((:) <$> verbs <*>
          ((:) <$> prepositions <*>
              ((:) <$> nouns <*> [[]])))
makeSentence [verbs, nouns0, prepositions, nouns1]
    = fmap (\[verb, noun0, preposition, noun1] -> ComplexSentence verb noun0 preposition noun1)
      ((:) <$> verbs <*>
          ((:) <$> nouns0 <*>
              ((:) <$> prepositions <*>
                  ((:) <$> nouns1 <*> [[]]))))
makeSentence [verbs, prepositions0, nouns0, prepositions1, nouns1]
    = fmap (\[verb, preposition0, noun0, preposition1, noun1] -> ComplexPrepositionSentence verb preposition0 noun0 preposition1 noun1)
      ((:) <$> verbs <*>
          ((:) <$> prepositions0 <*>
              ((:) <$> nouns0 <*>
                  ((:) <$> prepositions1 <*>
                      ((:) <$> nouns1 <*> [[]])))))

This is the most complicated function we’ve seen so far, so let’s work from the simplest pattern up to the more complex ones.

First note that makeSentence takes a list of lists of Tokens and evaluates to a list of Sentences. This is because multiple combinations of nouns, verbs, and prepositions may be valid matches for the sentence.

For example, the sentence “I look over the edge” matches the following tokens:

(TokenVerb ["look"]) (TokenPreposition ["over", "across"]) (TokenNoun ["edge"])

but it also matches the following tokens:

(TokenVerb ["look"]) (TokenPreposition ["above", "over"]) (TokenNoun ["edge"])

The difference here is subtle, but in the first match, “over” is used in the sense of “expressing passage or trajectory across”, whereas in the second match, “over” is used in the sense of “extending directly upward from” and is a synonym for “above”.
Both matches are valid for the purposes of parsing: the two tokens are syntactically identical, even though their meanings differ.

All combinations of Tokens which match the syntax are valid at this stage of parsing.

We’ll discuss how to deal with this combinatorial explosion of Tokens in Haskell soon, but first let’s look at the simplest non-empty makeSentence pattern:

makeSentence [verbs] = fmap (\verb -> Phrase verb) verbs

fmap is the generic map function and it takes two parameters, the first of which is a function and the second of which is a Functor. A Functor is just something which can be mapped over. (Put another way, a Functor is something which you can call fmap on).
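
For instance, fmap works the same way on any Functor, not just on lists:

fmap (+ 1) [1, 2, 3] == [2, 3, 4]
fmap (+ 1) (Just 2)  == Just 3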

It turns out that lists are Functors. When you call fmap on a list, it returns another list whose elements are the result of applying the function to each element of the input list. Let’s look at what might happen if we applied makeSentence to this list of verbs:

[(TokenVerb ["pass out", "faint"]), (TokenVerb ["pass out", "distribute"])]

Notice that both TokenVerbs are a valid match for the phrasal verb “pass out”.

fmap (\verb -> Phrase verb) [(TokenVerb ["pass out", "faint"]), (TokenVerb ["pass out", "distribute"])]

This would be converted into:

[(\verb -> Phrase verb) (TokenVerb ["pass out", "faint"]),
 (\verb -> Phrase verb) (TokenVerb ["pass out", "distribute"])]

After applying the lambdas, this would be:

[Phrase (TokenVerb ["pass out", "faint"]), Phrase (TokenVerb ["pass out", "distribute"])]

which is the list of sentences we want.

Now let’s consider the next pattern in makeSentence:

makeSentence [verbs, nouns]
    = fmap (\[verb, noun] -> SimpleSentence verb noun) 
      ((:) <$> verbs <*> ((:) <$> nouns <*> [[]]))

There is some unusual syntax involving the (:) list constructor (“cons”) operator, <$>, and <*>. This syntax is related to a type class called Applicative, which represents a category of types called applicative functors. We’ll go into this in more detail in my Haskell tutorial later, but for now, let’s look at how this syntax helps us compute sentences.
First, let’s look at the lambda in this pattern of makeSentence and see what’s different about it:

(\[verb, noun] -> SimpleSentence verb noun)

Notice that the lambda takes a list as input, instead of a single value as it did in the Phrase pattern of makeSentence.

Neither the list verbs nor the list nouns contains lists of Tokens as elements (they both have the type [Token]). Also, fmap only takes one list as an input, not two. From these two facts, you can infer that I’ve taken the two values, verbs and nouns, each of type [Token], and combined them into a single value of type [[Token]], where each element is a two-element list: a verb followed by a noun.

The expression which performs this combination is:

((:) <$> verbs <*> ((:) <$> nouns <*> [[]]))

<$> is the infix version of fmap, and <*> is often called “apply”. To understand how they work together, let’s look at the inner-most nested sub-expression:

(:) <$> nouns <*> [[]]

This code takes each element of the first list, nouns, and combines it with each element of the second list, [[]], using the (:) operator.

Let’s look at a worked example of <$> and <*> using two lists of Ints:

(+) <$> [1, 2] <*> [3, 4]

This evaluates to:

[(+) 1, (+) 2] <*> [3, 4]

Notice that (+) has been distributed to each element in [1, 2]. By evaluating <$>, we have applied (+) to each element of [1, 2]. In general, <$> applies an ordinary function, which is not inside an Applicative, to the values inside an Applicative.

Let’s continue evaluating:

[(+) 1 3, (+) 1 4, (+) 2 3, (+) 2 4]
    == [1 + 3, 1 + 4, 2 + 3, 2 + 4]

Now we’ve distributed all of the elements in [(+) 1, (+) 2] over all of the elements of [3, 4], producing all possible combinations. By evaluating <*>, we have applied each element of [(+) 1, (+) 2] to each element of [3, 4]. In general, <*> applies functions which are inside an Applicative to values inside another Applicative.

Another way of looking at this is that the resulting list contains all possible results of combining each element in the first list with each element in the second list.
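
If you swap (+) for the pair constructor (,), this Cartesian-product behavior is even easier to see:

(,) <$> [1, 2] <*> [3, 4]
    == [(1, 3), (1, 4), (2, 3), (2, 4)]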

We can use this Applicative list behavior to create the list of lists of all possible combinations of verbs and nouns!

Let’s go back to the previous expression:

(:) <$> nouns <*> [[]]

This prepends each noun to the empty list [] using the (:) operator.

It’s impossible for two nouns to match a single word in the game, but for the sake of creating an example, let’s suppose we had two nouns, “tree” and “cat”, in the list:

(:) <$> [TokenNoun "tree", TokenNoun "cat"] <*> [[]]

[(:) (TokenNoun "tree"), (:) (TokenNoun "cat")] <*> [[]]

[(:) (TokenNoun "tree") [], (:) (TokenNoun "cat") []]

[(TokenNoun "tree") : [], (TokenNoun "cat") : []]

[[TokenNoun "tree"], [TokenNoun "cat"]]

Now we have a list of lists of Tokens, but each sub-list still contains only a noun. We need to add the verbs into the sub-lists, and we’ll use the Applicative list behavior to do so.

Let’s continue our example using the two “pass out” verbs we defined previously:

(:) <$> verbs <*> ((:) <$> nouns <*> [[]])

(:) <$> verbs <*> [[TokenNoun "tree"], [TokenNoun "cat"]]

(:) <$> [TokenVerb ["pass out", "faint"], TokenVerb ["pass out", "distribute"]] <*> [[TokenNoun "tree"], [TokenNoun "cat"]]

[(:) (TokenVerb ["pass out", "faint"]), (:) (TokenVerb ["pass out", "distribute"])] <*> [[TokenNoun "tree"], [TokenNoun "cat"]]

[(:) (TokenVerb ["pass out", "faint"]) [TokenNoun "tree"],
 (:) (TokenVerb ["pass out", "faint"]) [TokenNoun "cat"],
 (:) (TokenVerb ["pass out", "distribute"]) [TokenNoun "tree"],
 (:) (TokenVerb ["pass out", "distribute"]) [TokenNoun "cat"]]

[(TokenVerb ["pass out", "faint"]) : [TokenNoun "tree"],
 (TokenVerb ["pass out", "faint"]) : [TokenNoun "cat"],
 (TokenVerb ["pass out", "distribute"]) : [TokenNoun "tree"],
 (TokenVerb ["pass out", "distribute"]) : [TokenNoun "cat"]]

[[TokenVerb ["pass out", "faint"], TokenNoun "tree"],
 [TokenVerb ["pass out", "faint"], TokenNoun "cat"],
 [TokenVerb ["pass out", "distribute"], TokenNoun "tree"],
 [TokenVerb ["pass out", "distribute"], TokenNoun "cat"]]

The result has type [[Token]], which is exactly what the fmap call takes as an input!

In effect, the list Applicative operations let you express combinatorial computations in a very small amount of code. This takes some getting used to, but once you wrap your head around the concept, you can perform a lot of work with a few lines of Haskell.

Let’s look at what happens when you apply the fmap to this:

fmap (\[verb, noun] -> SimpleSentence verb noun) ((:) <$> verbs <*> ((:) <$> nouns <*> [[]]))

fmap (\[verb, noun] -> SimpleSentence verb noun)
    [[TokenVerb ["pass out", "faint"], TokenNoun "tree"],
     [TokenVerb ["pass out", "faint"], TokenNoun "cat"],
     [TokenVerb ["pass out", "distribute"], TokenNoun "tree"],
     [TokenVerb ["pass out", "distribute"], TokenNoun "cat"]]

[(\[verb, noun] -> SimpleSentence verb noun) [TokenVerb ["pass out", "faint"], TokenNoun "tree"],
 (\[verb, noun] -> SimpleSentence verb noun) [TokenVerb ["pass out", "faint"], TokenNoun "cat"],
 (\[verb, noun] -> SimpleSentence verb noun) [TokenVerb ["pass out", "distribute"], TokenNoun "tree"],
 (\[verb, noun] -> SimpleSentence verb noun) [TokenVerb ["pass out", "distribute"], TokenNoun "cat"]]

[SimpleSentence (TokenVerb ["pass out", "faint"]) (TokenNoun "tree"),
 SimpleSentence (TokenVerb ["pass out", "faint"]) (TokenNoun "cat"),
 SimpleSentence (TokenVerb ["pass out", "distribute"]) (TokenNoun "tree"),
 SimpleSentence (TokenVerb ["pass out", "distribute"]) (TokenNoun "cat")]

The other makeSentence patterns work on the same principle.
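
Incidentally, each of those nested (:) <$> ... <*> [[]] chains builds the same list of combinations that the standard sequence function produces when given a list of lists. I’ve kept the explicit chains in the code above, but the two-list case could equivalently be written as:

makeSentence [verbs, nouns]
    = fmap (\[verb, noun] -> SimpleSentence verb noun) (sequence [verbs, nouns])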

Once the sentence has been parsed it’s returned to the text adventure, where it will be fed into a narrative graph, allowing the user to interact with the world. We’ll cover the narrative graph in my next blog post, so stay tuned!

The code for the text adventure is available at https://github.com/WhatTheFunctional/HaskellAdventure.

Continue reading: Making a Text Adventure in Haskell (Part 3).
