Lab 5: Using Randomness

Posted on November 6, 2017

Randomness

A common requirement for computer programs is that there should be some randomness in the behaviour of the program. For this you need a random number generator, which can generate an unlimited sequence of random numbers. Typically this these numbers are not truly random, but pseudo-random.

A pseudo-random number generator generates a sequence of numbers based on an initial state $s$ using a function $f$. The states are $s_1 = f(s)$, $s_2 = f(f(s))$, $s_3 = f(f(f(s)))$, etc. Each random number is then derived from each of these intermediate states (typically by taking a small proportion of the 0’s and 1’s from the state).

The $f$ function has to be chosen so that subsequent states have no obvious correlation, so for most intents and purposes the numbers generated are random, but the sequence of numbers is in fact entirely determined by the initial state $s$.

In many programming languages, the random number generator state is hidden from the programmer, but because of the purity of Haskell, the random number generator state must explicitly be passed through your program.

The following ghci session shows how the function random from the System.Random module can be used.

If you only installed Haskell Core Platform, you will need to download the module:

$ cabal install random

Pop into GHCi.

$ ghci
Prelude> import System.Random
Prelude System.Random> :t random
random :: (Random a, RandomGen g) => g -> (a, g)
Prelude System.Random> g <- newStdGen -- create new random number generator
Prelude System.Random> let (x, g') = random g :: (Int, StdGen)

We had to specify the type above so random knows what type to generate.

Prelude System.Random> x
-4940845671478547204
Prelude System.Random> let (x, _) = random g :: (Int, StdGen)
Prelude System.Random> x
-4940845671478547204

Oops, reusing g gives us the same number again!

Prelude System.Random> let (x, g'') = random g' :: (Int, StdGen)
Prelude System.Random> x
-7259815437984538120

Much better! Using g' gives us the next number.

The State Monad Overview

The State Monad is a fairly general monad that lets you maintain state between functions without having to explicitly pass the state from one function to the next.

This is particularly useful when using random numbers in many parts of a Haskell program, as otherwise we would have to explicitly pass the random number generator in and out of every function that needed it.

There is a standard implementation of the State monad in a package called Control.Monad.State. However, that implementation is rather complex so we have provided a simple version of the State monad in the provided file State.hs.

State is a Monad, so you can use it with do notation. An instance of the state monad is a function of type s -> (a, s), where s is the type of the state, and a is the “real” return type of the function. The way State is defined, the state portion is implicitly passed between functions unless explicitly modified. The functions get and put play a special role, as they allow you to fetch and update the state. The below example illustrates how these work in a do block.

import State

type StringState a = State String a

testState :: StringState Int
testState = do
    initState <- get -- Pull out current state
    let newState = "Hello " ++ initState
    put newState -- Replace old state
    return 1 -- this will be the value that is returned along with state
             -- testState is then a value of type StringState Int.

We can’t just invoke this testState definition—it’s actually a wrapped function. (We will expound on this later.)

$ ghci StringState.hs State.hs
Ok, modules loaded: State, Main.
*Main> :t testState
testState :: StringState Int
*Main> testState

:1:0:
    No instance for (Show (StringState Int))
      arising from a use of `print' at :1:0-8
    Possible fix:
      add an instance declaration for (Show (StringState Int))
    In a stmt of an interactive GHCi command: print it

Instead, we can use runState to pull the function out of State, and then provide an initial value for the state to kick it off.

*Main> :t runState testState
runState testState :: String -> (Int, String)
*Main> runState testState "bob"
(1,"Hello bob")

Notice how we effectively get two return values from running the state monad: the actual returned value, and the final state. Make sure that you understand how the result above was obtained.

“Hiding” the State

The get and put examples above are important to understand, but recall that the lecture notes call the State monad the “elephant in the room” monad: useful for when something is so pervasive that it’s cleaner to hide it away.

Your code in today’s lab will actually only use get and put in a few places (in only one place, in fact). Most of your code will rely on the bind (>>=) function to hide the transmission of state between the parts of your program.

Lab Tasks

Pull your repository (git pull). There are several files. You will only need to modify Lab5.hs

Familiarize yourself with the definition of the function rand in RandState.hs. The function rand generates numbers in a huge range (what is it?).

You can test this function from ghci by creating a new random number generator, and then using the runRandom function. E.g.:

[~/labRand]$ ghci RandState.hs State.hs
Ok, modules loaded: RandState, State.
*RandState> gen <- newStdGen -- Create new random number generator
*RandState> runRandom rand gen :: Int
1256390094846286344
*RandState> gen <- newStdGen
*RandState> runRandom rand gen :: Double
0.5556050250436191

See RandState.hs to see how runRandom is implemented.

In Lab5.hs implement a new function randR :: Random a => (a, a) -> RandState a that is like rand, but generates numbers in a specified range. Use the function randomR :: RandomGen g => (a, a) -> g -> (a, g) in System.Random. You can copy and paste the body of rand from RandState.hs as a starting point.

Using randR, implement a function rollTwoDice which generates a random sum of two dice (so between 2 and 12, but be sure to roll the virtual dice separately to get the right distribution i.e. 7 is the most common number). Choose the appropriate type signature.

Shuffling Cards

In this section we will write code to randomly shuffle a standard deck of cards. You should see this in your starter file:

-- Data types to represent playing cards
data CardValue
    = King
    | Queen
    | Jack
    | NumberCard Int  -- From 1 to 10
    deriving (Show, Eq)

data CardSuit
    = Hearts
    | Diamonds
    | Spades
    | Clubs
    deriving (Show, Eq)

data PlayingCard =
    PlayingCard CardValue CardSuit
    deriving (Eq)

type Deck = [PlayingCard]

-- fullCardDeck is a deck of cards, 52 in total, with a King, a Queen,
-- a Jack and NumberCards from 1 to 10 for each suit.
fullCardDeck :: Deck
fullCardDeck =
    [ PlayingCard v s | v <- allVals, s <- allSuits ]
    where
        allVals  = King : Queen : Jack : [ NumberCard i | i <- [1..10] ]
        allSuits = [Hearts, Diamonds, Spades, Clubs]

Implement a function removeCard, which will be monadic. Given a list of PlayingCards, it should pick a random card out of the list, return the card and the list with the card removed (as a tuple potentially). This is a basic function that can help you shuffle a deck. A common interview question is to implement deck shuffling. A deck shuffling function should induce a uniform probability distribution over all permutations: i.e. all orderings of the cards should be equally likely. You will want to write this recursively, handling the base case for an empty deck, and then using an inductive step: using removeCard and then recursing.

Implement a function shuffleDeck which takes a list of playing cards, [PlayingCard]. Test that it returns random permutations. To write this, start by considering the case with one card. Then consider the case with two cards. Then consider the case with three cards. Try to see if you can implement the three card case by using the two card function. Test the uniformity of shuffles by looking at small decks and checking statistics.

Implement a function shuffleADeck which generates and outputs a shuffle of full deck of 52 cards. Test that it outputs that many cards.

Putting it together

At the bottom of the starter code you will see a block of commented out code. Uncomment the code and then implement shuffleNTimes and rollTwoDiceNTimes so that your final program can be run from the command line like this:

$ ./Lab5 shuffle 3
[6♠,4♠,7♠,5♣,2♣,K♣,K♦,K♠,Q♥,9♥,5♥,9♣,7♣,4♦,8♣,4♣,6♦,5♠,5♦,10♠,10♦,8♦,10♣,7♥,1♦,J♥,6♣,2♥,J♦,7♦,Q♦,8♠,9♠,3♣,1♥,J♣,9♦,3♥,4♥,2♦,Q♠,1♠,K♥,2♠,Q♣,J♠,3♠,8♥,1♣,10♥,3♦,6♥]
[J♣,1♠,3♦,3♠,6♣,5♣,6♥,1♣,K♥,9♦,J♦,10♦,6♠,J♠,5♦,8♠,J♥,7♠,3♥,9♠,2♥,4♦,8♣,4♠,K♣,1♦,K♠,Q♦,2♠,4♥,10♥,10♠,6♦,9♣,10♣,3♣,Q♠,2♣,7♦,5♠,K♦,7♥,Q♣,1♥,7♣,4♣,9♥,8♦,8♥,5♥,Q♥,2♦]
[Q♦,3♣,2♥,9♠,Q♣,5♦,7♦,5♠,6♠,2♣,5♥,9♥,6♦,6♣,1♠,6♥,10♦,1♦,3♦,2♠,5♣,7♣,3♠,9♣,1♣,4♠,Q♥,8♦,Q♠,7♠,10♠,8♠,4♦,J♥,K♥,8♣,4♣,J♦,J♠,K♠,2♦,9♦,K♦,7♥,10♣,K♣,8♥,1♥,J♣,10♥,3♥,4♥]

$ ./Lab5 rollTwoDice 5
6
6
4
8
7

Make sure your lab behaves this way—it is what you will be graded on. Make sure each deck shuffle list appears on its own line exactly as the above.

Note that each iteration above should start with the random number generator produced from the last iteration. For deck shuffling, each iteration should shuffle an ordered deck (not the deck from the previous iteration).

Constraints on Your Submission

The goal of the lab is to use the State monad to “automatically” thread the random number generators through your code. So, pretty much every function that uses random numbers should be part of the State monad (i.e. the function returns some kind of RandState). In the State monad, the bind (>>=) function will handling transmitting the state behind the scenes.

To figure out if a function needs to be in the RandState monad, ask yourself, “If I call this function twice in sequence, do I want different results the second time?” If yes, the function needs to be monadic so the generator is properly transmitted from one call to the next.

Also, the function itself should be used in a monadic context so calls to it are properly glued together with bind (>>=).

Thus, most of your code should be in the RandState monad. And it should rely on the monad so you do as little explicit management of the state as possible.

Therefore:

Your lab should only call newStdGen once. (And it’s already in the starter code.)
Your code should only use get and put once each.
Your code should only call runState (or evalState or execState or runRandom*) twice: once in shuffleNTimes and once in rollTwoDiceNTimes (or once in a helper of each function). runRandom is the interface between the IO monad and the RandState monad. Look at the provided StateExamples.hs file for a good example of using this interface.

If you find yourself using runState or runRandom in every function, or are unpacking the generator in every random function, you are defeating the purpose of the State monad and will lose points. Handling the transmission of state is bind’s job. See the tutorial in the next section.

*Instructor hint: in the shuffleNTimes and rollTwoDiceNTimes functions you will probably want runState because runRandom or evalState discard the state (the generator), but you will want both the result and the new generator.

Where is the Generator?

What is the type of the provided rand function?

rand = do
    gen <- get
    let (x, gen') = random gen
    put gen'
    return x

Well, we need to somehow get an initial generator into this function, so maybe one of the parameters is a generator.

But as written, this function has no named arguments. And, if you look at the provided code, the type annotation is essentially:

rand :: RandState Int

This function only returns something? How does a generator get into this function? Why isn’t a generator one of the arguments? What’s going on?

The answer has to do with how Haskell represents state.

You might think that, to represent state, Haskell might have some datatype with some fields that could be modified as the program executed. However, Haskell adheres to the discipline of purity: nothing can change once declared.

Haskell takes a stranger route:

newtype State s a = State {
    runState :: s -> (a,s)
}

Which is essentially:

newtype State s a = State (s -> (a,s))

So State is just a wrapper for an s -> (a,s) function—a function that takes an initial state and returns a value and a new state.

Haskell doesn’t really represent state at all—it represents transformations on state.

Back to our rand function, we said its type signature was essentially

rand :: RandState Int

From RandState.hs we see type RandState a = State StdGen a, so we can de-sugar this:

rand :: State StdGen Int

So the rand function will return some value that looks like:

State (\gen -> ...some code that returns (int, gen') )

Aha, there’s the missing generator! That’s why the generator is not an argument to the function: the rand function isn’t actually doing any computation, it’s just building a state transforming function. rand returns a wrapped function that is waiting for an initial generator.

So how does all this:

rand :: RandState Int
rand = do
    gen <- get
    let (x, gen') = random gen
    put gen'
    return x

return a wrapped s -> (a,s) function?

Remember, do notation is de-sugared to calls to bind (>>=). All monad magic is in bind. The State monad’s bind definition is hard to follow, but you should at least convince yourself that it is “secretly” handling the state for you, setting up a function that sequences the actions:

ma >>= f = State $ \s ->
    let (a,t) = runState ma s
        mb = f a
        (b,u) = runState mb t
    in (b,u)

and you should notice that, indeed, it returns a value of the form State (s -> (a, s)).

So bind composes our State (s -> (a,s)) actions into more complicated State (s -> (a, s)) actions. The basic returned function is always s -> (a,s), a function waiting for that initial state before it will actually do all the computation.

So how might be use all this? How would we write, say, a function that returns two random numbers in a tuple?

randTwo = ...

The first question is: what is the type signature?

If we write the function in sequence, should it return different values? It’s a random number function, so, yes, it should. This means it needs different generators each time which means it should be monadic. We’ll let the RandState monad hide the transmission of the generators.

randTwo :: ... -> RandState (...)
randTwo = ...

What should be the return value in the monad? Well, just pair of numbers.

randTwo :: ... -> RandState (Int, Int)
randTwo = ...

Do we need to pass any arguments to randTwo? Well, the initial generator maybe, but wait…no. We’re returning RandState (Int, Int) which is
State StdGen (Int, Int) which represents values that look like
State (\gen -> ...some code that returns ((Int, Int), gen) ).

Our randTwo will return a wrapped function that’s waiting for a generator. We don’t need an explicit generator in our type declaration—it’s part of RandState. We’re just building up a computation to be performed later.

randTwo :: RandState (Int, Int)
randTwo = ...

Okay, what’s next? Well, we’re in the RandState monad because we want our state transformations to “automatically” be composed properly by bind. So this function should be monadic, let’s start by writing do.

randTwo :: RandState (Int, Int)
randTwo = do
    ...

Now let’s get our random numbers.

randTwo :: RandState (Int, Int)
randTwo = do
    let r1 = randR
    let r2 = randR

Hmm. This seems not to work. We want an Int each time but randR is a RandState Int. Maybe we need to use runRandom to actually execute randR’s state transformer.

randTwo :: RandState (Int, Int)
randTwo = do
    let r1 = runRandom randR gen
    let r2 = runRandom randR gen

Oh, now we need a generator. Maybe we can use newStdGen, wait, no, that’s an IO monadic function and we’re in a different kind of monad. Oh, maybe we can pull the generator out of the state!

randTwo :: RandState (Int, Int)
randTwo = do
    gen <- get
    let r1 = runRandom randR gen
    let r2 = runRandom randR gen

Oh shoot. It’s using the same generator both times—we’re going to get the same numbers for r1 and r2…hmmm….

STOP STOP STOP!

Too much work. We’re doing bind’s job. The whole point of the monad was so that bind could take care of sequencing the generators for us. Make the code fully monadic. Let the do notation glue together our actions with bind.

randTwo :: RandState (Int, Int)
randTwo = do
    r1 <- randR
    r2 <- randR

Now, behind the scenes, bind will take care of grabbing the returned generator from the first call and it using as the initial generator for the second call. A good exercise is to de-sugar the do notation here to bind calls and think about how the bind function above does what we want it to.

Last thing we must do is return the pair of random numbers.

randTwo :: RandState (Int, Int)
randTwo = do
    r1 <- randR
    r2 <- randR
    (r1, r2)

Whoops, this doesn’t typecheck. We’re returning (Int, Int) but we need to return RandState (Int, Int). How do wrap a vanilla value in a monad? The return function:

randTwo :: RandState (Int, Int)
randTwo = do
    r1 <- randR
    r2 <- randR
    return (r1, r2)

There we go.

Most of your code should be similarly simple. You should only explicitly handle generators in randR, shuffleNTimes, and rollTwoDiceNTimes.

To Hand In

Your lab5/ directory should have Lab5.hs, as well as the un-modified starter files RandState.hs and State.hs. Be sure to delete StateExamples.hs.

You do not need to write tests for this week’s lab.

As usual, use Git to submit, and check Gitlab to confirm that your changes are there. Whatever you have pushed at the deadline is your submission.

Grading Breakdown

10% Lab5 compiles. ($ ghc Lab5.hs does not error.)
10% Clean code.
- Parentheses in the correct place: sqrt (5 * 2) not sqrt(5 * 2).
- Type definitions on all top-level functions.
- Understandable names. Naming is one of the two hard problems in computer science. Short names are fine for common uses: i for index, x:xs for generic array elements, n for generic number, c for counter, etc. Abbreviations are fine too, but the rule is a reader should know the meaning of the thing just by reading its name.
- Type definitions on all top-level functions.
- No commented-out code.
- Avoid too many long lines (>100 characters). A couple long lines are okay.
- Parentheses in the correct place: sqrt (5 * 2) not sqrt(5 * 2).
- Avoid unnecessary code, e.g. pattern match cases that could be combined, or deconstructing a value and then just putting it back together.
- No long, hard-to-follow, or deeply nested functions that should be broken up. Take advantage of where and let.
- Comments explaining any hard-to-understand logic, e.g. what does this helper function do?
- Tasteful use of whitespace for vertical alignment.
- η-reduction is not necessary, but can be tasteful.
10% newStdGen only used once.
10% Code properly monadic: runRandom or equivalent only used at the monad boundary between IO and RandState (e.g. in shuffleNTimes and rollTwoDiceNTimes). Similarly, the the bind function of RandState is used to thread the generator through (except perhaps in shuffleNTimes and rollTwoDiceNTimes). get and put used only once.
10% $ ./Lab5 shuffle N outputs N results (Your lab should be able to output 10,000 shuffles in less than 30 seconds.)
05% $ ./Lab5 shuffle N outputs one result per line
15% $ ./Lab5 shuffle N is properly random
10% $ ./Lab5 rollTwoDice N outputs N results (Your lab should be able to output 10,000 rolls in less than 30 seconds.)
05% $ ./Lab5 rollTwoDice N outputs one result per line
15% $ ./Lab5 rollTwoDice N is properly random

Extra Credit: Approximating Pi (8%)

We will shift gears a bit in this final part of the lab to show another application of randomness. Randomness can be a powerful tool if we want to work out approximate solutions to problems. There is a general class of computational methods called Monte-Carlo methods that are widely used.

One simple example of a problem that can be solved with a Monte-Carlo method is calculating the approximate value of π.

Consider a circle inscribed in a square. If the circle has radius $r = 1$, then the side length $l$ of the square is two, and the the ratio of the area of the circle to area of the square is $\frac{\pi r^2}{l^2} = \frac{\pi 1^2}{2^2} = \frac{\pi}{4}$. If we then choose points at random in the square (i.e. with x and y values in the interval $[-1, 1]$), then we can approximate π by checking which proportion fall within the circle. The more points are randomly chosen, the closer the ratio will converge to π/4. You can easily check to see if the point is within the circle—less than 1 unit away from (0,0).

Here is a template for the pi approximation code:

-- Succeed if randomly chosen point from square is inside circumscribed circle
piTrial :: RandState Bool

-- Perform n trials of the RandState function provided as the second argument,
--  and give back the number of successful trials
-- Hint: To perform the n trials, you can either use sequence from
--       Control.Monad, or you can use recursion
bernoulliTrials :: Int -> RandState Bool -> RandState Int

-- Approximate pi using n randomly chosen points
-- Hint: You will probably need to use the fromIntegral function to
--       convert Int into Double.
approxPi :: Int -> RandState Double

Make a new file ApproxPi.hs that runs on the command line, taking the number of number of iterations as an argument, and outputs an approximation of π.

$ ./ApproxPi 10000
3.1588