Lab 5: Using Randomness
Randomness
A common requirement for computer programs is that there should be some randomness in the behaviour of the program. For this you need a random number generator, which can generate an unlimited sequence of random numbers. Typically this these numbers are not truly random, but pseudo-random.
A pseudo-random number generator generates a sequence of numbers based on an initial state \(s\) using a function \(f\). The states are \(s_1 = f(s)\), \(s_2 = f(f(s))\), \(s_3 = f(f(f(s)))\), etc. Each random number is then derived from each of these intermediate states (typically by taking a small proportion of the 0’s and 1’s from the state).
The \(f\) function has to be chosen so that subsequent states have no obvious correlation, so for most intents and purposes the numbers generated are random, but the sequence of numbers is in fact entirely determined by the initial state \(s\).
In many programming languages, the random number generator state is hidden from the programmer, but because of the purity of Haskell, the random number generator state must explicitly be passed through your program.
The following ghci
session shows how the function random from the System.Random
module can be used.
If you only installed Haskell Core Platform, you will need to download the module:
$ cabal install random
Pop into GHCi.
$ ghci
Prelude> import System.Random
Prelude System.Random> :t random
random :: (Random a, RandomGen g) => g -> (a, g)
Prelude System.Random> g <- newStdGen -- create new random number generator
Prelude System.Random> let (x, g') = random g :: (Int, StdGen)
We had to specify the type above so random knows what type to generate.
Prelude System.Random> x
-4940845671478547204
Prelude System.Random> let (x, _) = random g :: (Int, StdGen)
Prelude System.Random> x
-4940845671478547204
Oops, reusing g
gives us the same number again!
Prelude System.Random> let (x, g'') = random g' :: (Int, StdGen)
Prelude System.Random> x
-7259815437984538120
Much better! Using g'
gives us the next number.
The State Monad Overview
The State
Monad is a fairly general monad that lets you maintain state between functions without having to explicitly pass the state from one function to the next.
This is particularly useful when using random numbers in many parts of a Haskell program, as otherwise we would have to explicitly pass the random number generator in and out of every function that needed it.
There is a standard implementation of the State
monad in a package called Control.Monad.State
. However, that implementation is rather complex so we have provided a simple version of the State
monad in the provided file State.hs
.
State
is a Monad, so you can use it with do
notation. An instance of the state monad is a function of type s -> (a, s)
, where s
is the type of the state, and a
is the “real” return type of the function. The way State
is defined, the state portion is implicitly passed between functions unless explicitly modified. The functions get
and put
play a special role, as they allow you to fetch and update the state. The below example illustrates how these work in a do
block.
import State
type StringState a = State String a
testState :: StringState Int
testState = do
initState <- get -- Pull out current state
let newState = "Hello " ++ initState
put newState -- Replace old state
return 1 -- this will be the value that is returned along with state
-- testState is then a value of type StringState Int.
We can’t just invoke this testState
definition—it’s actually a wrapped function. (We will expound on this later.)
$ ghci StringState.hs State.hs
Ok, modules loaded: State, Main.
*Main> :t testState
testState :: StringState Int
*Main> testState
:1:0:
No instance for (Show (StringState Int))
arising from a use of `print' at :1:0-8
Possible fix:
add an instance declaration for (Show (StringState Int))
In a stmt of an interactive GHCi command: print it
Instead, we can use runState
to pull the function out of State
, and then provide an initial value for the state to kick it off.
*Main> :t runState testState
runState testState :: String -> (Int, String)
*Main> runState testState "bob"
(1,"Hello bob")
Notice how we effectively get two return values from running the state monad: the actual returned value, and the final state. Make sure that you understand how the result above was obtained.
“Hiding” the State
The get
and put
examples above are important to understand, but recall that the lecture notes call the State
monad the “elephant in the room” monad: useful for when something is so pervasive that it’s cleaner to hide it away.
Your code in today’s lab will actually only use get
and put
in a few places (in only one place, in fact). Most of your code will rely on the bind
(>>=
) function to hide the transmission of state between the parts of your program.
Lab Tasks
Pull your repository (git pull
). There are several files. You will only need to modify Lab5.hs
Familiarize yourself with the definition of the function rand
in RandState.hs
. The function rand
generates numbers in a huge range (what is it?).
You can test this function from ghci
by creating a new random number generator, and then using the runRandom
function. E.g.:
[~/labRand]$ ghci RandState.hs State.hs
Ok, modules loaded: RandState, State.
*RandState> gen <- newStdGen -- Create new random number generator
*RandState> runRandom rand gen :: Int
1256390094846286344
*RandState> gen <- newStdGen
*RandState> runRandom rand gen :: Double
0.5556050250436191
See RandState.hs
to see how runRandom
is implemented.
In Lab5.hs implement a new function randR :: Random a => (a, a) -> RandState a
that is like rand, but generates numbers in a specified range. Use the function randomR :: RandomGen g => (a, a) -> g -> (a, g)
in System.Random
. You can copy and paste the body of rand
from RandState.hs
as a starting point.
Using randR
, implement a function rollTwoDice
which generates a random sum of two dice (so between 2 and 12, but be sure to roll the virtual dice separately to get the right distribution i.e. 7 is the most common number). Choose the appropriate type signature.
Shuffling Cards
In this section we will write code to randomly shuffle a standard deck of cards. You should see this in your starter file:
-- Data types to represent playing cards
data CardValue
= King
| Queen
| Jack
| NumberCard Int -- From 1 to 10
deriving (Show, Eq)
data CardSuit
= Hearts
| Diamonds
| Spades
| Clubs
deriving (Show, Eq)
data PlayingCard =
PlayingCard CardValue CardSuit
deriving (Eq)
type Deck = [PlayingCard]
-- fullCardDeck is a deck of cards, 52 in total, with a King, a Queen,
-- a Jack and NumberCards from 1 to 10 for each suit.
fullCardDeck :: Deck
fullCardDeck =
[ PlayingCard v s | v <- allVals, s <- allSuits ]
where
allVals = King : Queen : Jack : [ NumberCard i | i <- [1..10] ]
allSuits = [Hearts, Diamonds, Spades, Clubs]
Implement a function removeCard
, which will be monadic. Given a list of PlayingCard
s, it should pick a random card out of the list, return the card and the list with the card removed (as a tuple potentially). This is a basic function that can help you shuffle a deck. A common interview question is to implement deck shuffling. A deck shuffling function should induce a uniform probability distribution over all permutations: i.e. all orderings of the cards should be equally likely. You will want to write this recursively, handling the base case for an empty deck, and then using an inductive step: using removeCard
and then recursing.
Implement a function shuffleDeck
which takes a list of playing cards, [PlayingCard]
. Test that it returns random permutations. To write this, start by considering the case with one card. Then consider the case with two cards. Then consider the case with three cards. Try to see if you can implement the three card case by using the two card function. Test the uniformity of shuffles by looking at small decks and checking statistics.
Implement a function shuffleADeck
which generates and outputs a shuffle of full deck of 52 cards. Test that it outputs that many cards.
Putting it together
At the bottom of the starter code you will see a block of commented out code. Uncomment the code and then implement shuffleNTimes
and rollTwoDiceNTimes
so that your final program can be run from the command line like this:
$ ./Lab5 shuffle 3
[6♠,4♠,7♠,5♣,2♣,K♣,K♦,K♠,Q♥,9♥,5♥,9♣,7♣,4♦,8♣,4♣,6♦,5♠,5♦,10♠,10♦,8♦,10♣,7♥,1♦,J♥,6♣,2♥,J♦,7♦,Q♦,8♠,9♠,3♣,1♥,J♣,9♦,3♥,4♥,2♦,Q♠,1♠,K♥,2♠,Q♣,J♠,3♠,8♥,1♣,10♥,3♦,6♥]
[J♣,1♠,3♦,3♠,6♣,5♣,6♥,1♣,K♥,9♦,J♦,10♦,6♠,J♠,5♦,8♠,J♥,7♠,3♥,9♠,2♥,4♦,8♣,4♠,K♣,1♦,K♠,Q♦,2♠,4♥,10♥,10♠,6♦,9♣,10♣,3♣,Q♠,2♣,7♦,5♠,K♦,7♥,Q♣,1♥,7♣,4♣,9♥,8♦,8♥,5♥,Q♥,2♦]
[Q♦,3♣,2♥,9♠,Q♣,5♦,7♦,5♠,6♠,2♣,5♥,9♥,6♦,6♣,1♠,6♥,10♦,1♦,3♦,2♠,5♣,7♣,3♠,9♣,1♣,4♠,Q♥,8♦,Q♠,7♠,10♠,8♠,4♦,J♥,K♥,8♣,4♣,J♦,J♠,K♠,2♦,9♦,K♦,7♥,10♣,K♣,8♥,1♥,J♣,10♥,3♥,4♥]
$ ./Lab5 rollTwoDice 5
6
6
4
8
7
Make sure your lab behaves this way—it is what you will be graded on. Make sure each deck shuffle list appears on its own line exactly as the above.
Note that each iteration above should start with the random number generator produced from the last iteration. For deck shuffling, each iteration should shuffle an ordered deck (not the deck from the previous iteration).
Constraints on Your Submission
The goal of the lab is to use the State
monad to “automatically” thread the random number generators through your code. So, pretty much every function that uses random numbers should be part of the State
monad (i.e. the function returns some kind of RandState
). In the State
monad, the bind
(>>=
) function will handling transmitting the state behind the scenes.
To figure out if a function needs to be in the RandState
monad, ask yourself, “If I call this function twice in sequence, do I want different results the second time?” If yes, the function needs to be monadic so the generator is properly transmitted from one call to the next.
Also, the function itself should be used in a monadic context so calls to it are properly glued together with bind
(>>=
).
Thus, most of your code should be in the RandState
monad. And it should rely on the monad so you do as little explicit management of the state as possible.
Therefore:
Your lab should only call
newStdGen
once. (And it’s already in the starter code.)Your code should only use
get
andput
once each.Your code should only call
runState
(orevalState
orexecState
orrunRandom
*) twice: once inshuffleNTimes
and once inrollTwoDiceNTimes
(or once in a helper of each function).runRandom
is the interface between theIO
monad and theRandState
monad. Look at the providedStateExamples.hs
file for a good example of using this interface.
If you find yourself using runState
or runRandom
in every function, or are unpacking the generator in every random function, you are defeating the purpose of the State
monad and will lose points. Handling the transmission of state is bind
’s job. See the tutorial in the next section.
*Instructor hint: in the shuffleNTimes
and rollTwoDiceNTimes
functions you will probably want runState
because runRandom
or evalState
discard the state (the generator), but you will want both the result and the new generator.
Where is the Generator?
What is the type of the provided rand
function?
rand = do
gen <- get
let (x, gen') = random gen
put gen'
return x
Well, we need to somehow get an initial generator into this function, so maybe one of the parameters is a generator.
But as written, this function has no named arguments. And, if you look at the provided code, the type annotation is essentially:
rand :: RandState Int
This function only returns something? How does a generator get into this function? Why isn’t a generator one of the arguments? What’s going on?
The answer has to do with how Haskell represents state.
You might think that, to represent state, Haskell might have some datatype with some fields that could be modified as the program executed. However, Haskell adheres to the discipline of purity: nothing can change once declared.
Haskell takes a stranger route:
newtype State s a = State {
runState :: s -> (a,s)
}
Which is essentially:
newtype State s a = State (s -> (a,s))
So State
is just a wrapper for an s -> (a,s)
function—a function that takes an initial state and returns a value and a new state.
Haskell doesn’t really represent state at all—it represents transformations on state.
Back to our rand
function, we said its type signature was essentially
rand :: RandState Int
From RandState.hs
we see type RandState a = State StdGen a
, so we can de-sugar this:
rand :: State StdGen Int
So the rand
function will return some value that looks like:
State (\gen -> ...some code that returns (int, gen') )
Aha, there’s the missing generator! That’s why the generator is not an argument to the function: the rand
function isn’t actually doing any computation, it’s just building a state transforming function. rand
returns a wrapped function that is waiting for an initial generator.
So how does all this:
rand :: RandState Int
rand = do
gen <- get
let (x, gen') = random gen
put gen'
return x
return a wrapped s -> (a,s)
function?
Remember, do
notation is de-sugared to calls to bind
(>>=
). All monad magic is in bind
. The State
monad’s bind
definition is hard to follow, but you should at least convince yourself that it is “secretly” handling the state for you, setting up a function that sequences the actions:
ma >>= f = State $ \s ->
let (a,t) = runState ma s
mb = f a
(b,u) = runState mb t
in (b,u)
and you should notice that, indeed, it returns a value of the form State (s -> (a, s))
.
So bind
composes our State (s -> (a,s))
actions into more complicated State (s -> (a, s))
actions. The basic returned function is always s -> (a,s)
, a function waiting for that initial state before it will actually do all the computation.
So how might be use all this? How would we write, say, a function that returns two random numbers in a tuple?
randTwo = ...
The first question is: what is the type signature?
If we write the function in sequence, should it return different values? It’s a random number function, so, yes, it should. This means it needs different generators each time which means it should be monadic. We’ll let the RandState
monad hide the transmission of the generators.
randTwo :: ... -> RandState (...)
randTwo = ...
What should be the return value in the monad? Well, just pair of numbers.
randTwo :: ... -> RandState (Int, Int)
randTwo = ...
Do we need to pass any arguments to randTwo
? Well, the initial generator maybe, but wait…no. We’re returning RandState (Int, Int)
which isState StdGen (Int, Int)
which represents values that look likeState (\gen -> ...some code that returns ((Int, Int), gen) )
.
Our randTwo
will return a wrapped function that’s waiting for a generator. We don’t need an explicit generator in our type declaration—it’s part of RandState
. We’re just building up a computation to be performed later.
randTwo :: RandState (Int, Int)
randTwo = ...
Okay, what’s next? Well, we’re in the RandState
monad because we want our state transformations to “automatically” be composed properly by bind
. So this function should be monadic, let’s start by writing do
.
randTwo :: RandState (Int, Int)
randTwo = do
...
Now let’s get our random numbers.
randTwo :: RandState (Int, Int)
randTwo = do
let r1 = randR
let r2 = randR
Hmm. This seems not to work. We want an Int
each time but randR
is a RandState Int
. Maybe we need to use runRandom
to actually execute randR
’s state transformer.
randTwo :: RandState (Int, Int)
randTwo = do
let r1 = runRandom randR gen
let r2 = runRandom randR gen
Oh, now we need a generator. Maybe we can use newStdGen
, wait, no, that’s an IO
monadic function and we’re in a different kind of monad. Oh, maybe we can pull the generator out of the state!
randTwo :: RandState (Int, Int)
randTwo = do
gen <- get
let r1 = runRandom randR gen
let r2 = runRandom randR gen
Oh shoot. It’s using the same generator both times—we’re going to get the same numbers for r1
and r2
…hmmm….
STOP STOP STOP!
Too much work. We’re doing bind
’s job. The whole point of the monad was so that bind
could take care of sequencing the generators for us. Make the code fully monadic. Let the do
notation glue together our actions with bind
.
randTwo :: RandState (Int, Int)
randTwo = do
r1 <- randR
r2 <- randR
Now, behind the scenes, bind
will take care of grabbing the returned generator from the first call and it using as the initial generator for the second call. A good exercise is to de-sugar the do
notation here to bind
calls and think about how the bind
function above does what we want it to.
Last thing we must do is return the pair of random numbers.
randTwo :: RandState (Int, Int)
randTwo = do
r1 <- randR
r2 <- randR
(r1, r2)
Whoops, this doesn’t typecheck. We’re returning (Int, Int)
but we need to return RandState (Int, Int)
. How do wrap a vanilla value in a monad? The return
function:
randTwo :: RandState (Int, Int)
randTwo = do
r1 <- randR
r2 <- randR
return (r1, r2)
There we go.
Most of your code should be similarly simple. You should only explicitly handle generators in randR
, shuffleNTimes
, and rollTwoDiceNTimes
.
To Hand In
Your lab5/
directory should have Lab5.hs
, as well as the un-modified starter files RandState.hs
and State.hs
. Be sure to delete StateExamples.hs
.
You do not need to write tests for this week’s lab.
As usual, use Git to submit, and check Gitlab to confirm that your changes are there. Whatever you have pushed at the deadline is your submission.
Grading Breakdown
- 10%
Lab5
compiles. ($ ghc Lab5.hs
does not error.) - 10% Clean code.
- Parentheses in the correct place:
sqrt (5 * 2)
notsqrt(5 * 2)
. - Type definitions on all top-level functions.
- Understandable names. Naming is one of the two hard problems in computer science. Short names are fine for common uses:
i
for index,x:xs
for generic array elements,n
for generic number,c
for counter, etc. Abbreviations are fine too, but the rule is a reader should know the meaning of the thing just by reading its name. - Type definitions on all top-level functions.
- No commented-out code.
- Avoid too many long lines (>100 characters). A couple long lines are okay.
- Parentheses in the correct place:
sqrt (5 * 2)
notsqrt(5 * 2)
. - Avoid unnecessary code, e.g. pattern match cases that could be combined, or deconstructing a value and then just putting it back together.
- No long, hard-to-follow, or deeply nested functions that should be broken up. Take advantage of
where
andlet
. - Comments explaining any hard-to-understand logic, e.g. what does this helper function do?
- Tasteful use of whitespace for vertical alignment.
- η-reduction is not necessary, but can be tasteful.
- Parentheses in the correct place:
- 10%
newStdGen
only used once. - 10% Code properly monadic:
runRandom
or equivalent only used at the monad boundary betweenIO
andRandState
(e.g. inshuffleNTimes
androllTwoDiceNTimes
). Similarly, the the bind function ofRandState
is used to thread the generator through (except perhaps inshuffleNTimes
androllTwoDiceNTimes
).get
andput
used only once. - 10%
$ ./Lab5 shuffle N
outputs N results (Your lab should be able to output 10,000 shuffles in less than 30 seconds.) - 05%
$ ./Lab5 shuffle N
outputs one result per line - 15%
$ ./Lab5 shuffle N
is properly random - 10%
$ ./Lab5 rollTwoDice N
outputs N results (Your lab should be able to output 10,000 rolls in less than 30 seconds.) - 05%
$ ./Lab5 rollTwoDice N
outputs one result per line - 15%
$ ./Lab5 rollTwoDice N
is properly random
Extra Credit: Approximating Pi (8%)
We will shift gears a bit in this final part of the lab to show another application of randomness. Randomness can be a powerful tool if we want to work out approximate solutions to problems. There is a general class of computational methods called Monte-Carlo methods that are widely used.
One simple example of a problem that can be solved with a Monte-Carlo method is calculating the approximate value of π.
Consider a circle inscribed in a square. If the circle has radius \(r = 1\), then the side length \(l\) of the square is two, and the the ratio of the area of the circle to area of the square is \(\frac{\pi r^2}{l^2} = \frac{\pi 1^2}{2^2} = \frac{\pi}{4}\). If we then choose points at random in the square (i.e. with x and y values in the interval \([-1, 1]\)), then we can approximate π by checking which proportion fall within the circle. The more points are randomly chosen, the closer the ratio will converge to π/4. You can easily check to see if the point is within the circle—less than 1 unit away from (0,0).
Here is a template for the pi approximation code:
-- Succeed if randomly chosen point from square is inside circumscribed circle
piTrial :: RandState Bool
-- Perform n trials of the RandState function provided as the second argument,
-- and give back the number of successful trials
-- Hint: To perform the n trials, you can either use sequence from
-- Control.Monad, or you can use recursion
bernoulliTrials :: Int -> RandState Bool -> RandState Int
-- Approximate pi using n randomly chosen points
-- Hint: You will probably need to use the fromIntegral function to
-- convert Int into Double.
approxPi :: Int -> RandState Double
Make a new file ApproxPi.hs
that runs on the command line, taking the number of number of iterations as an argument, and outputs an approximation of π.
$ ./ApproxPi 10000
3.1588