A combination of pursuing a Master's in Data Science combined with an awesome trip to Open FSharp in San Francisco convinced me to contribute to this year's FSharp Advent Calendar; my contribution from last year can be found here. One of the workshops presented at Open FSharp entitled "Interactive Computing with F# Jupyter" by Colin Gravill covered making use of IFSharp Notebooks, the F# analog of Python's Jupyter notebooks, for computational experiments; this workshop truly blew my mind by highlighting the amazing capability of utilizing the power of F# in a "Notebook-esque" environment.
Probabilistic Programming has always been a topic that has caused me to break out in metaphorical hives. Therefore, I decided to pursue the selfish goal of trying to demystify this topic as much as possible by writing a blog post. I hope you enjoy it as much as I researching and developing this notebook!
Probabilistic Programming is a software methodology for constructing probabilistic models to be used to make probabilistic inferences.
Simple and succint domain driven modelling techniques in conjunction with the lucid mathematical expressibility of F# makes the language and the concomitant ecosystem a perfect environment to start developing probabilistic models. The two experiments I cover in this blog post that elucidate on these qualities and try to give a taste of Probabilistic Programming are the Monty Hall Problem and Approximate Bayesian Computation.
The Monty Hall Problem involves a contestant at a game show choosing one of three doors in an effort to choose the door that hides a prize. Once the player chooses a door, the game show host opens a door that doesn't have a prize behind it and gives the contestant a choice to switch or stay with the chosen door.
Intuitively, staying with the selected door would make sense since switching wouldn't make a difference to the end result. However, after the host opens a door without the prize behind it, there is more information available to the contestant to make a decision on and hence, this 50/50 chance induced by switching improves the overall favors of winning.
This problem has stumped even the most astute mathematicians such as Paul Erdős and so, seemed like a perfect problem to solve to demonstrate the power of domain driven modelling and constructing probabilistic models in F#. In this section, I plan to develop the Monty Hall game domain and then take a probabilistic approach of proving that switching doors after the reveal is more favorable than staying with the selected door.
Our first step is to implement some building blocks that will be used while developing the domain.
Probability is defined as the likelihood of a given event's occurrence that can be modelled as a double with valid values between 0 and 1 (both inclusive).
type Probability = private Probability of double
module Probability =
let create ( probability : double ) : Probability option =
if probability >= 0.0 && probability <= 1.0 then Some ( Probability probability )
else None
let getProbability ( Probability probability ) = probability
// Creating
let prob = Probability.create ( 1.0 / 2.0 )
// Getting the value
match prob with
| Some p -> Probability.getProbability p
| None -> nan
0.5
A Probability Distribution can be modelled as a sequence of pairs of event occurences and probabilities.
type Distribution<'T> = seq< 'T * Probability >
For the Monty Hall problem we will be making use of the Uniform Distribution as from the perspective of the contestant, the prize could be behind any of the doors available to be opened. The Uniform Distribution assumes an equal and constant probability of all events.
module UniformDistribution =
let private uniformRandomNumberGenerator = System.Random()
let create ( events : seq< int > ) : Distribution<int> =
seq {
let countOfSequence = Seq.length events
let distributionSequence =
events
|> Seq.map( fun e -> e, Probability.create ( 1.0 / double( countOfSequence )))
|> Seq.filter( fun ( e, p ) -> p.IsSome )
|> Seq.map( fun (e, p) -> e, p.Value )
yield! distributionSequence
}
let drawOneFromDistribution ( uniformDistribution : Distribution<int> ) : int * Probability =
let countOfDistribution =
Seq.length uniformDistribution
let idx = uniformRandomNumberGenerator.Next( 0, countOfDistribution )
uniformDistribution
|> Seq.item idx
let drawOneFromEvents ( events : seq< int > ) : int * Probability =
let distribution = create events
drawOneFromDistribution distribution
let diceRollDistribution = UniformDistribution.create [ 1..6 ]
printfn "%A" ( Seq.toList diceRollDistribution )
[(1, Probability 0.1666666667); (2, Probability 0.1666666667); (3, Probability 0.1666666667); (4, Probability 0.1666666667); (5, Probability 0.1666666667); (6, Probability 0.1666666667)]
printfn "%A" ( UniformDistribution.drawOneFromDistribution diceRollDistribution )
(3, Probability 0.1666666667)
The game domain consists of 3 main elements:
type Outcome = Win | Lose
type Door = A | B | C | NA
type GameState = { winningDoor : Door; chosenDoor : Door; openedDoor : Door }
The game states are the following:
let doors = [ A; B; C ]
let startGameState : GameState =
{ winningDoor = NA; chosenDoor = NA; openedDoor = NA }
let hidePrize ( state : GameState ) : GameState =
let winningDoorIdx = fst ( UniformDistribution.drawOneFromEvents ( [ 1..3 ] )) - 1
{ state with winningDoor = doors.[ winningDoorIdx ] }
hidePrize startGameState
{winningDoor = B; chosenDoor = NA; openedDoor = NA;}
The start of the game and hidding of the prize is naturally composed into the initializeGame.
let initializeGame : GameState =
hidePrize startGameState
initializeGame
{winningDoor = C; chosenDoor = NA; openedDoor = NA;}
let chooseDoor ( state : GameState ) ( door : Door ) : GameState =
{ state with chosenDoor = door }
let chooseRandomDoor ( state : GameState ) : GameState =
let chosenDoorIdx = fst ( UniformDistribution.drawOneFromEvents ( [ 1..3 ] )) - 1
{ state with chosenDoor = doors.[ chosenDoorIdx ] }
let chosenRandomDoor = chooseRandomDoor initializeGame
chosenRandomDoor
{winningDoor = C; chosenDoor = C; openedDoor = NA;}
let openDoor ( state : GameState ) : GameState =
// Choose the Non-Winning Door that hasn't been chosen by the contestant.
let doorToOpen =
doors
|> List.except [ state.winningDoor; state.chosenDoor ]
|> List.item 0
{ state with openedDoor = doorToOpen }
let postOpenDoor = openDoor ( chosenRandomDoor )
postOpenDoor
{winningDoor = C; chosenDoor = C; openedDoor = A;}
Next, we define a strategy once the door is opened by host. The two strategies are switching or staying with the chosen door.
type Strategy = GameState -> Outcome
let switch ( state : GameState ) : Outcome =
let doorToSwitchTo =
doors
|> List.except [ state.chosenDoor; state.openedDoor ]
|> List.item 0
if doorToSwitchTo = state.winningDoor then Win
else Lose
let stay ( state : GameState ) : Outcome =
if state.chosenDoor = state.winningDoor then Win
else Lose
printfn "On Switch: %A" ( switch postOpenDoor )
printfn "On Stay: %A" ( stay postOpenDoor )
On Switch: Lose On Stay: Win
Now that we have the domain model and all the game states and strategies well defined, we continue by running 10,000 simulations of the contestant strategy of switching doors or staying with the current door.
let simulationCount = 10000
let simulateMontyHall ( strategy : Strategy ) : Outcome =
let game =
initializeGame
|> chooseRandomDoor
|> openDoor
strategy( game )
simulateMontyHall switch
Win
Now that we have all the pieces in place to run the simulation, we proceed to pull in the XPlot library to help us plot our results via Paket.
#load "Paket.fsx"
Paket.Package([ "XPlot.Plotly" ])
#load "XPlot.Plotly.fsx"
#load "XPlot.Plotly.Paket.fsx"
open XPlot.Plotly
let generateDistributionOfStaying ( numberOfTrials : int ) : Outcome seq =
let mutable list = []
for i in 1 .. numberOfTrials do
list <- list @ [ simulateMontyHall stay ]
Seq.ofList list
let simulationOfStaying =
generateDistributionOfStaying simulationCount
let winCountOfStaying =
simulationOfStaying
|> Seq.filter( fun x -> x = Win )
|> Seq.length
let xAxisStaying = [ "Win"; "Loss" ]
let yAxisStaying = [ winCountOfStaying; ( Seq.length simulationOfStaying - winCountOfStaying )]
let stayingData = List.zip xAxisStaying yAxisStaying
let optionsStaying =
Layout( title = "Outcome Count of Staying" )
stayingData
|> Chart.Bar
|> Chart.WithOptions optionsStaying
|> Chart.WithHeight 500
|> Chart.WithWidth 700
let probabilityOfWinByStaying =
( double winCountOfStaying ) / ( double simulationCount )
printfn "Probability of Winning by Staying with Chosen Door: %A" probabilityOfWinByStaying
Probability of Winning by Staying with Chosen Door: 0.3302
let generateDistributionOfSwitching ( numberOfTrials : int ) : Outcome list =
let mutable list = []
for i in 1 .. numberOfTrials do
list <- list @ [ simulateMontyHall switch ]
list
let simulationOfSwitching =
generateDistributionOfSwitching simulationCount
let winCountOfSwitching =
simulationOfSwitching
|> Seq.filter( fun x -> x = Win )
|> Seq.length
let xAxisSwitching = [ "Win"; "Loss" ]
let yAxisSwitching = [ winCountOfSwitching; ( Seq.length simulationOfSwitching - winCountOfSwitching )]
let switchingData = List.zip xAxisSwitching yAxisSwitching
let optionsSwitching =
Layout( title = "Outcome Count of Switching" )
switchingData
|> Chart.Bar
|> Chart.WithOptions optionsSwitching
|> Chart.WithHeight 500
|> Chart.WithWidth 700
let probabilityOfWinBySwitching =
( double winCountOfSwitching ) / ( double simulationCount )
printfn "Probability of Winning by Switching with Chosen Door: %A" probabilityOfWinBySwitching
Probability of Winning by Switching with Chosen Door: 0.6665
Since the probability of winning by switching the chosen door is higher than that of staying after taking a frequentist approach and sampling the game 10,000 times, we can conclude that switching is best strategy!
This past summer I had the opportunity to attend QConn 2018 in New York and one of the best presentations I attended while I was there was one from Mike Lee Williams entitled "Probabilistic Programming from Scratch". A numerical method called Approximate Bayesian Computation was introduced during the presentation in the context of an A/B Test that highlighted the benefits of taking a Bayesian approach for statistical inference. After realizing how good of a candidate it would be for this blog post, I decided to take Mike's original implementation in Python and demonstrate how easy it is to convert to F#. The video and code covered during the presentation can be found here and here, respectively.
We begin exploring this numerical method by going through some fundamentals.
Bayes Rule, in a nutshell, provides us a way to update our belief about hypothesis 'H' in light of new evidence 'E'. From the formula above, our posterior belief is calculated by multiplying our prior belief by the likelihood and then normalizing the product. In other words, the crux of Bayesian Statistics involves inference based on the posterior distribution on the basis of transforming a prior distribution based on new data points available.
Approximate Bayesian Computation comes into the picture when the new evidence or likelihood to update our prior belief is not available or is infeasible to obtain. Approximate Bayesian Computation is therefore a numerical method that provides a means for generating an approximate posterior distribution on the basis of a prior distribution and a simulation technique.
For this section we'll be making use of FsLab, an awesome conglomerate of F# packages for Data Science specifically for pulling in a library called 'MathNet.Numerics' that has a lot of the distributions we'll make use of.
Paket.Package([ "FsLab" ])
#load "FsLab.fsx"
open MathNet.Numerics.Distributions
The application of Approximate Bayesian Computation described below is the same example introduced in Mike's presentation where we have a website that has two layouts 'A' and 'B' that are waiting to be deployed and we want to figure out which layout draws in more visitors to subscribe to the mailing list. The experiment involves, for each layout, observing the number of visitors and subscribers.
Our goal is to generate the posterior distributions of both Layout A and B. The process of generating the posterior distributions involves sampling points from a prior sampler to simulate an outcome and then compare the simulated outcome to the real outcome we obtained from our experimentation phase. If the simulated point matches the real outcome, that point can be used as a sample from the posterior distribution, if not, that point is rejected and we continue to simulate more points for a considerable number of steps (here, we choose 10,000 steps).
After we run our simulations from a large number of steps, we'll have generated a distributed composed of points that are most likely to represent the experimental observations we made earlier.
The components needed to successfully conduct the A/B Test are:
// Layout A
[<Literal>]
let numberOfVisitorsLayoutA = 100
[<Literal>]
let numberOfSubscribersA = 4
// Layout B
[<Literal>]
let numberOfVisitorsLayoutB = 40
[<Literal>]
let numberOfSubscribersB = 2
[<Literal>]
let numberOfSimulationSteps = 10000
For Layout A, we choose a Uniform Distribution and Normal Distribution for Layout B. The range of these samplers is between 0 and 1.
The underlying distributions were chosen to highlight a difference of prior beliefs of the layouts. For Layout A, the posterior distribution consists of continuous events of equal and constant probability of Layout A or in other words, any outcome is equally likely. Conversely for Layout B, we profess that we do have an opinion about the posterior distribution.
// For Layout A: Uniform Distribution
let uniformPriorSampler : seq< double > =
let continuousUniform = ContinuousUniform( 0.0, 1.0 )
seq { while true do yield continuousUniform.Sample() }
let priorSamplerLayoutA =
uniformPriorSampler
|> Seq.take numberOfSimulationSteps
priorSamplerLayoutA
seq [0.003821128515; 0.9902694891; 0.9856265886; 0.4180470744; ...]
// For Layout B: Normal Distribution
let normalPriorSampler ( mu : double )
( sigma : double ) : seq< double > =
let normalDistribution = Normal( mu, sigma )
seq { while true do
let sample = normalDistribution.Sample()
if sample >= 0.0 && sample <= 1.0 then yield sample
else () }
// We choose the mu and sigma to be some arbitarily alue.
// NOTE: these parameters can be tuned.
let priorSamplerLayoutB =
normalPriorSampler 0.05 0.01 // Expect a right skewed distribution wrt a uniform distribution
|> Seq.take numberOfSimulationSteps
priorSamplerLayoutB
seq [0.05255230341; 0.05534334498; 0.04536012053; 0.03649224388; ...]
let overlaidTrace1Prior =
Histogram(
x = priorSamplerLayoutA,
opacity = 0.75,
name = "Layout A"
)
let overlaidTrace2Prior =
Histogram(
x = priorSamplerLayoutB,
opacity = 0.75,
name = "Layout B"
)
let overlaidLayoutPrior =
Layout(
barmode = "overlay",
title = "Prior Distributions: A vs. B"
)
[overlaidTrace1Prior; overlaidTrace2Prior]
|> Chart.Plot
|> Chart.WithLayout overlaidLayoutPrior
|> Chart.WithWidth 700
|> Chart.WithHeight 500
Our simulation function uses samples from a random number generator based on the number of visitors of a particular layout using the prior sample as a threshold in an effort to get us the simulated number of subscribers.
open System
let randomNumberGenerator = Random()
let simulateSubscription ( priorSample : double )
( nVisitors : int ) : int =
[ 1..nVisitors ]
|> List.map( fun x -> randomNumberGenerator.NextDouble() )
|> List.filter( fun d -> d < priorSample )
|> List.sum
|> int
// Test the Simulated Subscriptions.
printfn "%A" ( simulateSubscription 0.01 numberOfSimulationSteps )
printfn "%A" ( simulateSubscription 0.02 numberOfSimulationSteps )
printfn "%A" ( simulateSubscription 0.03 numberOfSimulationSteps )
0 1 4
let simulate ( priorSample : double )
( numberOfVisitors : int ) : int =
simulateSubscription priorSample numberOfVisitors
let applySimulationLayoutA ( priorSample : double ) : int =
simulate priorSample numberOfVisitorsLayoutA
let applySimulationLayoutB ( priorSample : double ) : int =
simulate priorSample numberOfVisitorsLayoutB
Our posterior distribution is constructed by taking the observed number of subscriptions for a particular layout and running the simulation on a sample of the prior sampler. If there is match between the observed values and the simulated points, we add that prior sample to the sequence representing our posterior distribution.
let posteriorDistribution ( numberOfSubscriptions : int )
( priorSampler : seq<double> )
( simulate : double -> int ) : seq<double> =
seq {
for p in priorSampler do
if simulate p = numberOfSubscriptions then yield p
else ()
}
let posteriorASeq =
posteriorDistribution numberOfSubscribersA uniformPriorSampler applySimulationLayoutA
let posteriorLayoutA =
posteriorASeq
|> Seq.take numberOfSimulationSteps
Histogram( x = posteriorLayoutA )
|> Chart.Plot
|> Chart.WithTitle("Posterior Distribution: A")
|> Chart.WithWidth 700
|> Chart.WithHeight 500
let posteriorBSeq =
posteriorDistribution numberOfSubscribersB uniformPriorSampler applySimulationLayoutB
let posteriorLayoutB =
posteriorBSeq
|> Seq.take numberOfSimulationSteps
posteriorLayoutB
seq [0.3972724911; 0.3654516937; 0.313250709; 0.2779161801; ...]
Histogram( x = posteriorLayoutB )
|> Chart.Plot
|> Chart.WithTitle("Posterior Distribution: B")
|> Chart.WithWidth 700
|> Chart.WithHeight 500
let overlaidTrace1Posterior =
Histogram(
x = posteriorLayoutA,
opacity = 0.75,
name = "Layout A"
)
let overlaidTrace2Posterior =
Histogram(
x = posteriorLayoutB,
opacity = 0.75,
name = "Layout B"
)
let overlaidLayoutPosterior =
Layout(
barmode = "overlay",
title = "Posterior Distributions: A vs. B"
)
[overlaidTrace1Posterior; overlaidTrace2Posterior]
|> Chart.Plot
|> Chart.WithLayout overlaidLayoutPosterior
|> Chart.WithWidth 700
|> Chart.WithHeight 500
It is worth noting that since the posterior distribution of Layout B is further to the right compared to Layout A's posterior distribution indicating the Layout B is more favorable than Layout A.
Our next question is, based on our results, how much more favorable is Layout B compared to Layout A? We can take a bunch of different approaches here but it would be fairly intuitive to measure the fraction of points where Layout B had a higher value than Layout A.
let subscriptionFraction =
let countOfCasesWhereLayoutBIsPrefered =
posteriorLayoutA
|> Seq.zip posteriorLayoutB
|> Seq.filter( fun ( a, b ) -> b > a )
|> Seq.length
|> double
countOfCasesWhereLayoutBIsPrefered / ( double numberOfSimulationSteps )
subscriptionFraction
0.1747
From our result we are 17 - 18% confident that Layout B is better than Layout A.
Approximate Bayesian Computation is an extremely simple numerical method to implement to simulate values in the case the likelihood values aren't available. Mike's presentation was of tremendous help in furthering my understanding of Bayesian Inference using Approximate Bayesian Computation. Additionally, the power of F# to simply express computational procedures made it relatively simple to develop the A/B Testing example.