Modelling the recovery process using probability

Recall code from last time for $I_{n+1} = (1 + c) I_n = \lambda * I_n$:

In [3]:
function run_infection(I_0, λ, T=20)  # T=20 is default value

    I = zeros(T)
    I[1] = I_0

    for n in 1:T-1
        I[n+1] = λ * I[n]  
    end

    return I
end
Out[3]:
run_infection (generic function with 3 methods)
In [6]:
run_infection(1.0, 0.1);

Adding new data to extend a vector

In [7]:
v = [1.0]
Out[7]:
1-element Array{Float64,1}:
 1.0
In [8]:
push!(v, 7.0)  # ! means: function `push!` modifies its first argument
Out[8]:
2-element Array{Float64,1}:
 1.0
 7.0
In [10]:
length(v)
Out[10]:
2
In [14]:
a = 1

v[3]
BoundsError: attempt to access 2-element Array{Float64,1} at index [3]

Stacktrace:
 [1] getindex(::Array{Float64,1}, ::Int64) at ./array.jl:787
 [2] top-level scope at In[14]:2
In [15]:
v
Out[15]:
2-element Array{Float64,1}:
 1.0
 7.0
In [16]:
[v; 10]
Out[16]:
3-element Array{Float64,1}:
  1.0
  7.0
 10.0
In [17]:
v
Out[17]:
2-element Array{Float64,1}:
 1.0
 7.0
In [18]:
v2 = [v; 10]
Out[18]:
3-element Array{Float64,1}:
  1.0
  7.0
 10.0
In [19]:
v2 = copy(v)
Out[19]:
3-element Array{Float64,1}:
  1.0
  7.0
 10.0
In [22]:
function run_infection(I_0, λ, T=20)  # T=20 is default value

    Is = [I_0]
    
    I = I_0   # current value of I

    for n in 1:T-1
        I_next = λ * I
        
        push!(Is, I_next)
    end

    return Is
end
Out[22]:
run_infection (generic function with 3 methods)
In [23]:
run_infection(1.0, 1.1)
Out[23]:
20-element Array{Float64,1}:
 1.0
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
 1.1
In [25]:
function run_infection(I_0, λ, T=20)  # T=20 is default value

    Is = [I_0]
    I = I_0   # current value of I

    for n in 1:T-1
        I_next = λ * I
        
        push!(Is, I_next)
        
        I = I_next
    end
    
    return Is
end
Out[25]:
run_infection (generic function with 3 methods)
In [26]:
run_infection(1.0, 1.1)
Out[26]:
20-element Array{Float64,1}:
 1.0
 1.1
 1.2100000000000002
 1.3310000000000004
 1.4641000000000006
 1.6105100000000008
 1.771561000000001
 1.9487171000000014
 2.1435888100000016
 2.357947691000002
 2.5937424601000023
 2.853116706110003
 3.1384283767210035
 3.4522712143931042
 3.797498335832415
 4.177248169415656
 4.594972986357222
 5.054470284992944
 5.559917313492239
 6.115909044841463
In [27]:
v
Out[27]:
3-element Array{Float64,1}:
  1.0
  7.0
 10.0
In [28]:
pushfirst!(v, 8)   # deque - double-ended queue
Out[28]:
4-element Array{Float64,1}:
  8.0
  1.0
  7.0
 10.0

Randomness

In [29]:
r = rand()
Out[29]:
0.6254465754499758
In [30]:
r = rand()
Out[30]:
0.9251027927183173
In [31]:
using Random
In [33]:
Random.seed!(3);
In [34]:
r = rand()
Out[34]:
0.8116984049958615
In [35]:
r = rand()
Out[35]:
0.9884323655013432
In [36]:
Random.seed!(3);  # mechanism to get repeatable sequences of random numbers
In [37]:
r = rand()
Out[37]:
0.8116984049958615
In [38]:
r = rand()
Out[38]:
0.9884323655013432
In [39]:
rand(10)
Out[39]:
10-element Array{Float64,1}:
 0.8076220876500786
 0.9700908450487538
 0.14006111319509862
 0.5094438024440222
 0.05869740597593154
 0.004257960600515309
 0.9746379934512355
 0.5572251384524507
 0.4644219211852372
 0.2875090558291695
In [40]:
using Plots
In [44]:
r = rand(50);
In [45]:
scatter(r)
Out[45]:
0 10 20 30 40 50 0.2 0.4 0.6 0.8 1.0 y1
In [47]:
one(0.5)
Out[47]:
1.0
In [ ]:

In [48]:
scatter(r, 0.5 .* one.(r), ylim=(0, 1))
Out[48]:
0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 y1
In [52]:
using Interact

num_points = 100
r = rand(num_points)

@manipulate for n in 1:num_points
    scatter(r[1:n], 0.5 .* one.(r[1:n]), ylim=(0, 1), xlim=(0, 1))
end
Out[52]:

Event with probability $p$

In [71]:
p = 0.25
r = rand()

if r < p
    true
else
    false
end
Out[71]:
false

"Bernoulli trial"

In [72]:
function bernoulli(p)

    r = rand()

    if r < p
        return true
    else
        return false
    end
end
Out[72]:
bernoulli (generic function with 1 method)
In [73]:
r < p ? true : false   # ternary operator
Out[73]:
false
In [74]:
r < p
Out[74]:
false
In [75]:
function bernoulli(p)

    r = rand()

    return r < p
end
Out[75]:
bernoulli (generic function with 1 method)
In [80]:
p = 0.25
trials = [bernoulli(p) for i in 1:100];
In [81]:
scatter(trials)
Out[81]:
0 25 50 75 100 0.00 0.25 0.50 0.75 1.00 y1
In [84]:
count(trials)
Out[84]:
25
In [86]:
trials = [bernoulli(p) for i in 1:100];
count(trials)
Out[86]:
29
In [99]:
trials = [bernoulli(p) for i in 1:100];
count(trials)
Out[99]:
30
In [106]:
function bernoulli_experiment(p, N=100)
    trials = [bernoulli(p) for i in 1:N];
    return count(trials)
end
Out[106]:
bernoulli_experiment (generic function with 2 methods)
In [103]:
count(trials .== false) + count(trials)
Out[103]:
100
In [104]:
count(.!(trials))  # ! is not
Out[104]:
70
In [107]:
bernoulli_experiment(0.25)
Out[107]:
29
In [108]:
bernoulli_experiment(0.25)
Out[108]:
33

A function which has different outcomes in different runs is called a random variable

Monte Carlo simulation

Run the same random process a lot of times and look at the results

In [113]:
p = 0.25
N = 20   # num of trials
num_expts = 100 

results = [bernoulli_experiment(p, N) for i in 1:num_expts]
Out[113]:
100-element Array{Int64,1}:
 3
 5
 7
 6
 4
 3
 4
 4
 6
 5
 7
 1
 3
 ⋮
 7
 7
 8
 5
 5
 4
 2
 5
 6
 4
 4
 4
In [115]:
num_expts = 1000
results = [bernoulli_experiment(p, N) for i in 1:num_expts]
scatter(results)
Out[115]:
0 250 500 750 1000 0.0 2.5 5.0 7.5 10.0 12.5 y1

$X$ = number of heads

Probability distribution of a random variable $X$

$\text{Prob}(X = x)$

Probability that $X = x$ = proportion of time that the result was $x$

Counting

Need to count how many of the trials come out as 5

Need data structure where we store the counts:

  • Dict (dictionary)
  • Vector
In [118]:
results
Out[118]:
1000-element Array{Int64,1}:
 7
 6
 3
 3
 6
 8
 5
 4
 1
 4
 6
 7
 9
 ⋮
 7
 7
 3
 3
 5
 9
 4
 6
 3
 7
 7
 0
In [119]:
maximum(results)   # not `max`:   max(1, 2, 3)
Out[119]:
14
In [120]:
minimum(results)
Out[120]:
0
In [121]:
l = maximum(results) + 1  # +1 is to store 0
Out[121]:
15
In [123]:
counts = zeros(l);
In [124]:
for score in results[1:10]   # for i in 1:length(results)
    @show score
end
score = 7
score = 6
score = 3
score = 3
score = 6
score = 8
score = 5
score = 4
score = 1
score = 4
In [125]:
for score in results
    counts[score + 1] += 1   # increment by 1
end
In [126]:
counts
Out[126]:
15-element Array{Float64,1}:
   4.0
  15.0
  81.0
 150.0
 172.0
 210.0
 164.0
 112.0
  53.0
  26.0
   9.0
   1.0
   2.0
   0.0
   1.0
In [128]:
counts = zeros(Int, maximum(results) + 1)

for score in results
    counts[score + 1] += 1   # increment by 1
end
In [129]:
counts
Out[129]:
15-element Array{Int64,1}:
   4
  15
  81
 150
 172
 210
 164
 112
  53
  26
   9
   1
   2
   0
   1
In [136]:
]add LaTeXStrings
   Updating registry at `~/.julia/registries/General`
    
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
    Fetching: [===========[1mFetching: [========================================>]  99.9 %>                    ]  48.1 % [=========================>               ]  61.5 % %>       ]  80.9 %======================================>  ]  94.3 %Fetching: [=======================================> ]  96.7 %
  Resolving package versions...
  Installed OpenBLAS_jll ─ v0.3.9+1
  Installed libass_jll ─── v0.14.0+1
  Installed XSLT_jll ───── v1.1.33+1
  Installed Arpack_jll ─── v3.5.0+3
######################################################################### 100.0%##O=#  #                                                                       
######################################################################### 100.0%##O=#  #                                                                       
######################################################################### 100.0%##O=#  #                                                                       
######################################################################### 100.0%##O=#  #                                                                       
   Updating `~/.julia/environments/v1.4/Project.toml`
  [b964fa9f] + LaTeXStrings v1.1.0
   Updating `~/.julia/environments/v1.4/Manifest.toml`
  [68821587] ↑ Arpack_jll v3.5.0+2 ⇒ v3.5.0+3
  [4536629a] ↑ OpenBLAS_jll v0.3.9+0 ⇒ v0.3.9+1
  [aed1982a] ↑ XSLT_jll v1.1.33+0 ⇒ v1.1.33+1
  [0ac62f75] ↑ libass_jll v0.14.0+0 ⇒ v0.14.0+1
In [141]:
using LaTeXStrings

plot(0:maximum(results), counts, m=:o)
ylabel!(L"frequency of having $n$ heads")
xlabel!(L"n")
Out[141]:
0.0 2.5 5.0 7.5 10.0 12.5 0 50 100 150 200 y1
In [142]:
function count_them(results)

    counts = zeros(Int, maximum(results) + 1)

    for score in results
        counts[score + 1] += 1   # increment by 1
    end
    
    return counts
end
Out[142]:
count_them (generic function with 1 method)
In [143]:
function run_experiments(p, N, num_expts=1000)
    results = [bernoulli_experiment(p, N) for i in 1:num_expts]
end
Out[143]:
run_experiments (generic function with 2 methods)
In [146]:
data = count_them(run_experiments(0.25, 20, 10000))
plot(0:length(data)-1, data, m=:o)
Out[146]:
0.0 2.5 5.0 7.5 10.0 0 500 1000 1500 2000 y1

Probability distribution

Probability = relative frequency

Divide by the number of experiments:

In [149]:
data = count_them(run_experiments(0.25, 20, 10^5))

plot(0:length(data)-1, data ./ sum(data), m=:o)
Out[149]:
0 5 10 15 0.00 0.05 0.10 0.15 0.20 y1
In [150]:
@time data = count_them(run_experiments(0.25, 20, 10^5))
  0.022058 seconds (100.00 k allocations: 11.444 MiB)
Out[150]:
17-element Array{Int64,1}:
   301
  2144
  6744
 13534
 18964
 20304
 16759
 11255
  5977
  2614
  1013
   294
    80
    13
     3
     0
     1
In [151]:
@time data = count_them(run_experiments(0.25, 20, 10^6))
  0.243592 seconds (1.00 M allocations: 114.441 MiB, 7.54% gc time)
Out[151]:
16-element Array{Int64,1}:
   3149
  21130
  67020
 134452
 189680
 201840
 168248
 112460
  60809
  27260
  10008
   3027
    720
    170
     26
      1
In [152]:
@time data = count_them(run_experiments(0.25, 20, 10^7))
  2.596961 seconds (10.00 M allocations: 1.118 GiB, 14.38% gc time)
Out[152]:
17-element Array{Int64,1}:
   31698
  211771
  669302
 1339528
 1896565
 2022336
 1686423
 1124561
  608698
  271036
   98697
   30129
    7440
    1530
     253
      31
       2
In [153]:
num_expts = 10^5
data = count_them(run_experiments(0.25, 20, num_expts))

probs = data ./ num_expts
Out[153]:
15-element Array{Float64,1}:
 0.00329
 0.02179
 0.06524
 0.13388
 0.19079
 0.20136
 0.16894
 0.11272
 0.06134
 0.0273
 0.00957
 0.00301
 0.0006
 0.00015
 2.0e-5
In [154]:
sum(probs)
Out[154]:
0.9999999999999999
In [155]:
x = 3 // 4
Out[155]:
3//4
In [156]:
typeof(x)
Out[156]:
Rational{Int64}
In [157]:
x + x
Out[157]:
3//2
In [158]:
probs = data .// num_expts
Out[158]:
15-element Array{Rational{Int64},1}:
   329//100000
  2179//100000
  1631//25000
  3347//25000
 19079//100000
  2517//12500
  8447//50000
  1409//12500
  3067//50000
   273//10000
   957//100000
   301//100000
     3//5000
     3//20000
     1//50000
In [159]:
sum(probs)
Out[159]:
1//1
In [160]:
data
Out[160]:
15-element Array{Int64,1}:
   329
  2179
  6524
 13388
 19079
 20136
 16894
 11272
  6134
  2730
   957
   301
    60
    15
     2
In [161]:
mean(data)
UndefVarError: mean not defined

Stacktrace:
 [1] top-level scope at In[161]:1
In [162]:
using Statistics
In [167]:
results = run_experiments(0.25, 20, 10^5);
In [168]:
mean(results)  # N=20 trials, p=0.25
Out[168]:
5.00301

Expected mean value = N * p

In [ ]: