豊田『瀕死本』の5.4節について¶

黒木玄

2020-03-22, 2020-03-31

図6.2の再現については次のノートを参照せよ.

豊田『瀕死本』図6.2について

@time using Distributions
@time using StatsPlots
pal = palette(:default)
pyplot(
    titlefont = font("IPAPGothic", 11)
)
relax(t=0.2) = (backend() == Plots.PyPlotBackend() && PyPlot.clf(); sleep(t))

@time using Base64
displayfile(mime, file; tag="img") = open(file) do f
    base64 = base64encode(f)
    display("text/html", """<$(tag) src="data:$(mime);base64,$(base64)"/>""")
end

  5.724834 seconds (9.27 M allocations: 454.195 MiB, 4.72% gc time)
 19.298540 seconds (22.69 M allocations: 1.143 GiB, 4.23% gc time)
  0.103228 seconds (139.41 k allocations: 6.963 MiB)

Out[1]:

displayfile (generic function with 1 method)

P値に関する解説¶

https://twitter.com/genkuroki/status/1243022720637386752

In [2]:

displayfile("image/jpg", "pval1.jpg")

In [3]:

displayfile("image/jpg", "pval2.jpg")

第5.4節の図の再現¶

In [4]:

n = 1560
k = 829
@show n, k;

(n, k) = (1560, 829)

In [5]:

posterior(; n=n, k=k, a=1.0, b=1.0) = Beta(a+k, b+n-k)
pdf_posterior(p; n=n, k=k, a=1.0, b=1.0) = pdf(posterior(; n=n, k=k, a=a, b=b), p)
cdf_posterior(p; n=n, k=k, a=1.0, b=1.0) = cdf(posterior(; n=n, k=k, a=a, b=b), p)
ccdf_posterior(p; n=n, k=k, a=1.0, b=1.0) = ccdf(posterior(; n=n, k=k, a=a, b=b), p)
function phc_near_p₀(c; p₀=0.5, n=n, k=k, a=1.0, b=1.0)
    d = posterior(; n=n, k=k, a=a, b=b)
    cdf(d, p₀ + c) - cdf(d, p₀ - c)
end

Out[5]:

phc_near_p₀ (generic function with 1 method)

In [6]:

Fig_5_2 = plot(size=(600, 200), legend=false)
plot!(xlim=(0.499, 0.551), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.01:1, ytick=0:0.2:1)
plot!(ccdf_posterior, 0.50, 0.55; c=pal[1])
plot!([0.51, 0.51, 0.50], [0, ccdf_posterior(0.51), ccdf_posterior(0.51)]; 
    ls=:dash, c=:black, lw=0.7)
plot!(title="図5.2: phc曲線 (n=$n, k=$k)")

Out[6]:

In [7]:

Fig_5_3 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.002, 0.102), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.01:1, ytick=0:0.2:1)
plot!(legendfontsize=12)
plot!(phc_near_p₀, 0, 0.10; c=pal[1])
plot!([0.05, 0.05, 0], [0, phc_near_p₀(0.05), phc_near_p₀(0.05)];
    ls=:dash, c=:black, lw=0.7)
plot!(title="図5.3: \$\\mathrm{phc}(|p - 0.5| < c)\$ の曲線 (n=$n, k=$k)")

Out[7]:

In [8]:

P1 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.01, 0.61), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.1:1, ytick=0:0.2:1)
plot!(c -> cdf_posterior(c; n=24, k=7), 0, 0.6; c=pal[1])
plot!(title="図5.4上段: phc(p < c) for n=24, k=7")

P2 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.01, 0.61), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.1:1, ytick=0:0.2:1)
plot!(c -> cdf_posterior(c; n=2400, k=700), 0, 0.6; c=pal[1])
plot!(title="図5.4下段: phc(p < c) for n=2400, k=700")

Fig_5_4 = plot(P1, P2; size=(600, 300), layout=(2,1))

Out[8]:

対応するP値によって同様の図を再現¶

In [9]:

# ⪅(x, y) = x < y || x ≈ y
# ⪆(x, y) = x ⪅ y

# function pval_twosided(p; n=n, k=k)
#     bin = Binomial(n, p)
#     pval₀ = pdf(bin, k)
#     pval = 0.0
#     for j in support(bin)
#         pval₁ = pdf(bin, j)
#         if pval₁ ⪅ pval₀
#             pval += pval₁
#         end
#     end
#     min(1, pval)
# end

# function pval_doubled_onesided(p; n=n, k=k)
#     bin = Binomial(n, p)
#     min(1, 2cdf(bin, k), 2ccdf(bin, k-1))
# end

# function pval_onesided(p; n=n, k=k)
#     bin = Binomial(n, p)
#     min(cdf(bin, k), ccdf(bin, k-1))
# end

cdf_bin(p; n=n, k=k) = cdf(Binomial(n, p), k)
ccdf_bin(p; n=n, k=k) = ccdf(Binomial(n, p), k-1)

function pval_near_p₀(c; p₀=0.5, n=n, k=k)
    l = Binomial(n, p₀ - c)
    h = Binomial(n, p₀ + c)
    min(cdf(l, k), ccdf(h, k-1))
end

# pval_twosided(0.5), pval_doubled_onesided(0.5), pval_onesided(0.5)

Out[9]:

pval_near_p₀ (generic function with 1 method)

In [10]:

Fig_5_2_pval = plot(size=(600, 200), legend=false)
plot!(xlim=(0.499, 0.551), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="p-value")
plot!(xtick=0:0.01:1, ytick=0:0.2:1)
plot!(c -> cdf_bin(c), 0.50, 0.55; c=pal[2], ls=:dash)
plot!([0.51, 0.51, 0.50], [0, cdf_bin(0.51), cdf_bin(0.51)];
    ls=:dash, c=:black, lw=0.7)
plot!(title="図5.2のP値版 (n=$n, k=$k)")

plot(Fig_5_2, Fig_5_2_pval; size=(600, 400), layout=(2,1))

Out[10]:

phc曲線:

$$ c \mapsto P(\, c < p \,|\, p \sim \operatorname{posterior}(n, k)\,). $$

そのP値版:

$$ c \mapsto P(\, X \le k \,|\, X\sim\operatorname{Binomial}(n, c) \,) $$

In [11]:

Fig_5_2_all = plot(size=(600, 200), legend=false)
plot!(xlim=(0.499, 0.551), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="p-value")
plot!(xtick=0:0.01:1, ytick=0:0.2:1)
plot!(ccdf_posterior, 0.50, 0.55; c=pal[1])
plot!(c -> cdf_bin(c), 0.50, 0.55; c=pal[2], ls=:dash)
plot!(title="図5.2とそのP値版の重ね合わせ (n=$n, k=$k)")

Out[11]:

In [12]:

Fig_5_3_pval = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.002, 0.102), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.01:1, ytick=0:0.2:1)
plot!(legendfontsize=12)
plot!(pval_near_p₀, 0, 0.10; c=pal[2], ls=:dash)
plot!([0.05, 0.05, 0], [0, pval_near_p₀(0.05), pval_near_p₀(0.05)];
    ls=:dash, c=:black, lw=0.7)
plot!(title="図5.3のP値版 (n=$n, k=$k)")

plot(Fig_5_3, Fig_5_3_pval; size=(600, 400), layout=(2,1))

Out[12]:

phc曲線:

$$ c\mapsto P(\, |p - 0.5| < c \,|\, p \sim \operatorname{posterior}(n, k)\,). $$

そのP値版:

$$ c\mapsto \min\left\{ P(\, X \le k \,|\, X\sim\operatorname{Binomial}(n, 0.5-c) \,),\; P(\, k \le X \,|\, X\sim\operatorname{Binomial}(n, 0.5+c) \,) \right\} $$

In [13]:

Fig_5_3_all = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.002, 0.102), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.01:1, ytick=0:0.2:1)
plot!(legendfontsize=12)
plot!(phc_near_p₀, 0, 0.10; c=pal[1])
plot!(pval_near_p₀, 0, 0.10; c=pal[2], ls=:dash)
plot!(title="図5.3とそのP値版の重ね合わせ (n=$n, k=$k)")

Out[13]:

In [14]:

Q1 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.01, 0.61), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.1:1, ytick=0:0.2:1)
plot!(c -> ccdf_bin(c; n=24, k=7), 0, 0.6; c=pal[2], ls=:dash)
plot!(title="図5.4上段のP値版 for n=24, k=7")

Q2 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.01, 0.61), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.1:1, ytick=0:0.2:1)
plot!(c -> ccdf_bin(c; n=2400, k=700), 0, 0.6; c=pal[2], ls=:dash)
plot!(title="図5.4下段のP値版 for n=2400, k=700")

Fig_5_4_pval = plot(Q1, Q2; size=(600, 300), layout=(2,1))

plot(Fig_5_4, Fig_5_4_pval; size=(600, 600), layout=(2,1))

Out[14]:

In [15]:

R1 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.01, 0.61), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.1:1, ytick=0:0.2:1)
plot!(c -> cdf_posterior(c; n=24, k=7), 0, 0.6; c=pal[1])
plot!(c -> ccdf_bin(c; n=24, k=7), 0, 0.6; c=pal[2], ls=:dash)
plot!(title="図5.4上段とそのP値版の重ね合わせ for n=24, k=7")

R2 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.01, 0.61), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.1:1, ytick=0:0.2:1)
plot!(c -> cdf_posterior(c; n=2400, k=700), 0, 0.6; c=pal[1])
plot!(c -> ccdf_bin(c; n=2400, k=700), 0, 0.6; c=pal[2], ls=:dash)
plot!(title="図5.4下段とそのP値版の重ね合わせ for n=2400, k=700")

Fig_5_4_all = plot(R1, R2; size=(600, 300), layout=(2,1))

Out[15]:

$n$ が小さい場合にはphc曲線とそのP値版の違いが大きめになり, $n$ が大きい場合にはほとんどぴったり一致する.

以下は食い違いが大きな場合.

In [16]:

plot(R1; size=(600, 200))

Out[16]:

次のセルは事前分布のパラメーターを $a=b=0$ にした場合.

In [17]:

S1 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.01, 0.61), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.1:1, ytick=0:0.2:1)
plot!(c -> cdf_posterior(c; n=24, k=7, a=0, b=0), 0, 0.6; c=pal[1]) # ← R1との違いはこの行
plot!(c -> ccdf_bin(c; n=24, k=7), 0, 0.6; c=pal[2], ls=:dash)
plot!(title="図5.4上段とそのP値版の重ね合わせ for n=24, k=7, a=b=0")

Out[17]:

次のセルは対応するP値版で $n=25$, $k=8$ にした場合. ほぼぴったり一致している.

In [18]:

T1 = plot(size=(600, 200), legend=false)
plot!(xlim=(-0.01, 0.61), ylim=(-0.05, 1.05))
plot!(xlabel="c", ylabel="phc")
plot!(xtick=0:0.1:1, ytick=0:0.2:1)
plot!(c -> cdf_posterior(c; n=24, k=7), 0, 0.6; c=pal[1])
plot!(c -> ccdf_bin(c; n=25, k=8), 0, 0.6; c=pal[2], ls=:dash) # ← R1との違いはこの行
plot!(title="図5.4上段(n=24, k=7)とそのP値版(n=25, k=8)の重ね合わせ")

Out[18]:

In [ ]:

豊田『瀕死本』の5.4節について¶

目次

P値に関する解説¶

第5.4節の図の再現¶

対応するP値によって同様の図を再現¶