サンプルを多項分布で生成する場合
黒木玄
2017-09-19
2×2の分割表に関するカイ二乗検定とFisherの正確検定の比較に関する
https://oku.edu.mie-u.ac.jp/~okumura/stat/fisher-chisq.html
の結果(のsubset)をJulia言語で再現してみた。
ついでにG検定の場合の結果も付け加えておいた。
さらに複数の確率分布のケースでテストしてみた。
Distributionsパッケージについては次のリンク先を参照せよ。
https://juliastats.github.io/Distributions.jl/latest/
HypothesisTestsパッケージも存在するが、あえて利用しなかった。
統計関係のパッケージの情報については次のリンク先を参照せよ。
using PyPlot
using Distributions
versioninfo()
Julia Version 0.6.0 Commit 903644385b* (2017-06-19 13:05 UTC) Platform Info: OS: Windows (x86_64-w64-mingw32) CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz WORD_SIZE: 64 BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell) LAPACK: libopenblas64_ LIBM: libopenlibm LLVM: libLLVM-3.9.1 (ORCJIT, haswell)
# xlog(x,y) = if x == 0 then 0 else x*log(x/y)
#
# Warning:
#
# Don't use x*log(x/y) instead of xlog(x,y).
# Because log(0) = -Inf and 0*Inf = NaN.
#
function xlog(x::Float64, y::Float64)
if x == zero(x)
return zero(x)
else
return x*log(x/y)
end
end
xlog(x,y) = xlog(Float64(x),Float64(y))
x, y = 0, 1
@show x, y
@show x*log(x/y)
@show xlog(x,y);
(x, y) = (0, 1) x * log(x / y) = NaN xlog(x, y) = 0.0
prodprob(px,py) = [px*py, px*(1-py), (1-px)*py, (1-px)*(1-py)]
function expected(a::AbstractArray{T,1}) where T
local n = sum(a)
return [(a[1]+a[2])*(a[1]+a[3])/n, (a[1]+a[2])*(a[2]+a[4])/n,
(a[3]+a[4])*(a[1]+a[3])/n, (a[3]+a[4])*(a[2]+a[4])/n]
end
function chisqtest(a::AbstractArray{Int64,1})
local mu = expected(a)
local chisq = sum((a .- mu).^2 ./mu)
local pval = ccdf(Chisq(1), chisq)
return pval
end
function gtest(a::AbstractArray{Int64,1})
local mu = expected(a)
local g = 2*sum(xlog.(Float64.(a),mu)) # Don't use a.*log(a./mu)
local pval = ccdf(Chisq(1), g)
return pval
end
function pvaluehg(d::Hypergeometric, k::Int64)
local c = params(d)
local amax = min(c[1],c[3])
local p0 = pdf(d, k)
local p1 = 0.0
local pval = 0.0
for j in 0:amax
p1 = pdf(d, j)
pval += ifelse(p1 ≤ p0, p1, 0.0)
end
return min(pval, 1.0)
end
function fishertest(a::AbstractArray{Int64,1})
local d = Hypergeometric(a[1]+a[2], a[3]+a[4], a[1]+a[3])
return pvaluehg(d, a[1])
end
ecdf(pval::AbstractArray{Float64,1}, x::Float64) = count(p -> p ≤ x, pval)/length(pval)
ecdf (generic function with 1 method)
## test
@show x = chisqtest([5, 13, 12, 20])
y = 0.4860562818425576
@show y
abs(x-y)
x = chisqtest([5, 13, 12, 20]) = 0.4860562818425576 y = 0.4860562818425576
0.0
## test
@show x = gtest([5, 13, 12, 20])
y = 0.48251234020228
@show y
abs(x-y)
x = gtest([5, 13, 12, 20]) = 0.4825123402022797 y = 0.48251234020228
3.3306690738754696e-16
## test
@show x = fishertest([5, 13, 12, 20])
y = 0.5482810404890348
@show y
abs(x-y)
x = fishertest([5, 13, 12, 20]) = 0.548281040489035 y = 0.5482810404890348
2.220446049250313e-16
function plotECDs(x, ecdf_chisq, ecdf_g, ecdf_fisher)
plot(x, ecdf_chisq.(x), label="\$\\chi^2\$ test", ls="-")
plot(x, ecdf_g.(x), label="G-test", ls="--")
plot(x, ecdf_fisher.(x), label="Fisher test", ls=":")
plot(x, x, label="y = x", color="black", lw=0.5)
legend(fontsize=8)
xlabel("x")
ylabel("y = probability(p-value \$\\leqq\$ x)")
title("empirical cumulative distributions", fontsize=10)
end
function plotErrors(x, ecdf_chisq, ecdf_g, ecdf_fisher)
plot(x, ecdf_chisq.(x) .-x, label="error of \$\\chi^2\$ test", ls="-")
plot(x, ecdf_g.(x) .-x, label="error of G-test", ls="--")
plot(x, ecdf_fisher.(x).-x, label="error of Fisher test", ls=":")
plot(x, zeros(x), color="black", lw=0.5)
legend(fontsize=8)
xlabel("x")
ylabel("y = probability(p-value \$\\leqq\$ x) \$-\$ x")
title("error = probability(p-value \$\\leqq\$ x) \$-\$ x", fontsize=10)
end
function plotcomparison(n, p, x, ecdf_chisq, ecdf_g, ecdf_fisher; ploterrors=true)
if ploterrors
figure(figsize=(10,4))
ax1 = subplot(121)
plotECDs(x, ecdf_chisq, ecdf_g, ecdf_fisher)
ax1[:set_aspect]("equal")
ax2 = subplot(122)
plotErrors(x, ecdf_chisq, ecdf_g, ecdf_fisher)
suptitle("n = $n, n*p = $(reshape(n*p,2,2)')")
else
fig = figure(figsize=(6,4))
plotECDs(x, ecdf_chisq, ecdf_g, ecdf_fisher)
axes()[:set_aspect]("equal")
end
end
function comparisonpval(n::Int64, px::Float64, py::Float64; N = 10^5, xmaxlist = [1.0, 0.1, 0.02])
return comparisonpval(n, prodprob(px,py), N = N, xmaxlist = xmaxlist)
end
function comparisonpval(n::Int64, p::AbstractArray{Float64};
N = 10^5, xmaxlist = [1.0, 0.1, 0.02], ploterrors = true)
local a = rand(Multinomial(n,p), N)
local pvalchisq = [chisqtest(a[:,i]) for i in 1:size(a,2)]
local pvalg = [gtest(a[:,i]) for i in 1:size(a,2)]
local pvalfisher = [fishertest(a[:,i]) for i in 1:size(a,2)]
local ecdf_chisq(x) = ecdf(pvalchisq, x)
local ecdf_g(x) = ecdf(pvalg, x)
local ecdf_fisher(x) = ecdf(pvalfisher, x)
for xmax in xmaxlist
x = linspace(0.0, xmax, 101)
plotcomparison(n, p, x, ecdf_chisq, ecdf_g, ecdf_fisher, ploterrors=ploterrors)
end
end
comparisonpval (generic function with 2 methods)
https://oku.edu.mie-u.ac.jp/~okumura/stat/fisher-chisq.html
以下ではG検定の場合も付け加えてあり、プロットの範囲が1.0まで全部、0.1まで、0.02までの3つの場合をプロットしてある。
n = 50
px, py = 0.4, 0.3
@time comparisonpval(n, px, py)
4.084475 seconds (3.42 M allocations: 169.830 MiB, 0.75% gc time)
$n$ を固定し、2×2の確率の表を
$$\displaystyle P = \begin{bmatrix} p_x p_y & p_x(1-p_y) \\ (1-p_x)p_y & (1-p_x)(1-p_y) \\ \end{bmatrix} $$て定義し、$p_x,p_y$ を $0.1$ 刻みで $0.1\leqq p_x\leqq p_y\leqq 0.5$ の範囲を動かす。
上の形の $P$ のスカラー倍で表示できる行列は独立型であると言うことにする。
確率の表 $P$ と $n$ が定める多項分布に従って、整数行列
$$\displaystyle A = \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} $$を生成し、この行列を2×2の分割表とみなして、カイ二乗検定、G検定、Fisherの正確検定の方法に従って、 p 値を求める。
そのようにして求めた p 値は行列 $A$ が独立型から離れれば離れるほど小さくなる。
このような計算を $N=10^5$ 回繰り返して、 p 値が $x$ 以下になる確率を求め、それをプロットする。
理想的には p 値が $x$ 以下になる確率は $x$ に一致することが望ましいが、実際には、
(p 値が $x$ 以下になる確率) $-$ $x$
で定義される誤差が生じる。その誤差もプロットする。
有意水準5%での精度に興味のある人は多いだろう。
そこでまず x = 0.0~0.1 の範囲をプロットしよう。
45度線に近いほど精度が高いと考えられる。
n = 25, 50, 100 の場合を扱う。
n = 25 で px = 0.1 の極端な場合を除けば、カイ二乗検定(太実線)の精度が最も高いように見える。
n = 25
for px in 0.1:0.1:0.5
for py in px:0.1:0.5
@time comparisonpval(n, px, py, xmaxlist=[0.1])
end
end
sleep(0.2)
0.550569 seconds (2.20 M allocations: 106.122 MiB, 2.61% gc time) 0.600330 seconds (2.20 M allocations: 106.110 MiB, 1.23% gc time) 0.646636 seconds (2.20 M allocations: 106.110 MiB, 0.85% gc time) 0.719684 seconds (2.20 M allocations: 106.110 MiB, 16.81% gc time) 0.585032 seconds (2.20 M allocations: 106.110 MiB, 1.05% gc time) 0.639246 seconds (2.20 M allocations: 106.110 MiB, 0.90% gc time) 0.716844 seconds (2.20 M allocations: 106.110 MiB, 0.78% gc time) 0.767485 seconds (2.20 M allocations: 106.110 MiB, 0.88% gc time) 0.760211 seconds (2.20 M allocations: 106.110 MiB, 0.61% gc time) 0.910759 seconds (2.20 M allocations: 106.110 MiB, 0.73% gc time) 0.855745 seconds (2.20 M allocations: 106.110 MiB, 0.72% gc time) 0.849117 seconds (2.20 M allocations: 106.110 MiB, 0.95% gc time) 0.931961 seconds (2.20 M allocations: 106.110 MiB, 0.63% gc time) 1.034661 seconds (2.20 M allocations: 106.110 MiB, 9.44% gc time) 0.991448 seconds (2.20 M allocations: 106.110 MiB, 0.45% gc time)
n = 50
for px in 0.1:0.1:0.5
for py in px:0.1:0.5
@time comparisonpval(n, px, py, xmaxlist=[0.1])
end
end
sleep(0.2)
0.877311 seconds (2.20 M allocations: 106.112 MiB, 0.96% gc time) 0.762804 seconds (2.20 M allocations: 106.110 MiB, 0.88% gc time) 0.786212 seconds (2.20 M allocations: 106.110 MiB, 0.80% gc time) 0.756936 seconds (2.20 M allocations: 106.110 MiB, 0.86% gc time) 0.825353 seconds (2.20 M allocations: 106.110 MiB, 0.82% gc time) 1.003583 seconds (2.20 M allocations: 106.110 MiB, 0.67% gc time) 1.061235 seconds (2.20 M allocations: 106.110 MiB, 0.61% gc time) 1.108900 seconds (2.20 M allocations: 106.110 MiB, 0.67% gc time) 1.122526 seconds (2.20 M allocations: 106.110 MiB, 0.58% gc time) 1.372241 seconds (2.20 M allocations: 106.110 MiB, 7.55% gc time) 1.397477 seconds (2.20 M allocations: 106.110 MiB, 0.48% gc time) 1.416365 seconds (2.20 M allocations: 106.110 MiB, 0.41% gc time) 1.621613 seconds (2.20 M allocations: 106.110 MiB, 0.37% gc time) 1.716732 seconds (2.20 M allocations: 106.110 MiB, 0.27% gc time) 1.841540 seconds (2.20 M allocations: 106.110 MiB, 0.32% gc time)
n = 100
for px in 0.1:0.1:0.5
for py in px:0.1:0.5
@time comparisonpval(n, px, py, xmaxlist=[0.1])
end
end
sleep(0.2)
1.085354 seconds (2.20 M allocations: 106.115 MiB, 0.76% gc time) 1.056799 seconds (2.20 M allocations: 106.110 MiB, 0.71% gc time) 1.090018 seconds (2.20 M allocations: 106.110 MiB, 0.59% gc time) 1.076235 seconds (2.20 M allocations: 106.110 MiB, 0.69% gc time) 1.051345 seconds (2.20 M allocations: 106.110 MiB, 0.58% gc time) 1.681336 seconds (2.20 M allocations: 106.110 MiB, 5.94% gc time) 1.779170 seconds (2.20 M allocations: 106.110 MiB, 0.33% gc time) 1.778485 seconds (2.20 M allocations: 106.110 MiB, 0.34% gc time) 1.852405 seconds (2.20 M allocations: 106.110 MiB, 0.34% gc time) 2.293237 seconds (2.20 M allocations: 106.120 MiB, 0.26% gc time) 2.512785 seconds (2.20 M allocations: 106.110 MiB, 0.24% gc time) 2.494513 seconds (2.20 M allocations: 106.110 MiB, 0.21% gc time) 2.975737 seconds (2.20 M allocations: 106.110 MiB, 0.20% gc time) 3.296730 seconds (2.20 M allocations: 106.110 MiB, 0.20% gc time) 3.513900 seconds (2.20 M allocations: 106.110 MiB, 0.18% gc time)
n = 25
for px in 0.1:0.1:0.5
for py in px:0.1:0.5
@time comparisonpval(n, px, py, xmaxlist=[1.0])
end
end
sleep(0.2)
0.689197 seconds (2.20 M allocations: 106.112 MiB, 1.58% gc time) 0.654032 seconds (2.20 M allocations: 106.110 MiB, 16.53% gc time) 0.560230 seconds (2.20 M allocations: 106.110 MiB, 1.03% gc time) 0.538139 seconds (2.20 M allocations: 106.110 MiB, 1.04% gc time) 0.561455 seconds (2.20 M allocations: 106.110 MiB, 0.76% gc time) 0.648619 seconds (2.20 M allocations: 106.110 MiB, 1.15% gc time) 0.729799 seconds (2.20 M allocations: 106.110 MiB, 0.81% gc time) 0.718673 seconds (2.20 M allocations: 106.110 MiB, 0.85% gc time) 0.765250 seconds (2.20 M allocations: 106.110 MiB, 0.78% gc time) 0.798651 seconds (2.20 M allocations: 106.110 MiB, 0.70% gc time) 0.848735 seconds (2.20 M allocations: 106.110 MiB, 0.57% gc time) 0.864701 seconds (2.20 M allocations: 106.110 MiB, 0.76% gc time) 1.067921 seconds (2.20 M allocations: 106.110 MiB, 9.93% gc time) 0.954935 seconds (2.20 M allocations: 106.110 MiB, 0.66% gc time) 1.004631 seconds (2.20 M allocations: 106.110 MiB, 0.58% gc time)
n = 50
for px in 0.1:0.1:0.5
for py in px:0.1:0.5
@time comparisonpval(n, px, py, xmaxlist=[1.0])
end
end
sleep(0.2)
0.724750 seconds (2.20 M allocations: 106.110 MiB, 1.25% gc time) 0.757153 seconds (2.20 M allocations: 106.110 MiB, 0.79% gc time) 0.757605 seconds (2.20 M allocations: 106.110 MiB, 0.74% gc time) 0.814294 seconds (2.20 M allocations: 106.110 MiB, 0.74% gc time) 0.746321 seconds (2.20 M allocations: 106.110 MiB, 0.85% gc time) 0.970986 seconds (2.20 M allocations: 106.110 MiB, 0.74% gc time) 1.188753 seconds (2.20 M allocations: 106.112 MiB, 0.55% gc time) 1.083303 seconds (2.20 M allocations: 106.110 MiB, 0.55% gc time) 1.186058 seconds (2.20 M allocations: 106.110 MiB, 8.41% gc time) 1.301741 seconds (2.20 M allocations: 106.110 MiB, 0.34% gc time) 1.390584 seconds (2.20 M allocations: 106.110 MiB, 0.42% gc time) 1.425003 seconds (2.20 M allocations: 106.110 MiB, 0.44% gc time) 1.632366 seconds (2.20 M allocations: 106.110 MiB, 0.35% gc time) 1.715535 seconds (2.20 M allocations: 106.110 MiB, 0.34% gc time) 1.928616 seconds (2.20 M allocations: 106.110 MiB, 0.31% gc time)
n = 100
for px in 0.1:0.1:0.5
for py in px:0.1:0.5
@time comparisonpval(n, px, py, xmaxlist=[1.0])
end
end
sleep(0.2)
0.936060 seconds (2.20 M allocations: 106.110 MiB, 0.95% gc time) 1.073932 seconds (2.20 M allocations: 106.110 MiB, 0.60% gc time) 1.123809 seconds (2.20 M allocations: 106.110 MiB, 0.46% gc time) 1.117452 seconds (2.20 M allocations: 106.110 MiB, 0.56% gc time) 1.199681 seconds (2.20 M allocations: 106.110 MiB, 9.74% gc time) 1.748188 seconds (2.20 M allocations: 106.117 MiB, 0.33% gc time) 1.754585 seconds (2.20 M allocations: 106.110 MiB, 0.32% gc time) 1.772892 seconds (2.20 M allocations: 106.110 MiB, 0.34% gc time) 1.845353 seconds (2.20 M allocations: 106.110 MiB, 0.26% gc time) 2.454991 seconds (2.20 M allocations: 106.110 MiB, 0.31% gc time) 2.588035 seconds (2.20 M allocations: 106.110 MiB, 0.23% gc time) 2.480909 seconds (2.20 M allocations: 106.110 MiB, 0.28% gc time) 3.010510 seconds (2.20 M allocations: 106.110 MiB, 0.20% gc time) 3.114921 seconds (2.20 M allocations: 106.110 MiB, 0.18% gc time) 3.439553 seconds (2.20 M allocations: 106.110 MiB, 0.14% gc time)
p 値(次の計算中の pval
)が小さいほど独立型からかけ離れていると考えられる。
n = 50
r = [0,0,0,n]
for i in 1:100
r = rand(Multinomial(n,4))
pval = chisqtest(r)
if pval < 0.05
println("pval = ", pval)
println("r = ", r)
break
end
end
p = r/sum(r)
println("p = ", p)
pval = 0.007862025239525573 r = [19, 11, 5, 15] p = [0.38, 0.22, 0.1, 0.3]
当然のことながら、独立型からかけ離れていると p 値は小さくなりやすくなる。
n = 50
r = [9, 15, 17, 9]
@show chisqtest(r)
@show p = r/n
@time comparisonpval(n, p, xmaxlist=[1.0], ploterrors=false)
chisqtest(r) = 0.04863928767942199 p = r / n = [0.18, 0.3, 0.34, 0.18]
1.951615 seconds (2.20 M allocations: 106.129 MiB, 6.13% gc time)
n = 50
r = [16, 11, 6, 17]
@show chisqtest(r)
@show p = r/n
@time comparisonpval(n, p, xmaxlist=[1.0], ploterrors=false)
chisqtest(r) = 0.018515907275267696 p = r / n = [0.32, 0.22, 0.12, 0.34]
1.775243 seconds (2.20 M allocations: 106.071 MiB, 0.35% gc time)
n = 50
r = [5, 16, 18, 11]
@show chisqtest(r)
@show p = r/n
@time comparisonpval(n, p, xmaxlist=[1.0], ploterrors=false)
chisqtest(r) = 0.007382367281393829 p = r / n = [0.1, 0.32, 0.36, 0.22]
1.730845 seconds (2.20 M allocations: 106.071 MiB, 0.40% gc time)