import scipy as sp
from scipy import special
from scipy import stats
import matplotlib.pyplot as plt
期間$\mu$ ごとに1回起こるランダムな事象が$n$回起こるまでの時間の分布を表す.
例:「10年に1度の割合でランダムに起こるイベントが3回起こるまでに何年かかるか」という問題は,期待値は30年,
パラメータは,
のガンマ分布に対応する.
help(stats.gamma)
Help on gamma_gen in module scipy.stats._continuous_distns object: class gamma_gen(scipy.stats._distn_infrastructure.rv_continuous) | A gamma continuous random variable. | | %(before_notes)s | | See Also | -------- | erlang, expon | | Notes | ----- | The probability density function for `gamma` is:: | | gamma.pdf(x, a) = x**(a-1) * exp(-x) / gamma(a) | | for ``x >= 0``, ``a > 0``. Here ``gamma(a)`` refers to the gamma function. | | `gamma` has a shape parameter `a` which needs to be set explicitly. | | When ``a`` is an integer, `gamma` reduces to the Erlang | distribution, and when ``a=1`` to the exponential distribution. | | %(after_notes)s | | %(example)s | | Method resolution order: | gamma_gen | scipy.stats._distn_infrastructure.rv_continuous | scipy.stats._distn_infrastructure.rv_generic | builtins.object | | Methods defined here: | | fit(self, data, *args, **kwds) | Return MLEs for shape (if applicable), location, and scale | parameters from data. | | MLE stands for Maximum Likelihood Estimate. Starting estimates for | the fit are given by input arguments; for any arguments not provided | with starting estimates, ``self._fitstart(data)`` is called to generate | such. | | One can hold some parameters fixed to specific values by passing in | keyword arguments ``f0``, ``f1``, ..., ``fn`` (for shape parameters) | and ``floc`` and ``fscale`` (for location and scale parameters, | respectively). | | Parameters | ---------- | data : array_like | Data to use in calculating the MLEs. | args : floats, optional | Starting value(s) for any shape-characterizing arguments (those not | provided will be determined by a call to ``_fitstart(data)``). | No default value. | kwds : floats, optional | Starting values for the location and scale parameters; no default. | Special keyword arguments are recognized as holding certain | parameters fixed: | | - f0...fn : hold respective shape parameters fixed. | Alternatively, shape parameters to fix can be specified by name. | For example, if ``self.shapes == "a, b"``, ``fa``and ``fix_a`` | are equivalent to ``f0``, and ``fb`` and ``fix_b`` are | equivalent to ``f1``. | | - floc : hold location parameter fixed to specified value. | | - fscale : hold scale parameter fixed to specified value. | | - optimizer : The optimizer to use. The optimizer must take ``func``, | and starting position as the first two arguments, | plus ``args`` (for extra arguments to pass to the | function to be optimized) and ``disp=0`` to suppress | output as keyword arguments. | | Returns | ------- | mle_tuple : tuple of floats | MLEs for any shape parameters (if applicable), followed by those | for location and scale. For most random variables, shape statistics | will be returned, but there are exceptions (e.g. ``norm``). | | Notes | ----- | This fit is computed by maximizing a log-likelihood function, with | penalty applied for samples outside of range of the distribution. The | returned answer is not guaranteed to be the globally optimal MLE, it | may only be locally optimal, or the optimization may fail altogether. | | | Examples | -------- | | Generate some data to fit: draw random variates from the `beta` | distribution | | >>> from scipy.stats import beta | >>> a, b = 1., 2. | >>> x = beta.rvs(a, b, size=1000) | | Now we can fit all four parameters (``a``, ``b``, ``loc`` and ``scale``): | | >>> a1, b1, loc1, scale1 = beta.fit(x) | | We can also use some prior knowledge about the dataset: let's keep | ``loc`` and ``scale`` fixed: | | >>> a1, b1, loc1, scale1 = beta.fit(x, floc=0, fscale=1) | >>> loc1, scale1 | (0, 1) | | We can also keep shape parameters fixed by using ``f``-keywords. To | keep the zero-th shape parameter ``a`` equal 1, use ``f0=1`` or, | equivalently, ``fa=1``: | | >>> a1, b1, loc1, scale1 = beta.fit(x, fa=1, floc=0, fscale=1) | >>> a1 | 1 | | Not all distributions return estimates for the shape parameters. | ``norm`` for example just returns estimates for location and scale: | | >>> from scipy.stats import norm | >>> x = norm.rvs(a, b, size=1000, random_state=123) | >>> loc1, scale1 = norm.fit(x) | >>> loc1, scale1 | (0.92087172783841631, 2.0015750750324668) | | ---------------------------------------------------------------------- | Methods inherited from scipy.stats._distn_infrastructure.rv_continuous: | | __init__(self, momtype=1, a=None, b=None, xtol=1e-14, badvalue=None, name=None, longname=None, shapes=None, extradoc=None, seed=None) | Initialize self. See help(type(self)) for accurate signature. | | cdf(self, x, *args, **kwds) | Cumulative distribution function of the given RV. | | Parameters | ---------- | x : array_like | quantiles | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | cdf : ndarray | Cumulative distribution function evaluated at `x` | | expect(self, func=None, args=(), loc=0, scale=1, lb=None, ub=None, conditional=False, **kwds) | Calculate expected value of a function with respect to the | distribution. | | The expected value of a function ``f(x)`` with respect to a | distribution ``dist`` is defined as:: | | ubound | E[x] = Integral(f(x) * dist.pdf(x)) | lbound | | Parameters | ---------- | func : callable, optional | Function for which integral is calculated. Takes only one argument. | The default is the identity mapping f(x) = x. | args : tuple, optional | Shape parameters of the distribution. | loc : float, optional | Location parameter (default=0). | scale : float, optional | Scale parameter (default=1). | lb, ub : scalar, optional | Lower and upper bound for integration. Default is set to the | support of the distribution. | conditional : bool, optional | If True, the integral is corrected by the conditional probability | of the integration interval. The return value is the expectation | of the function, conditional on being in the given interval. | Default is False. | | Additional keyword arguments are passed to the integration routine. | | Returns | ------- | expect : float | The calculated expected value. | | Notes | ----- | The integration behavior of this function is inherited from | `integrate.quad`. | | fit_loc_scale(self, data, *args) | Estimate loc and scale parameters from data using 1st and 2nd moments. | | Parameters | ---------- | data : array_like | Data to fit. | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information). | | Returns | ------- | Lhat : float | Estimated location parameter for the data. | Shat : float | Estimated scale parameter for the data. | | isf(self, q, *args, **kwds) | Inverse survival function (inverse of `sf`) at q of the given RV. | | Parameters | ---------- | q : array_like | upper tail probability | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | x : ndarray or scalar | Quantile corresponding to the upper tail probability q. | | logcdf(self, x, *args, **kwds) | Log of the cumulative distribution function at x of the given RV. | | Parameters | ---------- | x : array_like | quantiles | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | logcdf : array_like | Log of the cumulative distribution function evaluated at x | | logpdf(self, x, *args, **kwds) | Log of the probability density function at x of the given RV. | | This uses a more numerically accurate calculation if available. | | Parameters | ---------- | x : array_like | quantiles | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | logpdf : array_like | Log of the probability density function evaluated at x | | logsf(self, x, *args, **kwds) | Log of the survival function of the given RV. | | Returns the log of the "survival function," defined as (1 - `cdf`), | evaluated at `x`. | | Parameters | ---------- | x : array_like | quantiles | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | logsf : ndarray | Log of the survival function evaluated at `x`. | | nnlf(self, theta, x) | Return negative loglikelihood function. | | Notes | ----- | This is ``-sum(log pdf(x, theta), axis=0)`` where `theta` are the | parameters (including loc and scale). | | pdf(self, x, *args, **kwds) | Probability density function at x of the given RV. | | Parameters | ---------- | x : array_like | quantiles | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | pdf : ndarray | Probability density function evaluated at x | | ppf(self, q, *args, **kwds) | Percent point function (inverse of `cdf`) at q of the given RV. | | Parameters | ---------- | q : array_like | lower tail probability | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | x : array_like | quantile corresponding to the lower tail probability q. | | sf(self, x, *args, **kwds) | Survival function (1 - `cdf`) at x of the given RV. | | Parameters | ---------- | x : array_like | quantiles | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | sf : array_like | Survival function evaluated at x | | ---------------------------------------------------------------------- | Methods inherited from scipy.stats._distn_infrastructure.rv_generic: | | __call__(self, *args, **kwds) | Freeze the distribution for the given arguments. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution. Should include all | the non-optional arguments, may include ``loc`` and ``scale``. | | Returns | ------- | rv_frozen : rv_frozen instance | The frozen distribution. | | __getstate__(self) | | __setstate__(self, state) | | entropy(self, *args, **kwds) | Differential entropy of the RV. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information). | loc : array_like, optional | Location parameter (default=0). | scale : array_like, optional (continuous distributions only). | Scale parameter (default=1). | | Notes | ----- | Entropy is defined base `e`: | | >>> drv = rv_discrete(values=((0, 1), (0.5, 0.5))) | >>> np.allclose(drv.entropy(), np.log(2.0)) | True | | freeze(self, *args, **kwds) | Freeze the distribution for the given arguments. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution. Should include all | the non-optional arguments, may include ``loc`` and ``scale``. | | Returns | ------- | rv_frozen : rv_frozen instance | The frozen distribution. | | interval(self, alpha, *args, **kwds) | Confidence interval with equal areas around the median. | | Parameters | ---------- | alpha : array_like of float | Probability that an rv will be drawn from the returned range. | Each value should be in the range [0, 1]. | arg1, arg2, ... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information). | loc : array_like, optional | location parameter, Default is 0. | scale : array_like, optional | scale parameter, Default is 1. | | Returns | ------- | a, b : ndarray of float | end-points of range that contain ``100 * alpha %`` of the rv's | possible values. | | mean(self, *args, **kwds) | Mean of the distribution. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | mean : float | the mean of the distribution | | median(self, *args, **kwds) | Median of the distribution. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | Location parameter, Default is 0. | scale : array_like, optional | Scale parameter, Default is 1. | | Returns | ------- | median : float | The median of the distribution. | | See Also | -------- | stats.distributions.rv_discrete.ppf | Inverse of the CDF | | moment(self, n, *args, **kwds) | n-th order non-central moment of distribution. | | Parameters | ---------- | n : int, n >= 1 | Order of moment. | arg1, arg2, arg3,... : float | The shape parameter(s) for the distribution (see docstring of the | instance object for more information). | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | rvs(self, *args, **kwds) | Random variates of given type. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information). | loc : array_like, optional | Location parameter (default=0). | scale : array_like, optional | Scale parameter (default=1). | size : int or tuple of ints, optional | Defining number of random variates (default is 1). | random_state : None or int or ``np.random.RandomState`` instance, optional | If int or RandomState, use it for drawing the random variates. | If None, rely on ``self.random_state``. | Default is None. | | Returns | ------- | rvs : ndarray or scalar | Random variates of given `size`. | | stats(self, *args, **kwds) | Some statistics of the given RV. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional (continuous RVs only) | scale parameter (default=1) | moments : str, optional | composed of letters ['mvsk'] defining which moments to compute: | 'm' = mean, | 'v' = variance, | 's' = (Fisher's) skew, | 'k' = (Fisher's) kurtosis. | (default is 'mv') | | Returns | ------- | stats : sequence | of requested moments. | | std(self, *args, **kwds) | Standard deviation of the distribution. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | std : float | standard deviation of the distribution | | var(self, *args, **kwds) | Variance of the distribution. | | Parameters | ---------- | arg1, arg2, arg3,... : array_like | The shape parameter(s) for the distribution (see docstring of the | instance object for more information) | loc : array_like, optional | location parameter (default=0) | scale : array_like, optional | scale parameter (default=1) | | Returns | ------- | var : float | the variance of the distribution | | ---------------------------------------------------------------------- | Data descriptors inherited from scipy.stats._distn_infrastructure.rv_generic: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | random_state | Get or set the RandomState object for generating random variates. | | This can be either None or an existing RandomState object. | | If None (or np.random), use the RandomState singleton used by np.random. | If already a RandomState instance, use it. | If an int, use a new RandomState instance seeded with seed.
special.gamma(1), special.gamma(2)
(1.0, 1.0)
special.gamma(.5), sp.sqrt(sp.pi)
(1.7724538509055159, 1.7724538509055159)
print(special.gamma(3. / 2)) # gamma(1+1/2)=1/2gamma(1/2)
0.5 * sp.sqrt(sp.pi)
0.886226925453
0.88622692545275794
def gamma_pdf(mu, n, x):
x = sp.array(x)
return 1. / (special.gamma(n) * mu**n) * x**(n - 1) * sp.exp(-x / mu)
gamma_pdf(mu=1, n=1, x=[0,.5,1.])
array([ 1. , 0.60653066, 0.36787944])
stats.gamma(a=1, scale=1).pdf([0, .5, 1])
array([ 1. , 0.60653066, 0.36787944])
gamma_pdf(mu=10, n=3, x=[0,.5,1.]), stats.gamma(a=3, scale=10).pdf([0, .5, 1])
(array([ 0. , 0.0001189 , 0.00045242]), array([ 0. , 0.0001189 , 0.00045242]))
%matplotlib inline
x = sp.linspace(0, 10, 200)
mu_s = sp.arange(0.5, 3+.5, .5)
n = 2
for mu in mu_s:
plt.plot(x, stats.gamma(a=n, scale=mu).pdf(x),
label="$\mu={}, n={}$".format(mu, n))
plt.legend()
plt.grid()
plt.show()
n_s = sp.arange(0.5, 3+.5, .5)
mu = 2
for n in n_s:
plt.plot(x, stats.gamma(a=n, scale=mu).pdf(x),
label="$\mu={}, n={}$".format(mu, n))
plt.legend()
plt.grid()
plt.show()
help(stats.gamma.cdf)
Help on method cdf in module scipy.stats._distn_infrastructure: cdf(x, *args, **kwds) method of scipy.stats._continuous_distns.gamma_gen instance Cumulative distribution function of the given RV. Parameters ---------- x : array_like quantiles arg1, arg2, arg3,... : array_like The shape parameter(s) for the distribution (see docstring of the instance object for more information) loc : array_like, optional location parameter (default=0) scale : array_like, optional scale parameter (default=1) Returns ------- cdf : ndarray Cumulative distribution function evaluated at `x`
%matplotlib inline
x = sp.linspace(0, 15, 200)
mu = 2
n = 2
plt.plot(x, stats.gamma(a=n, scale=mu).cdf(x),
label="$\mu={}, n={}$".format(mu, n))
plt.legend()
plt.grid()
plt.show()
参考:確率論 (Probability Theory) 第12週 ポアソン分布と指数分布とガンマ分布, http://stat.inf.uec.ac.jp
$$ E[X] = \int_0^{\infty} xf(x)dx \\ = \int_0^{\infty}x \frac{1}{\Gamma(n)\mu^n} x^{n-1} e^{-\frac{x}{\mu}}dx \\ = \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty}x^{n} e^{-\frac{x}{\mu}}dx $$ここで,$\frac{x}{\mu} = u, \frac{du}{dx} = \frac{1}{\mu}$ に変数変換
\begin{eqnarray*} &=& \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty} (\mu u)^{n} e^{-u} \mu du\\ &=& \frac{\mu^{n+1}}{\Gamma(n)\mu^n} \int_0^{\infty} u^{n} e^{-u} du\\ &=& \frac{\mu^{n+1}}{\Gamma(n)\mu^n} \Gamma(n+1)\\ &=& \frac{\mu^{n+1}}{\mu^n} \frac{\Gamma(n+1)}{\Gamma(n)}\\ &=& \mu \frac{\Gamma(n+1)}{\Gamma(n)}\\ &=& \mu n\\ \end{eqnarray*}ここで,$\frac{x}{\mu} = u, \frac{du}{dx} = \frac{1}{\mu}$ に変数変換
\begin{eqnarray*} &=& \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty} (\mu u)^{n+1} e^{-u} \mu du\\ &=& \frac{\mu^{n+2}}{\Gamma(n)\mu^n} \int_0^{\infty} u^{n+1} e^{-u} du\\ &=& \frac{\mu^{n+2}}{\Gamma(n)\mu^n} \Gamma(n+2)\\ &=& \frac{\mu^{n+2}}{\mu^n} \frac{\Gamma(n+2)}{\Gamma(n)}\\ &=& \mu^2 (n+1)n\\ \end{eqnarray*}よって,
$$ V[X] = E[X^2] - (E[X])^2\\ = \mu^2 (n+1)n - (\mu n)^2\\ = \mu^2 n $$$u = (\frac{1}{\mu}-t)x \to x = \frac{1}{(\frac{1}{\mu}-t)}u$と変数変換する. $du = (\frac{1}{\mu}-t)dx$
$$ = \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty} (\frac{u}{\frac{1}{\mu}-t})^{n-1} e^{-u} \frac{1}{\frac{1}{\mu}-t}du\\ = \frac{1}{\Gamma(n)\mu^n} (\frac{1}{\frac{1}{\mu}-t})^{n} \int_0^{\infty} u^{n-1} e^{-u}du\\ = \frac{1}{\Gamma(n)\mu^n} (\frac{1}{\frac{1}{\mu}-t})^{n} \Gamma(n)\\ = \frac{1}{\mu^n} (\frac{1}{\frac{1}{\mu}-t})^{n} \\ = \frac{1}{(1-\mu t)^{n}} \\ $$指数分布のモーメント母関数
$$ f(x) = \frac{1}{\mu} e^{-\frac{x}{\mu}}\\ M_x(t) = E[e^{tx}] = \int_0^{\infty} e^{tx} \frac{1}{\mu} e^{-\frac{x}{\mu}} dx\\ = \frac{1}{\mu} \int_0^{\infty} e^{(t-\frac{1}{\mu})x} dx\\ $$$X_1+X_2 + \cdots + X_n = Y_n, (X_i \sim Exp(\mu))$ のモーメント母関数は,
$$ M_{Y_n}(t) = E[e^{tY_n}] = E[e^{t \sum{X_i}}] = E[\Pi_i e^{t X_i}] = \Pi_i E[e^{t X_i}] = \frac{1}{(1-\mu t)^n} $$n = 2
mu = 3
x = stats.expon(scale=mu).rvs(size=(10000, n))
Y = x.sum(axis=1)
plt.hist(Y, 100, normed=True,
label="$X_1+X_2, X_i \sim Exp({})$".format(mu))
x = sp.linspace(0, 40, 100)
plt.plot(x, stats.gamma(a=n, scale=mu).pdf(x),
label="$Ga({}, {})$".format(mu, n))
plt.legend()
plt.grid()
plt.show()
分類 | 1回起こる(形状パラメータ$k=1$) | n回起こる(形状パラメータ$k=n>1$) |
---|---|---|
離散 | 幾何分布 | 負の二項分布 |
連続 | 指数分布 | ガンマ分布 |
確率密度関数を求めてみると,
$$ f(x;\mu=2, n=\frac{n}{2}) = \frac{x^{\frac{n}{2}-1}e^{-\frac{x}{2}}}{\Gamma(\frac{n}{2}) 2^{\frac{n}{2}}} $$これは$\chi^2$分布の確率密度関数と一致する.
# 自由度n=3のchi^2分布
n = 3
x = sp.linspace(0, 20, 100)
chi2 = stats.chi2(df=n)
plt.plot(x, chi2.pdf(x), label="$\chi^2(n={})$".format(n))
# Ga(mu=2, n=3/2)
mu = 2
rvs = stats.gamma(a=n / 2., scale=mu).rvs(size=10000)
plt.hist(rvs, 100, normed=True, label="$Ga(\mu={}, n={})$".format(mu, n/2.))
plt.legend()
plt.grid()
plt.show()
ガンマ分布は,正規分布の分散パラメータ$\sigma^2$の逆数である精度の共役事前分布になる.
よって,
$$ \text{事後分布:ガンマ分布}=\text{尤度関数:正規分布}\cdot \text{事前分布:ガンマ分布} $$, stat.inf.uec.ac.jp](http://stat.inf.uec.ac.jp/lib/exe/fetch.php?media=prob:prob-c-note-and-quizzes-20130711.pdf)