ガンマ分布

In [1]:
import scipy as sp
from scipy import special
from scipy import stats
import matplotlib.pyplot as plt

意味

期間$\mu$ ごとに1回起こるランダムな事象が$n$回起こるまでの時間の分布を表す.

例:「10年に1度の割合でランダムに起こるイベントが3回起こるまでに何年かかるか」という問題は,期待値は30年,

パラメータは,

  • 尺度母数$\mu = 10$
  • 形状母数$n = 3$

のガンマ分布に対応する.

確率密度関数

\begin{eqnarray} f(x) = \frac{1}{\Gamma(n)\mu^n}x^{n−1}e^{−\frac{x}{\mu}}, (x≥0) \end{eqnarray}
  • パラメータ: $\mu, n > 0$
In [8]:
help(stats.gamma)
Help on gamma_gen in module scipy.stats._continuous_distns object:

class gamma_gen(scipy.stats._distn_infrastructure.rv_continuous)
 |  A gamma continuous random variable.
 |  
 |  %(before_notes)s
 |  
 |  See Also
 |  --------
 |  erlang, expon
 |  
 |  Notes
 |  -----
 |  The probability density function for `gamma` is::
 |  
 |      gamma.pdf(x, a) = x**(a-1) * exp(-x) / gamma(a)
 |  
 |  for ``x >= 0``, ``a > 0``. Here ``gamma(a)`` refers to the gamma function.
 |  
 |  `gamma` has a shape parameter `a` which needs to be set explicitly.
 |  
 |  When ``a`` is an integer, `gamma` reduces to the Erlang
 |  distribution, and when ``a=1`` to the exponential distribution.
 |  
 |  %(after_notes)s
 |  
 |  %(example)s
 |  
 |  Method resolution order:
 |      gamma_gen
 |      scipy.stats._distn_infrastructure.rv_continuous
 |      scipy.stats._distn_infrastructure.rv_generic
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  fit(self, data, *args, **kwds)
 |      Return MLEs for shape (if applicable), location, and scale
 |      parameters from data.
 |      
 |      MLE stands for Maximum Likelihood Estimate.  Starting estimates for
 |      the fit are given by input arguments; for any arguments not provided
 |      with starting estimates, ``self._fitstart(data)`` is called to generate
 |      such.
 |      
 |      One can hold some parameters fixed to specific values by passing in
 |      keyword arguments ``f0``, ``f1``, ..., ``fn`` (for shape parameters)
 |      and ``floc`` and ``fscale`` (for location and scale parameters,
 |      respectively).
 |      
 |      Parameters
 |      ----------
 |      data : array_like
 |          Data to use in calculating the MLEs.
 |      args : floats, optional
 |          Starting value(s) for any shape-characterizing arguments (those not
 |          provided will be determined by a call to ``_fitstart(data)``).
 |          No default value.
 |      kwds : floats, optional
 |          Starting values for the location and scale parameters; no default.
 |          Special keyword arguments are recognized as holding certain
 |          parameters fixed:
 |      
 |          - f0...fn : hold respective shape parameters fixed.
 |            Alternatively, shape parameters to fix can be specified by name.
 |            For example, if ``self.shapes == "a, b"``, ``fa``and ``fix_a``
 |            are equivalent to ``f0``, and ``fb`` and ``fix_b`` are
 |            equivalent to ``f1``.
 |      
 |          - floc : hold location parameter fixed to specified value.
 |      
 |          - fscale : hold scale parameter fixed to specified value.
 |      
 |          - optimizer : The optimizer to use.  The optimizer must take ``func``,
 |            and starting position as the first two arguments,
 |            plus ``args`` (for extra arguments to pass to the
 |            function to be optimized) and ``disp=0`` to suppress
 |            output as keyword arguments.
 |      
 |      Returns
 |      -------
 |      mle_tuple : tuple of floats
 |          MLEs for any shape parameters (if applicable), followed by those
 |          for location and scale. For most random variables, shape statistics
 |          will be returned, but there are exceptions (e.g. ``norm``).
 |      
 |      Notes
 |      -----
 |      This fit is computed by maximizing a log-likelihood function, with
 |      penalty applied for samples outside of range of the distribution. The
 |      returned answer is not guaranteed to be the globally optimal MLE, it
 |      may only be locally optimal, or the optimization may fail altogether.
 |      
 |      
 |      Examples
 |      --------
 |      
 |      Generate some data to fit: draw random variates from the `beta`
 |      distribution
 |      
 |      >>> from scipy.stats import beta
 |      >>> a, b = 1., 2.
 |      >>> x = beta.rvs(a, b, size=1000)
 |      
 |      Now we can fit all four parameters (``a``, ``b``, ``loc`` and ``scale``):
 |      
 |      >>> a1, b1, loc1, scale1 = beta.fit(x)
 |      
 |      We can also use some prior knowledge about the dataset: let's keep
 |      ``loc`` and ``scale`` fixed:
 |      
 |      >>> a1, b1, loc1, scale1 = beta.fit(x, floc=0, fscale=1)
 |      >>> loc1, scale1
 |      (0, 1)
 |      
 |      We can also keep shape parameters fixed by using ``f``-keywords. To
 |      keep the zero-th shape parameter ``a`` equal 1, use ``f0=1`` or,
 |      equivalently, ``fa=1``:
 |      
 |      >>> a1, b1, loc1, scale1 = beta.fit(x, fa=1, floc=0, fscale=1)
 |      >>> a1
 |      1
 |      
 |      Not all distributions return estimates for the shape parameters.
 |      ``norm`` for example just returns estimates for location and scale:
 |      
 |      >>> from scipy.stats import norm
 |      >>> x = norm.rvs(a, b, size=1000, random_state=123)
 |      >>> loc1, scale1 = norm.fit(x)
 |      >>> loc1, scale1
 |      (0.92087172783841631, 2.0015750750324668)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from scipy.stats._distn_infrastructure.rv_continuous:
 |  
 |  __init__(self, momtype=1, a=None, b=None, xtol=1e-14, badvalue=None, name=None, longname=None, shapes=None, extradoc=None, seed=None)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  cdf(self, x, *args, **kwds)
 |      Cumulative distribution function of the given RV.
 |      
 |      Parameters
 |      ----------
 |      x : array_like
 |          quantiles
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      cdf : ndarray
 |          Cumulative distribution function evaluated at `x`
 |  
 |  expect(self, func=None, args=(), loc=0, scale=1, lb=None, ub=None, conditional=False, **kwds)
 |      Calculate expected value of a function with respect to the
 |      distribution.
 |      
 |      The expected value of a function ``f(x)`` with respect to a
 |      distribution ``dist`` is defined as::
 |      
 |                  ubound
 |          E[x] = Integral(f(x) * dist.pdf(x))
 |                  lbound
 |      
 |      Parameters
 |      ----------
 |      func : callable, optional
 |          Function for which integral is calculated. Takes only one argument.
 |          The default is the identity mapping f(x) = x.
 |      args : tuple, optional
 |          Shape parameters of the distribution.
 |      loc : float, optional
 |          Location parameter (default=0).
 |      scale : float, optional
 |          Scale parameter (default=1).
 |      lb, ub : scalar, optional
 |          Lower and upper bound for integration. Default is set to the
 |          support of the distribution.
 |      conditional : bool, optional
 |          If True, the integral is corrected by the conditional probability
 |          of the integration interval.  The return value is the expectation
 |          of the function, conditional on being in the given interval.
 |          Default is False.
 |      
 |      Additional keyword arguments are passed to the integration routine.
 |      
 |      Returns
 |      -------
 |      expect : float
 |          The calculated expected value.
 |      
 |      Notes
 |      -----
 |      The integration behavior of this function is inherited from
 |      `integrate.quad`.
 |  
 |  fit_loc_scale(self, data, *args)
 |      Estimate loc and scale parameters from data using 1st and 2nd moments.
 |      
 |      Parameters
 |      ----------
 |      data : array_like
 |          Data to fit.
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information).
 |      
 |      Returns
 |      -------
 |      Lhat : float
 |          Estimated location parameter for the data.
 |      Shat : float
 |          Estimated scale parameter for the data.
 |  
 |  isf(self, q, *args, **kwds)
 |      Inverse survival function (inverse of `sf`) at q of the given RV.
 |      
 |      Parameters
 |      ----------
 |      q : array_like
 |          upper tail probability
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      x : ndarray or scalar
 |          Quantile corresponding to the upper tail probability q.
 |  
 |  logcdf(self, x, *args, **kwds)
 |      Log of the cumulative distribution function at x of the given RV.
 |      
 |      Parameters
 |      ----------
 |      x : array_like
 |          quantiles
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      logcdf : array_like
 |          Log of the cumulative distribution function evaluated at x
 |  
 |  logpdf(self, x, *args, **kwds)
 |      Log of the probability density function at x of the given RV.
 |      
 |      This uses a more numerically accurate calculation if available.
 |      
 |      Parameters
 |      ----------
 |      x : array_like
 |          quantiles
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      logpdf : array_like
 |          Log of the probability density function evaluated at x
 |  
 |  logsf(self, x, *args, **kwds)
 |      Log of the survival function of the given RV.
 |      
 |      Returns the log of the "survival function," defined as (1 - `cdf`),
 |      evaluated at `x`.
 |      
 |      Parameters
 |      ----------
 |      x : array_like
 |          quantiles
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      logsf : ndarray
 |          Log of the survival function evaluated at `x`.
 |  
 |  nnlf(self, theta, x)
 |      Return negative loglikelihood function.
 |      
 |      Notes
 |      -----
 |      This is ``-sum(log pdf(x, theta), axis=0)`` where `theta` are the
 |      parameters (including loc and scale).
 |  
 |  pdf(self, x, *args, **kwds)
 |      Probability density function at x of the given RV.
 |      
 |      Parameters
 |      ----------
 |      x : array_like
 |          quantiles
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      pdf : ndarray
 |          Probability density function evaluated at x
 |  
 |  ppf(self, q, *args, **kwds)
 |      Percent point function (inverse of `cdf`) at q of the given RV.
 |      
 |      Parameters
 |      ----------
 |      q : array_like
 |          lower tail probability
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      x : array_like
 |          quantile corresponding to the lower tail probability q.
 |  
 |  sf(self, x, *args, **kwds)
 |      Survival function (1 - `cdf`) at x of the given RV.
 |      
 |      Parameters
 |      ----------
 |      x : array_like
 |          quantiles
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      sf : array_like
 |          Survival function evaluated at x
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from scipy.stats._distn_infrastructure.rv_generic:
 |  
 |  __call__(self, *args, **kwds)
 |      Freeze the distribution for the given arguments.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution.  Should include all
 |          the non-optional arguments, may include ``loc`` and ``scale``.
 |      
 |      Returns
 |      -------
 |      rv_frozen : rv_frozen instance
 |          The frozen distribution.
 |  
 |  __getstate__(self)
 |  
 |  __setstate__(self, state)
 |  
 |  entropy(self, *args, **kwds)
 |      Differential entropy of the RV.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information).
 |      loc : array_like, optional
 |          Location parameter (default=0).
 |      scale : array_like, optional  (continuous distributions only).
 |          Scale parameter (default=1).
 |      
 |      Notes
 |      -----
 |      Entropy is defined base `e`:
 |      
 |      >>> drv = rv_discrete(values=((0, 1), (0.5, 0.5)))
 |      >>> np.allclose(drv.entropy(), np.log(2.0))
 |      True
 |  
 |  freeze(self, *args, **kwds)
 |      Freeze the distribution for the given arguments.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution.  Should include all
 |          the non-optional arguments, may include ``loc`` and ``scale``.
 |      
 |      Returns
 |      -------
 |      rv_frozen : rv_frozen instance
 |          The frozen distribution.
 |  
 |  interval(self, alpha, *args, **kwds)
 |      Confidence interval with equal areas around the median.
 |      
 |      Parameters
 |      ----------
 |      alpha : array_like of float
 |          Probability that an rv will be drawn from the returned range.
 |          Each value should be in the range [0, 1].
 |      arg1, arg2, ... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information).
 |      loc : array_like, optional
 |          location parameter, Default is 0.
 |      scale : array_like, optional
 |          scale parameter, Default is 1.
 |      
 |      Returns
 |      -------
 |      a, b : ndarray of float
 |          end-points of range that contain ``100 * alpha %`` of the rv's
 |          possible values.
 |  
 |  mean(self, *args, **kwds)
 |      Mean of the distribution.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      mean : float
 |          the mean of the distribution
 |  
 |  median(self, *args, **kwds)
 |      Median of the distribution.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          Location parameter, Default is 0.
 |      scale : array_like, optional
 |          Scale parameter, Default is 1.
 |      
 |      Returns
 |      -------
 |      median : float
 |          The median of the distribution.
 |      
 |      See Also
 |      --------
 |      stats.distributions.rv_discrete.ppf
 |          Inverse of the CDF
 |  
 |  moment(self, n, *args, **kwds)
 |      n-th order non-central moment of distribution.
 |      
 |      Parameters
 |      ----------
 |      n : int, n >= 1
 |          Order of moment.
 |      arg1, arg2, arg3,... : float
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information).
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |  
 |  rvs(self, *args, **kwds)
 |      Random variates of given type.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information).
 |      loc : array_like, optional
 |          Location parameter (default=0).
 |      scale : array_like, optional
 |          Scale parameter (default=1).
 |      size : int or tuple of ints, optional
 |          Defining number of random variates (default is 1).
 |      random_state : None or int or ``np.random.RandomState`` instance, optional
 |          If int or RandomState, use it for drawing the random variates.
 |          If None, rely on ``self.random_state``.
 |          Default is None.
 |      
 |      Returns
 |      -------
 |      rvs : ndarray or scalar
 |          Random variates of given `size`.
 |  
 |  stats(self, *args, **kwds)
 |      Some statistics of the given RV.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional (continuous RVs only)
 |          scale parameter (default=1)
 |      moments : str, optional
 |          composed of letters ['mvsk'] defining which moments to compute:
 |          'm' = mean,
 |          'v' = variance,
 |          's' = (Fisher's) skew,
 |          'k' = (Fisher's) kurtosis.
 |          (default is 'mv')
 |      
 |      Returns
 |      -------
 |      stats : sequence
 |          of requested moments.
 |  
 |  std(self, *args, **kwds)
 |      Standard deviation of the distribution.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      std : float
 |          standard deviation of the distribution
 |  
 |  var(self, *args, **kwds)
 |      Variance of the distribution.
 |      
 |      Parameters
 |      ----------
 |      arg1, arg2, arg3,... : array_like
 |          The shape parameter(s) for the distribution (see docstring of the
 |          instance object for more information)
 |      loc : array_like, optional
 |          location parameter (default=0)
 |      scale : array_like, optional
 |          scale parameter (default=1)
 |      
 |      Returns
 |      -------
 |      var : float
 |          the variance of the distribution
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from scipy.stats._distn_infrastructure.rv_generic:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  random_state
 |      Get or set the RandomState object for generating random variates.
 |      
 |      This can be either None or an existing RandomState object.
 |      
 |      If None (or np.random), use the RandomState singleton used by np.random.
 |      If already a RandomState instance, use it.
 |      If an int, use a new RandomState instance seeded with seed.

In [15]:
special.gamma(1), special.gamma(2)
Out[15]:
(1.0, 1.0)
In [17]:
special.gamma(.5), sp.sqrt(sp.pi)
Out[17]:
(1.7724538509055159, 1.7724538509055159)
In [18]:
print(special.gamma(3. / 2))  # gamma(1+1/2)=1/2gamma(1/2)
0.5 * sp.sqrt(sp.pi)
0.886226925453
Out[18]:
0.88622692545275794
In [25]:
def gamma_pdf(mu, n, x):
    x = sp.array(x)
    return 1. / (special.gamma(n) * mu**n) * x**(n - 1) * sp.exp(-x / mu)

gamma_pdf(mu=1, n=1, x=[0,.5,1.])
Out[25]:
array([ 1.        ,  0.60653066,  0.36787944])
In [26]:
stats.gamma(a=1, scale=1).pdf([0, .5, 1])
Out[26]:
array([ 1.        ,  0.60653066,  0.36787944])
In [28]:
gamma_pdf(mu=10, n=3, x=[0,.5,1.]), stats.gamma(a=3, scale=10).pdf([0, .5, 1])
Out[28]:
(array([ 0.        ,  0.0001189 ,  0.00045242]),
 array([ 0.        ,  0.0001189 ,  0.00045242]))
In [36]:
%matplotlib inline

x = sp.linspace(0, 10, 200)
mu_s = sp.arange(0.5, 3+.5, .5)
n = 2
for mu in mu_s:
    plt.plot(x, stats.gamma(a=n, scale=mu).pdf(x), 
             label="$\mu={}, n={}$".format(mu, n))

plt.legend()
plt.grid()
plt.show()


n_s = sp.arange(0.5, 3+.5, .5)
mu = 2
for n in n_s:
    plt.plot(x, stats.gamma(a=n, scale=mu).pdf(x), 
             label="$\mu={}, n={}$".format(mu, n))

plt.legend()
plt.grid()
plt.show()

累積密度関数

In [37]:
help(stats.gamma.cdf)
Help on method cdf in module scipy.stats._distn_infrastructure:

cdf(x, *args, **kwds) method of scipy.stats._continuous_distns.gamma_gen instance
    Cumulative distribution function of the given RV.
    
    Parameters
    ----------
    x : array_like
        quantiles
    arg1, arg2, arg3,... : array_like
        The shape parameter(s) for the distribution (see docstring of the
        instance object for more information)
    loc : array_like, optional
        location parameter (default=0)
    scale : array_like, optional
        scale parameter (default=1)
    
    Returns
    -------
    cdf : ndarray
        Cumulative distribution function evaluated at `x`

In [41]:
%matplotlib inline

x = sp.linspace(0, 15, 200)
mu = 2
n = 2
plt.plot(x, stats.gamma(a=n, scale=mu).cdf(x), 
         label="$\mu={}, n={}$".format(mu, n))

plt.legend()
plt.grid()
plt.show()

期待値

参考:確率論 (Probability Theory) 第12週 ポアソン分布と指数分布とガンマ分布, http://stat.inf.uec.ac.jp

$$ E[X] = \int_0^{\infty} xf(x)dx \\ = \int_0^{\infty}x \frac{1}{\Gamma(n)\mu^n} x^{n-1} e^{-\frac{x}{\mu}}dx \\ = \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty}x^{n} e^{-\frac{x}{\mu}}dx $$

ここで,$\frac{x}{\mu} = u, \frac{du}{dx} = \frac{1}{\mu}$ に変数変換

\begin{eqnarray*} &=& \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty} (\mu u)^{n} e^{-u} \mu du\\ &=& \frac{\mu^{n+1}}{\Gamma(n)\mu^n} \int_0^{\infty} u^{n} e^{-u} du\\ &=& \frac{\mu^{n+1}}{\Gamma(n)\mu^n} \Gamma(n+1)\\ &=& \frac{\mu^{n+1}}{\mu^n} \frac{\Gamma(n+1)}{\Gamma(n)}\\ &=& \mu \frac{\Gamma(n+1)}{\Gamma(n)}\\ &=& \mu n\\ \end{eqnarray*}

分散

$$ E[X^2] = \int_0^{\infty} x^2 f(x)dx \\ = \int_0^{\infty}x^2 \frac{1}{\Gamma(n)\mu^n} x^{n-1} e^{-\frac{x}{\mu}}dx \\ = \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty}x^{n+1} e^{-\frac{x}{\mu}}dx $$

ここで,$\frac{x}{\mu} = u, \frac{du}{dx} = \frac{1}{\mu}$ に変数変換

\begin{eqnarray*} &=& \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty} (\mu u)^{n+1} e^{-u} \mu du\\ &=& \frac{\mu^{n+2}}{\Gamma(n)\mu^n} \int_0^{\infty} u^{n+1} e^{-u} du\\ &=& \frac{\mu^{n+2}}{\Gamma(n)\mu^n} \Gamma(n+2)\\ &=& \frac{\mu^{n+2}}{\mu^n} \frac{\Gamma(n+2)}{\Gamma(n)}\\ &=& \mu^2 (n+1)n\\ \end{eqnarray*}

よって,

$$ V[X] = E[X^2] - (E[X])^2\\ = \mu^2 (n+1)n - (\mu n)^2\\ = \mu^2 n $$

ガンマ分布のモーメント母関数

$$ M_{\Gamma{(\mu, n)}}(t) = E_{\Gamma(\mu, n)}[\exp{(tX)}]\\ = \int_0^{\infty}e^{tx}\frac{1}{\Gamma(n)\mu^n} x^{n-1} e^{-\frac{x}{\mu}}dx\\ = \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty}e^{tx} x^{n-1} e^{-\frac{x}{\mu}}dx\\ = \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty} x^{n-1} e^{-(\frac{1}{\mu}-t)x}dx $$

$u = (\frac{1}{\mu}-t)x \to x = \frac{1}{(\frac{1}{\mu}-t)}u$と変数変換する. $du = (\frac{1}{\mu}-t)dx$

$$ = \frac{1}{\Gamma(n)\mu^n} \int_0^{\infty} (\frac{u}{\frac{1}{\mu}-t})^{n-1} e^{-u} \frac{1}{\frac{1}{\mu}-t}du\\ = \frac{1}{\Gamma(n)\mu^n} (\frac{1}{\frac{1}{\mu}-t})^{n} \int_0^{\infty} u^{n-1} e^{-u}du\\ = \frac{1}{\Gamma(n)\mu^n} (\frac{1}{\frac{1}{\mu}-t})^{n} \Gamma(n)\\ = \frac{1}{\mu^n} (\frac{1}{\frac{1}{\mu}-t})^{n} \\ = \frac{1}{(1-\mu t)^{n}} \\ $$

ガンマ分布と指数分布の比較

  • 指数分布:期間$\mu$ごとに1回起こるまでの時間の分布
  • $X_1, X_2, \cdots , X_n$ が互いに独立に平均$\mu$ の指数分布に従うとき,$Y = X_1 + X_2 + \cdots + X_n$はパラメータ$\mu, n$のガンマ分布に従う.

指数分布のモーメント母関数

$$ f(x) = \frac{1}{\mu} e^{-\frac{x}{\mu}}\\ M_x(t) = E[e^{tx}] = \int_0^{\infty} e^{tx} \frac{1}{\mu} e^{-\frac{x}{\mu}} dx\\ = \frac{1}{\mu} \int_0^{\infty} e^{(t-\frac{1}{\mu})x} dx\\ $$
  • $t \geq \frac{1}{\mu}$は発散
  • $t < \frac{1}{\mu}$
\begin{eqnarray*} M_x(t) &=& \frac{1}{\mu} \int_0^{\infty} e^{-(\frac{1}{\mu}-t)x} dx\\ &=&\frac{1}{\mu}[-\frac{1}{\frac{1}{\mu}-t}e^{-(\frac{1}{\mu}-t)x}]_0^{\infty}\\ &=&\frac{1}{\mu}[0 + \frac{1}{\frac{1}{\mu}-t}]\\ &=&\frac{1}{\mu}\frac{1}{\frac{1}{\mu}-t}\\ &=&\frac{1}{1-\mu t}\\ \end{eqnarray*}

$X_1+X_2 + \cdots + X_n = Y_n, (X_i \sim Exp(\mu))$ のモーメント母関数は,

$$ M_{Y_n}(t) = E[e^{tY_n}] = E[e^{t \sum{X_i}}] = E[\Pi_i e^{t X_i}] = \Pi_i E[e^{t X_i}] = \frac{1}{(1-\mu t)^n} $$
In [62]:
n = 2
mu = 3
x = stats.expon(scale=mu).rvs(size=(10000, n))
Y = x.sum(axis=1)
plt.hist(Y, 100, normed=True,
         label="$X_1+X_2, X_i \sim Exp({})$".format(mu))

x = sp.linspace(0, 40, 100)
plt.plot(x, stats.gamma(a=n, scale=mu).pdf(x),
        label="$Ga({}, {})$".format(mu, n))

plt.legend()
plt.grid()
plt.show()

他分布との関係

分類 1回起こる(形状パラメータ$k=1$) n回起こる(形状パラメータ$k=n>1$)
離散 幾何分布 負の二項分布
連続 指数分布 ガンマ分布

$\chi^2$分布

  • 尺度母数$\mu=2, 形状母数n=\frac{n}{2}$ のとき,自由度$n$の$\chi^2$分布に従う.

確率密度関数を求めてみると,

$$ f(x;\mu=2, n=\frac{n}{2}) = \frac{x^{\frac{n}{2}-1}e^{-\frac{x}{2}}}{\Gamma(\frac{n}{2}) 2^{\frac{n}{2}}} $$

これは$\chi^2$分布の確率密度関数と一致する.

In [20]:
# 自由度n=3のchi^2分布
n = 3
x = sp.linspace(0, 20, 100)
chi2 = stats.chi2(df=n)
plt.plot(x, chi2.pdf(x), label="$\chi^2(n={})$".format(n))

# Ga(mu=2, n=3/2)
mu = 2
rvs = stats.gamma(a=n / 2., scale=mu).rvs(size=10000)
plt.hist(rvs, 100, normed=True, label="$Ga(\mu={}, n={})$".format(mu, n/2.))

plt.legend()
plt.grid()
plt.show()
  • ベイズ統計では

ガンマ分布は,正規分布の分散パラメータ$\sigma^2$の逆数である精度の共役事前分布になる.

よって,

$$ \text{事後分布:ガンマ分布}=\text{尤度関数:正規分布}\cdot \text{事前分布:ガンマ分布} $$

参考資料