Option 1: $x \in \{1,2,\ldots,K\}$.
Option 2: $x=(x_1,\ldots,x_K)^T$ with binary selection variables
where $m_k= \sum_n x_{nk}$ counts the total number of observations for which we 'threw' $k$ eyes. Note that $\sum_k m_k = N$.
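The relation between the one-of-$K$ coding and the counts $m_k$ can be sketched as follows (Python used for illustration; the observations are made up):

```python
import numpy as np

# Option 1 coding: each observation is an index in {1, ..., K}
K = 6                                # e.g., a six-sided die
draws = np.array([3, 1, 6, 3, 2])    # made-up observations, N = 5

# Option 2 coding: each observation is a binary selection vector x_n
# with x_nk = 1 iff the n-th draw showed k "eyes"
X = np.eye(K, dtype=int)[draws - 1]  # shape (N, K), one-hot rows

# m_k = sum_n x_nk counts how often outcome k occurred
m = X.sum(axis=0)

print(m)        # [1 1 2 0 0 1]
print(m.sum())  # 5, equals N
```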
we used a beta distribution, which is conjugate to the binomial and requires us to choose prior pseudo-counts.
where $\Gamma(\cdot)$ is the Gamma function, which can be interpreted as a generalization of the factorial (e.g., $3! = 3\cdot 2 \cdot 1$) to the real numbers: for positive integers, $\Gamma(n) = (n-1)!$.
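A quick numerical check of the relation $\Gamma(n) = (n-1)!$, using Python's standard library for illustration:

```python
import math

# The Gamma function generalizes the factorial: Gamma(n) = (n-1)!
print(math.gamma(4))      # 6.0, i.e. 3! = 3*2*1
print(math.factorial(3))  # 6

# ... and it is also defined for non-integer real arguments
print(math.gamma(0.5))    # ~1.7724538509, equals sqrt(pi)
```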
where $m = (m_1,m_2,\ldots,m_K)^T$.
(The mean of the Dirichlet distribution is $\mathrm{E}[\mu_k] = \alpha_k / \sum_{k'} \alpha_{k'}$; see also the Wikipedia entry on the Dirichlet distribution.)
This result is simply a generalization of Laplace's rule of succession.
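As a numerical illustration (Python, with made-up counts and a uniform prior $\alpha_k = 1$, assuming the predictive takes the standard Dirichlet–multinomial form $(m_k + \alpha_k)/(N + \sum_k \alpha_k)$): a uniform prior reduces the prediction to add-one (Laplace) smoothing $(m_k+1)/(N+K)$.

```python
K = 6                   # six possible outcomes
m = [1, 1, 2, 0, 0, 1]  # made-up counts, N = 5
N = sum(m)
alpha = [1.0] * K       # uniform prior pseudo-counts

# Posterior predictive: p(next outcome = k | D) = (m_k + alpha_k) / (N + sum_k alpha_k)
pred = [(mk + ak) / (N + sum(alpha)) for mk, ak in zip(m, alpha)]

print(pred)       # with alpha_k = 1 this is (m_k + 1) / (N + K)
print(sum(pred))  # sums to 1 (up to rounding)
```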
where we used the fact that the mode (maximum) of the Dirichlet distribution $\mathrm{Dir}(\{\alpha_1,\ldots,\alpha_K\})$ is obtained at $\mu_k = (\alpha_k-1)\big/\left(\sum_{k'}\alpha_{k'} - K\right)$ (for $\alpha_k > 1$).
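The mode formula can be verified numerically: the Dirichlet density evaluated at the claimed mode should exceed the density at any other point on the simplex. A self-contained Python sketch (the $\alpha$ values are arbitrary):

```python
import math
import random

alpha = [3.0, 4.0, 5.0]  # arbitrary pseudo-counts, all alpha_k > 1
K = len(alpha)
A = sum(alpha)

def dirichlet_logpdf(mu, alpha):
    """Log-density of Dir(alpha) evaluated at a point mu on the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1) * math.log(m) for a, m in zip(alpha, mu))

# Claimed mode: mu_k = (alpha_k - 1) / (sum_j alpha_j - K)
mode = [(a - 1) / (A - K) for a in alpha]  # [2/9, 3/9, 4/9]

# The density at the mode should dominate random points on the simplex
random.seed(0)
for _ in range(100):
    g = [random.gammavariate(a, 1.0) for a in alpha]
    mu = [gi / sum(g) for gi in g]  # a random point on the simplex
    assert dirichlet_logpdf(mode, alpha) >= dirichlet_logpdf(mu, alpha)
```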
Of course, we do not need the full Bayesian framework to obtain the maximum likelihood estimate; we can also maximize the likelihood directly.
The log-likelihood for the multinomial distribution is given by
where we get $\lambda$ from the constraint $$\begin{equation*} \sum_k \hat \mu_k = \sum_k \frac{m_k}{\lambda} = \frac{N}{\lambda} \overset{!}{=} 1\,, \end{equation*}$$ which yields $\lambda = N$.
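A numerical check (Python, with made-up counts) that the resulting relative frequencies $\hat\mu_k = m_k/N$ indeed maximize the multinomial log-likelihood over the simplex:

```python
import math
import random

m = [1, 3, 2, 4]  # made-up counts m_k
N = sum(m)

def loglik(mu, m):
    """Multinomial log-likelihood in mu (up to the constant log N!/prod m_k!)."""
    return sum(mk * math.log(muk) for mk, muk in zip(m, mu))

# Candidate maximizer from the Lagrangian: mu_k = m_k / N
mu_hat = [mk / N for mk in m]

# Compare against random points on the simplex
random.seed(1)
for _ in range(100):
    g = [random.gammavariate(1.0, 1.0) for _ in m]
    mu = [gi / sum(g) for gi in g]
    assert loglik(mu_hat, m) >= loglik(mu, m)
```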
We are given $N$ IID observations $D=\{x_1,\dotsc,x_N\}$.
open("../../styles/aipstyle.html") do f
    # Apply the course's HTML style sheet to this notebook
    display("text/html", read(f, String))
end