Suppose $X$ and $Y$ are two random variables with joint probability distribution $\Pr(X,Y)$. Then the conditional probability of $X$ given $Y$ is given by the definition of conditional probability as
$$\Pr(X|Y) = \frac{\Pr(X,Y)}{\Pr(Y)} \tag{1}$$where $\Pr(Y)$ is the probability distribution of $Y$.
Similarly, the conditional probability of $Y$ given $X$ is
$$\Pr(Y|X)=\frac{\Pr(X,Y)}{\Pr(X)},$$which upon rearranging gives
$$\Pr(X,Y)=\Pr(Y|X)\Pr(X). \tag{2}$$Then, substituting (2) in (1) gives
$$\begin{aligned} \Pr(X|Y) &= \frac{\Pr(X,Y)}{\Pr(Y)}\\ &= \frac{\Pr(Y|X)\Pr(X)}{\Pr(Y)}, \end{aligned}$$which is Bayes' theorem, the form of the formula that is used for inference of $X$ given $Y$.
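As a quick numerical check of the substitution above, the following is a minimal sketch that compares the two expressions for $\Pr(X|Y)$ on a small joint distribution. The probabilities in `joint` and all function names are hypothetical, chosen only for illustration; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution Pr(X, Y) over two binary variables.
# Keys are (x, y) pairs; the values are illustrative and sum to 1.
joint = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10),
    (1, 1): Fraction(4, 10),
}

def marginal_x(x):
    """Pr(X = x), obtained by summing the joint over all values of Y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    """Pr(Y = y), obtained by summing the joint over all values of X."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_x_given_y(x, y):
    """Equation (1): Pr(X | Y) = Pr(X, Y) / Pr(Y)."""
    return joint[(x, y)] / marginal_y(y)

def cond_y_given_x(y, x):
    """Pr(Y | X) = Pr(X, Y) / Pr(X)."""
    return joint[(x, y)] / marginal_x(x)

def bayes_x_given_y(x, y):
    """Pr(X | Y) = Pr(Y | X) Pr(X) / Pr(Y), after substituting (2) in (1)."""
    return cond_y_given_x(y, x) * marginal_x(x) / marginal_y(y)

# Both expressions agree for every (x, y) pair.
for (x, y) in joint:
    assert cond_x_given_y(x, y) == bayes_x_given_y(x, y)
```

The agreement is exact here because the second form is only an algebraic rearrangement of the first.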
Here we consider a simple example to justify the formula (1). The following table gives the joint distribution of smoking and lung cancer in a hypothetical population of 1,000,000 individuals.
$$ \begin{array}{c|lc|r} & \text{Smoking} \\ \hline \text{Cancer} & \text{Yes} & \text{No} & \text{} \\ \hline \text{Yes} & 42,500 & 7,500 & 50,000 \\ \text{No} & 207,500 & 742,500 & 950,000 \\ \hline & 250,000 & 750,000 & 1,000,000 \end{array} $$
Suppose we sample individuals with replacement from the 250,000 smokers and compute the relative frequency of lung cancer in the sample.
By the law of large numbers, as the sample size goes to infinity, this relative frequency will approach $\frac{42,500}{250,000}=0.17$.
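The sampling experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `relative_frequency`, the seed, and the sample sizes are hypothetical choices, and smokers are represented simply as the indices 0 to 249,999, with the first 42,500 indices standing for those with lung cancer.

```python
import random

SMOKERS = 250_000             # column total for smokers in the table
SMOKERS_WITH_CANCER = 42_500  # smokers who have lung cancer

def relative_frequency(sample_size, seed=0):
    """Relative frequency of lung cancer among smokers sampled with replacement."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sample_size):
        # Draw one smoker uniformly at random (with replacement).
        # Indices below SMOKERS_WITH_CANCER stand for the cancer cases.
        if rng.randrange(SMOKERS) < SMOKERS_WITH_CANCER:
            hits += 1
    return hits / sample_size

# The estimate should settle near 42,500 / 250,000 = 0.17 as the sample grows.
for n in (100, 10_000, 200_000):
    print(n, relative_frequency(n))
```

For small samples the estimate can be noticeably off; the fluctuations shrink as the sample size increases.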
This ratio can also be written as $$\frac{42,500/1,000,000}{250,000/1,000,000}=0.17.$$
The ratio in the numerator is the joint probability of smoking and lung cancer, and the ratio in the denominator is the marginal probability of smoking.
So, with $X$ denoting lung cancer and $Y$ denoting smoking, $$\Pr(X|Y) = \frac{\Pr(X,Y)}{\Pr(Y)}.$$
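The same arithmetic can be reproduced directly from the table. The sketch below uses the counts given above; the variable names are illustrative, and the last step, which inverts the conditional to obtain the probability of smoking given lung cancer, is an added illustration of the Bayes form derived earlier rather than something computed in the text.

```python
from fractions import Fraction

# Counts from the table, keyed by (cancer, smoking).
counts = {
    ("yes", "yes"): 42_500,
    ("yes", "no"): 7_500,
    ("no", "yes"): 207_500,
    ("no", "no"): 742_500,
}
total = sum(counts.values())  # 1,000,000 individuals

# Joint and marginal probabilities as exact fractions.
pr_cancer_and_smoking = Fraction(counts[("yes", "yes")], total)
pr_smoking = Fraction(counts[("yes", "yes")] + counts[("no", "yes")], total)
pr_cancer = Fraction(counts[("yes", "yes")] + counts[("yes", "no")], total)

# Formula (1): Pr(cancer | smoking) = Pr(cancer, smoking) / Pr(smoking).
pr_cancer_given_smoking = pr_cancer_and_smoking / pr_smoking
print(float(pr_cancer_given_smoking))  # 0.17

# Bayes form: Pr(smoking | cancer) = Pr(cancer | smoking) Pr(smoking) / Pr(cancer).
pr_smoking_given_cancer = pr_cancer_given_smoking * pr_smoking / pr_cancer

# Cross-check against the direct ratio of counts in the "Cancer: Yes" row.
assert pr_smoking_given_cancer == Fraction(42_500, 50_000)
print(float(pr_smoking_given_cancer))  # 0.85
```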