#!/usr/bin/env python
# coding: utf-8

# ### The binomial distribution is a specific type of discrete probability distribution.

# Let's see an example question first, and then learn about the binomial distribution.

# Example 1: Two players are playing basketball, player A and player B. Player A takes an average of 11 shots per game, and 
# has an average success rate of 72%. Player B takes an average of 15 shots per game, but has an average success rate of 48%.
# 
# Question 1: What's the probability that Player A makes exactly 6 shots in an average game?
# 
# Question 2: What's the probability that Player B makes exactly 6 shots in an average game?

# We can classify this as a binomial experiment if the following conditions are met:
#     
#     1.) The process consists of a sequence of n trials.
#     2.) Only two mutually exclusive outcomes are possible for each trial (a success or a failure).
#     3.) The probability of a success, 'p', is the same for every trial, so the probability of a failure is q=1-p.
#     4.) The trials are independent.
#     
# 
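
# A binomial experiment like this can also be simulated directly. The cell below is only an illustrative sketch:
# it assumes NumPy's `RandomState.binomial` sampler, an arbitrary seed, and hypothetical names (sim_n, sim_p,
# sim_games, sim_makes) that are not part of the original example. Each simulated "game" is a sequence of
# independent trials with the same success probability, and the fraction of games with exactly 6 successes
# should land close to the exact answer to Question 1.

# In[ ]:


import numpy as np

sim_n = 11          # trials per game (Player A's average number of shots)
sim_p = 0.72        # success probability for every trial
sim_games = 200000  # number of simulated games (arbitrary choice for this sketch)

# Fixed seed (an assumption) so that reruns give the same draws
sim_rng = np.random.RandomState(0)

# Each entry is the number of successes in one game of sim_n independent trials
sim_makes = sim_rng.binomial(sim_n, sim_p, size=sim_games)

# Empirical estimate of Pr(X = 6): the share of simulated games with exactly 6 makes
sim_freq_6 = np.mean(sim_makes == 6)
print('Simulated Pr(X=6) for Player A: %1.3f' % sim_freq_6)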

# The formula for a Binomial Distribution Probability Mass Function turns out to be:    

# $$Pr(X=k)=C(n,k)p^k (1-p)^{n-k}$$

# Where n = number of trials, k = number of successes, p = probability of success, and 1-p = probability of failure (often written as q=1-p).

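# This means that to get exactly 'k' successes in 'n' trials, we need 'k' successes, which has probability $$p^k$$ 
# and 'n-k' failures, which has probability $$(1-p)^{n-k}$$
# Because the trials are independent, any one particular sequence with 'k' successes and 'n-k' failures has probability $$p^k (1-p)^{n-k}$$
# Finally, there are $$C(n,k)$$ different ways of arranging 'k' successes among 'n' trials,
# so we multiply these together to get the probability of exactly that many successes and failures in those n trials!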
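
# The formula translates almost term for term into code. The cell below is a minimal sketch rather than a
# reference implementation: `binomial_pmf` and `pmf_total` are hypothetical names introduced here, and the
# values n=10, p=0.3 are arbitrary. It assumes `scipy.special.comb` for C(n,k).

# In[ ]:


from scipy.special import comb


def binomial_pmf(num_successes, num_trials, prob_success):
    """Pr(X = num_successes) for independent trials with a fixed success probability."""
    # C(n,k): number of ways to arrange the successes among the trials
    ways = comb(num_trials, num_successes)
    # p^k (1-p)^(n-k): probability of any one particular arrangement
    one_arrangement = (prob_success ** num_successes) * ((1 - prob_success) ** (num_trials - num_successes))
    return ways * one_arrangement


# Summed over every possible number of successes, the probabilities should total 1
pmf_total = sum(binomial_pmf(j, 10, 0.3) for j in range(10 + 1))
print('Sum of the PMF over 0..10 successes: %1.6f' % pmf_total)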

# --------------------------------------------------------------------------------------------------------------------------------

# Quick note: C(n,k) refers to the number of possible combinations of n things taken k at a time.
# 
# This is also equal to: $$C(n,k) =  \frac{n!}{k!(n-k)!}$$
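
# As a quick check of the factorial formula, the cell below compares it against `scipy.special.comb` for the
# two cases used in the example. This is only a sketch; `n_choose_k`, `check_11_6` and `check_15_6` are
# hypothetical names introduced for illustration.

# In[ ]:


from math import factorial
from scipy.special import comb


def n_choose_k(n_items, k_items):
    """Number of ways to choose k_items out of n_items, via n! / (k! (n-k)!)."""
    return factorial(n_items) // (factorial(k_items) * factorial(n_items - k_items))


# exact=True asks SciPy for an exact integer instead of a float
check_11_6 = (n_choose_k(11, 6), comb(11, 6, exact=True))
check_15_6 = (n_choose_k(15, 6), comb(15, 6, exact=True))

print('C(11,6): %d (factorial formula) vs %d (scipy)' % check_11_6)
print('C(15,6): %d (factorial formula) vs %d (scipy)' % check_15_6)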

# ### So let's try out the example problem!

# In[67]:


# Set up player A

# Probability of success for A
p_A = .72
# Number of shots for A
n_A = 11

# Make 6 shots
k = 6

# Now import comb from scipy.special (it is no longer available in scipy.misc in recent SciPy releases)
import scipy.special as sc

# Set up C(n,k)
comb_A = sc.comb(n_A,k)

# Now put it together to get the probability!
answer_A = comb_A * (p_A**k) * ((1-p_A)**(n_A-k))

# Put the answer in percentage form!
answer_A = 100*answer_A


# Quickly repeat all steps for Player B
p_B = .48
n_B = 15
comb_B = sc.comb(n_B,k)
answer_B = 100 * comb_B * (p_B**k) * ((1-p_B)**(n_B-k))


#Print Answers
print(' The probability of player A making 6 shots in an average game is %1.1f%% ' % answer_A)
print(' \n')
print(' The probability of player B making 6 shots in an average game is %1.1f%% ' % answer_B)


# So now we know that even though player B is technically a worse shooter, because she takes more shots she has a higher chance of making exactly 6 shots in an average game!

# But wait a minute... what about a higher number of made shots? Will player A's higher success rate have a stronger effect then?
# What's the probability of making 9 shots a game for either player?

# In[31]:


#Let's find out!

#Set number of shots
k = 9

#Set new combinations
comb_A = sc.comb(n_A,k)
comb_B = sc.comb(n_B,k)

# Everything else remains the same
answer_A = 100 * comb_A * (p_A**k) * ((1-p_A)**(n_A-k))
answer_B = 100 * comb_B * (p_B**k) * ((1-p_B)**(n_B-k))

#Print Answers
print(' \n')
print(' The probability of player A making 9 shots in an average game is %1.1f%% ' % answer_A)
print('\n')
print(' The probability of player B making 9 shots in an average game is %1.1f%% ' % answer_B)
print('\n')


# Now we see that player A's higher success rate gives better odds of making exactly 9 shots. We need to keep in mind that we are asking
# about the probability of making *exactly* that number of shots. This is a different question than "What's the probability that player A makes *at least* 9 shots?".
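
# To answer the "at least" version of the question, the exact probabilities for 9, 10, ... up to n made shots have
# to be added together. The cell below is a hedged sketch using `scipy.stats.binom`: its survival function
# `sf(k, n, p)` gives Pr(X > k), so Pr(X >= 9) is `sf(8, n, p)`. The names tail_k, atleast_A, atleast_B and
# atleast_A_sum are hypothetical; the shot counts and success rates are the ones from the example.

# In[ ]:


from scipy.stats import binom

tail_k = 9  # "at least this many" made shots

# Pr(X >= 9) = Pr(X > 8)
atleast_A = binom.sf(tail_k - 1, 11, 0.72)
atleast_B = binom.sf(tail_k - 1, 15, 0.48)

# Same quantity for player A by summing the PMF term by term, as a cross-check
atleast_A_sum = sum(binom.pmf(j, 11, 0.72) for j in range(tail_k, 11 + 1))

print('Probability that player A makes at least 9 shots: %1.1f%%' % (100 * atleast_A))
print('Probability that player B makes at least 9 shots: %1.1f%%' % (100 * atleast_B))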

# #### Now let's investigate the mean and standard deviation for the binomial distribution

# The mean of a binomial distribution is simply: $$\mu=n*p$$

# This makes intuitive sense: the average number of successes should be the total number of trials multiplied by the success rate per trial.

# Similarly we can see that the standard deviation of a binomial is: $$\sigma=\sqrt{n*q*p}$$
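
# Both formulas can be checked numerically from the definitions of the mean and variance of a discrete
# distribution. The cell below is a sketch under stated assumptions: it uses `scipy.stats.binom.pmf` for the
# probabilities, Player A's numbers from the example, and hypothetical chk_* names introduced only here.

# In[ ]:


import numpy as np
from scipy.stats import binom

chk_n, chk_p = 11, 0.72
chk_k = np.arange(chk_n + 1)              # possible numbers of successes: 0..n
chk_pmf = binom.pmf(chk_k, chk_n, chk_p)  # Pr(X = k) for each k

# Mean: sum over k of k * Pr(X = k)
chk_mean = np.sum(chk_k * chk_pmf)
# Standard deviation: square root of the sum over k of (k - mean)^2 * Pr(X = k)
chk_sd = np.sqrt(np.sum((chk_k - chk_mean) ** 2 * chk_pmf))

print('Mean from the PMF: %1.4f, n*p: %1.4f' % (chk_mean, chk_n * chk_p))
print('Std from the PMF:  %1.4f, sqrt(n*q*p): %1.4f' % (chk_sd, (chk_n * (1 - chk_p) * chk_p) ** 0.5))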

# So now we can ask: what's the average number of shots each player will make in a game, +/- one standard deviation?

# In[43]:


# Let's go ahead and plug in to the formulas.

# Get the mean
mu_A = n_A *p_A
mu_B = n_B *p_B

#Get the standard deviation
sigma_A = ( n_A *p_A*(1-p_A) )**0.5
sigma_B = ( n_B *p_B*(1-p_B) )**0.5

# Now print results
print('\n')
print('Player A will make an average of %1.0f +/- %1.0f shots per game' % (mu_A, sigma_A))
print('\n')
print('Player B will make an average of %1.0f +/- %1.0f shots per game' % (mu_B, sigma_B))
print('\n')
print("NOTE: It's impossible to make a fraction of a shot, so '%1.0f' was used to round the floats to whole numbers!")


# #### Let's see how to automatically make a binomial distribution.

# In[45]:


from scipy.stats import binom

# We can get stats: Mean('m'), variance('v'), skew('s'), and/or kurtosis('k')
# With no 'moments' argument, the default is mean and variance ('mv')
mean,var= binom.stats(n_A,p_A)

print(mean)
print(var**0.5)


# Looks like it matches up with our manual methods. Note: we did not round in this case.
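
# The other statistics are selected through the `moments` argument of `binom.stats`. The cell below is a small
# sketch requesting all four for Player A; the stat_* names are hypothetical, and the kurtosis SciPy returns
# here is the excess (Fisher) kurtosis.

# In[ ]:


from scipy.stats import binom

# 'mvsk' = mean, variance, skew, kurtosis, returned in that order
stat_mean_A, stat_var_A, stat_skew_A, stat_kurt_A = binom.stats(11, 0.72, moments='mvsk')

print('Mean:     %1.4f' % float(stat_mean_A))
print('Variance: %1.4f' % float(stat_var_A))
print('Skew:     %1.4f' % float(stat_skew_A))
print('Kurtosis: %1.4f' % float(stat_kurt_A))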

# #### We can also get the probability mass function:

# Let's try another example to see the full PMF (Probability Mass Function) and plot it.
# 
# Imagine you flip a fair coin. Your probability of getting a heads is p=0.5 (success in this example).
# 
# So what does your probability mass function look like for 10 coin flips?

# In[71]:


import numpy as np

# Set up a new example, let's say n= 10 coin flips and p=0.5 for a fair coin.
n=10
p=0.5

# Set up the possible numbers of successes: 0 through n inclusive, which is n+1 values
x = np.arange(n+1)

# Now create the probability mass function
Y = binom.pmf(x,n,p)

#Show
print(Y)
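
# Two quick sanity checks are worth running on a PMF like this one: the probabilities should add up to 1, and
# for a fair coin the distribution should be symmetric with its peak at n/2 heads. The cell below is a sketch
# that recomputes the PMF under hypothetical coin_* names so that it stands on its own.

# In[ ]:


import numpy as np
from scipy.stats import binom

coin_n, coin_p = 10, 0.5
coin_x = np.arange(coin_n + 1)                # 0..10 heads
coin_pmf = binom.pmf(coin_x, coin_n, coin_p)  # Pr(X = number of heads)

coin_total = coin_pmf.sum()                             # should be 1
coin_mode = coin_x[np.argmax(coin_pmf)]                 # most likely number of heads
coin_symmetric = np.allclose(coin_pmf, coin_pmf[::-1])  # Pr(X = k) equals Pr(X = n - k)

print('Total probability: %1.6f' % coin_total)
print('Most likely number of heads: %d' % coin_mode)
print('Symmetric about n/2: %s' % coin_symmetric)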

# Next we'll visualize the pmf by plotting it.


# #### Finally, let's plot the binomial distribution to get the full picture.

# In[77]:


import matplotlib.pyplot as plt

# For simple plots, matplotlib is fine, seaborn is unnecessary.

# Now simply use plot
plt.plot(x,Y,'o')

#Title (use y=1.08 to raise the long title a little more above the plot)
plt.title('Binomial Distribution PMF: 10 Coin Flips, Probability of Heads is p=0.5',y=1.08)

#Axis Titles
plt.xlabel('Number of Heads')
plt.ylabel('Probability')

#Display the figure (needed when running as a script rather than in a notebook)
plt.show()


# That's it for the review on Binomial Distributions. More info can be found at the following sources:
# 
# 1.) http://en.wikipedia.org/wiki/Binomial_distribution
# 
# 2.) http://stattrek.com/probability-distributions/binomial.aspx
# 
# 3.) http://mathworld.wolfram.com/BinomialDistribution.html

# Thanks!
