Given k people, what is the probability of at least 2 people having the same birthday?
First, we need to define the problem:
k≤1 is meaningless, so we will not consider those cases.
Now consider the case where you have more people than there are days in a year. In such a case,
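$$P(\text{at least one match}) = 1 \quad \text{for } k > 365$$

Note that the inequality is strict: with exactly 365 people it is still possible, though wildly unlikely, for everyone to have a different birthday.

Now think about the event of no matches when $k \le 365$. We can compute this probability using the naïve definition of probability:

$$P(\text{no match}) = \frac{365 \times 364 \times \cdots \times (365 - k + 1)}{365^k}$$

Now the event of at least one match is the complement of no matches, so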
$$P(\text{at least one match}) = 1 - P(\text{no match}) = 1 - \frac{365 \times 364 \times \cdots \times (365 - k + 1)}{365^k}$$

```python
import numpy as np

DAYS_IN_YEAR = 365


def bday_prob(k):
    """Probability that at least 2 of k people share a birthday."""

    def no_match(k):
        # Numerator: 365 * 364 * ... * (365 - k + 1), one factor per person.
        days = [(DAYS_IN_YEAR - x) for x in range(k)]
        num = np.multiply.reduce(days, dtype=np.float64)
        # Denominator: all 365^k equally likely birthday assignments.
        return num / DAYS_IN_YEAR**k

    return 1.0 - no_match(k)


print("With k=23 people, the probability of a match is {:.6f}, already exceeding 0.5.".format(bday_prob(23)))
print("With k=50 people, the probability of a match is {:.6f}, which is approaching 1.0.".format(bday_prob(50)))
```

```
With k=23 people, the probability of a match is 0.507297, already exceeding 0.5.
With k=50 people, the probability of a match is 0.970374, which is approaching 1.0.
```
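The computation above multiplies many large numbers before dividing, which can overflow for large k. As a minimal sketch (the names `bday_prob_stable` and `smallest_group` are illustrative, not from the lecture), the same probability can be built up as a product of ratios, each at most 1, and then used to find the smallest group size whose match probability exceeds one half.

```python
import math

DAYS_IN_YEAR = 365


def bday_prob_stable(k, days=DAYS_IN_YEAR):
    """P(at least one shared birthday) among k people, as 1 - prod((days - i) / days)."""
    if k > days:
        return 1.0  # more people than days forces a match
    # Each factor is at most 1, so the running product cannot overflow.
    no_match = math.prod((days - i) / days for i in range(k))
    return 1.0 - no_match


def smallest_group(threshold=0.5):
    """Smallest k whose match probability exceeds the threshold (hypothetical helper)."""
    k = 2
    while bday_prob_stable(k) <= threshold:
        k += 1
    return k


print(round(bday_prob_stable(23), 6))  # 0.507297
print(smallest_group())                # 23
print(bday_prob_stable(400))           # 1.0
```

This assumes Python 3.8 or later for `math.prod`, and the same 365 equally likely birthdays as the formula above.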
Let's derive some properties using nothing but the two axioms stated earlier.
The probability of an event A is 1 minus the probability of that event's inverse (or complement).
$$P(A^c) = 1 - P(A)$$

$$\because \quad 1 = P(S) = P(A \cup A^c) = P(A) + P(A^c), \quad \text{since } A \cap A^c = \emptyset \quad \blacksquare$$
If A is contained within B, then the probability of A must be less than or equal to that for B.
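$$\text{If } A \subseteq B, \text{ then } P(A) \le P(B)$$

$$\begin{aligned}
\because \quad B &= A \cup (B \cap A^c) \\
P(B) &= P(A) + P(B \cap A^c) \\
\implies P(B) &\ge P(A), \quad \text{since } P(B \cap A^c) \ge 0 \quad \blacksquare
\end{aligned}$$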
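The probability of a union of 2 events A and B is the sum of their probabilities minus the probability of their intersection.

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

$$\begin{aligned}
\text{since} \quad P(A \cup B) &= P(A \cup (B \cap A^c)) = P(A) + P(B \cap A^c) \\
\text{but note that} \quad P(B) &= P(B \cap A) + P(B \cap A^c) \\
\text{and since} \quad P(B) - P(A \cap B) &= P(B \cap A^c) \\
\implies P(A \cup B) &= P(A) + P(B) - P(A \cap B) \quad \blacksquare
\end{aligned}$$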
This is the simplest case of the principle of inclusion/exclusion.
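To make the three properties concrete, here is a small sanity check on a finite sample space. It is only an illustration under assumed choices: a single fair die, the naïve definition of probability, and hypothetical events `A`, `B`, and `C` picked for the example.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # outcomes of one fair six-sided die (assumed example)


def P(event):
    """Naive probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & S), len(S))


A = frozenset({2, 4, 6})     # even roll
B = frozenset({4, 5, 6})     # roll of at least 4
C = frozenset({2, 4, 5, 6})  # a superset of A

# Complement: P(A^c) = 1 - P(A)
assert P(S - A) == 1 - P(A)
# Containment: A is a subset of C, so P(A) <= P(C)
assert A <= C and P(A) <= P(C)
# Union: P(A or B) = P(A) + P(B) - P(A and B)
assert P(A | B) == P(A) + P(B) - P(A & B)

print(P(A | B))  # 2/3
```

Using `Fraction` keeps the arithmetic exact, so the equalities can be tested with `==` rather than a floating-point tolerance.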
Considering the 3-event case, we have:
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(A \cap C) + P(A \cap B \cap C)$$

...where we sum up the probabilities of the individual events, then subtract each of the pairwise intersections, and finally add back the 3-event intersection, since it was added three times in the first step and subtracted three times in the second.
For the general case, we have:
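$$P(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_{j=1}^{n} P(A_j) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P(A_1 \cap A_2 \cap \cdots \cap A_n)$$

...where we alternate between adding and subtracting: add the single events, subtract the pairwise intersections, add the triple intersections, and so on up to the intersection of all n events.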
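The general inclusion/exclusion formula can be checked numerically on a small example. The sketch below assumes equally likely outcomes; the function name `union_prob` and the choice of events (multiples of 2, 3, and 5 among the integers 1 through 30) are illustrative only.

```python
from fractions import Fraction
from itertools import combinations


def union_prob(events, sample_space):
    """P(A1 or ... or An) by inclusion/exclusion, assuming equally likely outcomes."""
    total = Fraction(0)
    for size in range(1, len(events) + 1):
        sign = (-1) ** (size - 1)  # add odd-sized intersections, subtract even-sized
        for group in combinations(events, size):
            common = frozenset.intersection(*group)
            total += sign * Fraction(len(common), len(sample_space))
    return total


S = frozenset(range(1, 31))
events = [frozenset(x for x in S if x % d == 0) for d in (2, 3, 5)]

# Direct count of the union, for comparison.
direct = Fraction(len(frozenset.union(*events)), len(S))

print(union_prob(events, S), direct)  # 11/15 11/15
```

The loop visits every non-empty subset of the events, so it is only practical for a handful of events; it is meant to mirror the formula, not to be efficient.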
Here is another problem that comes from gambling. Say we have a deck of n cards, labeled 1 through n. The deck is thoroughly shuffled, and the cards are then flipped over one at a time. You win if, for some k, the card labeled k is the kth card flipped.
What is the probability of a win?
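Let $A_k$ be the event that card k is the kth card flipped. A win occurs when at least one of the n cards is in the correct position. Therefore, what we are interested in is

$$P(A_1 \cup A_2 \cup \cdots \cup A_n)$$

Now, consider the following:

$$\begin{aligned}
P(A_1) &= \frac{1}{n} && \text{since all outcomes are equally likely} \\
P(A_1 \cap A_2) &= \frac{1}{n}\left(\frac{1}{n-1}\right) = \frac{(n-2)!}{n!} \\
&\;\;\vdots \\
P(A_1 \cap A_2 \cap \cdots \cap A_k) &= \frac{(n-k)!}{n!} && \text{since fixing } k \text{ cards leaves } (n-k)! \text{ orderings of the rest}
\end{aligned}$$

By symmetry, the same values hold for any choice of k of the events, not just the first k.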
Which leads us to:
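$$\begin{aligned}
P(A_1 \cup A_2 \cup \cdots \cup A_n) &= \binom{n}{1}\frac{1}{n} - \binom{n}{2}\frac{1}{n(n-1)} + \binom{n}{3}\frac{1}{n(n-1)(n-2)} - \cdots \\
&= n \cdot \frac{1}{n} - \left(\frac{n(n-1)}{2!}\right)\frac{1}{n(n-1)} + \left(\frac{n(n-1)(n-2)}{3!}\right)\frac{1}{n(n-1)(n-2)} - \cdots \\
&= 1 - \frac{1}{2!} + \frac{1}{3!} - \frac{1}{4!} + \cdots + (-1)^{n-1}\frac{1}{n!} \\
&= \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k!} \\
&\approx 1 - \frac{1}{e} \quad \text{for large } n
\end{aligned}$$

since

$$e^{-1} = \frac{(-1)^0}{0!} + \frac{(-1)^1}{1!} + \frac{(-1)^2}{2!} + \frac{(-1)^3}{3!} + \cdots$$

from the Taylor expansion of $e^x$.

Here's a very nice, interactive explanation of the Birthday Paradox.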
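For the card-matching problem above, the alternating sum can be compared against a brute-force count over all orderings of a small deck. This is a sketch for illustration; `win_prob_formula` and `win_prob_bruteforce` are hypothetical names, and the brute-force version is only feasible for small n because it enumerates all n! orderings.

```python
import math
from fractions import Fraction
from itertools import permutations


def win_prob_formula(n):
    """Sum over k = 1..n of (-1)^(k-1) / k!, the inclusion/exclusion result."""
    return sum(Fraction((-1) ** (k - 1), math.factorial(k)) for k in range(1, n + 1))


def win_prob_bruteforce(n):
    """Fraction of the n! orderings with at least one card in its own position."""
    wins = sum(
        any(card == pos for pos, card in enumerate(perm, start=1))
        for perm in permutations(range(1, n + 1))
    )
    return Fraction(wins, math.factorial(n))


# The formula and the exhaustive count agree exactly for small decks.
for n in range(1, 8):
    assert win_prob_formula(n) == win_prob_bruteforce(n)

print(float(win_prob_formula(7)))  # already close to the limit below
print(1 - 1 / math.e)
```

Even for a deck of 7 cards the exact probability is within about 0.0001 of $1 - 1/e$, which shows how quickly the series converges.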
View Lecture 3: Birthday Problem, Properties of Probability | Statistics 110 on YouTube.