For many people, playing the lottery starts as a fun activity. For some, however, it can grow into a habit that eventually becomes an addiction.
As with other compulsive gamblers, it is not unusual for lottery addicts to spend their savings, accumulate debt from borrowing, or resort to more desperate behavior such as theft.
A medical institute that aims to prevent and treat gambling addictions would like to build a mobile app that gives lottery addicts better estimates of their chances of winning.
While there are developers who are on standby to build the app, the institute needs us to create the logic of the app and calculate the probabilities.
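The medical institute would like us to focus on the 6/49 lottery and develop functions that will help users answer questions like these: What is the probability that I will win the big prize with just one ticket? What is the probability that I will win the big prize if I play multiple tickets? What is the probability that I will win smaller prizes by having at least two, three, four, or five winning numbers on one ticket?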
For this project, the institute would like us to consider historical data from Canada's national 6/49 lottery. The data set is available on Kaggle and covers 3,665 drawings held between 1982 and 2018.
Recall that the goal of this project is to write code that helps users answer probability questions about playing the lottery. This means we will need to calculate probabilities and combinations repeatedly throughout the project.
The 6/49 lottery works like this: six numbers are drawn from a set of 49 numbers ranging from 1 to 49. Each draw is done without replacement, which means a number is not put back into the set once it has been drawn.
With that in mind, let's start by writing two functions that we will use often: factorial(), which calculates factorials, and combinations(), which calculates the number of combinations.
Here is the formula for calculating factorials:
$$n! = n \times (n-1) \times (n-2) \times \dots \times 2 \times 1$$
To find the number of combinations when taking k objects out of n objects, we use the formula below:
$$_nC_k = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
Now that we've established the basics, let's write our two helper functions using the formulas above.
def factorial(n):  # calculates factorials
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

def combinations(n, k):  # calculates combinations
    numerator = factorial(n)
    denominator = factorial(n - k) * factorial(k)
    return numerator // denominator  # integer division keeps the count a whole number
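As a quick sanity check, the two helpers can be compared against Python's standard library. The sketch below is not part of the original project: it re-declares the helpers (using integer division so the results stay whole numbers) so that it runs on its own, and it assumes Python 3.8 or later, where math.comb() is available.

```python
import math

def factorial(n):
    """Multiply n * (n-1) * ... * 1; returns 1 when n is 0."""
    final_product = 1
    for i in range(n, 0, -1):
        final_product *= i
    return final_product

def combinations(n, k):
    """Number of ways to choose k objects out of n, ignoring order."""
    return factorial(n) // (factorial(n - k) * factorial(k))

# Compare with the standard-library implementations for every n up to 49.
for n in range(0, 50):
    assert factorial(n) == math.factorial(n)
    for k in range(0, n + 1):
        assert combinations(n, k) == math.comb(n, k)

total_tickets = combinations(49, 6)
print(f"{total_tickets:,}")  # 13,983,816
```

If any assertion failed, the loop would stop with an AssertionError, so a clean run suggests the hand-written helpers agree with the library on every input the project needs.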
Recall that the first question we hope the app will help users answer is 'What is the probability that I will win the big prize with just one ticket?'
Remember also that with the 6/49 lottery, a player chooses 6 out of 49 numbers for a single ticket.
So, the next step is to write a function that calculates the probability that a user will win the big prize for any ticket.
Meanwhile, based on one of our discussions with the team of developers, we will be considering the following details when writing the function:
def one_ticket_probability(your_six_numbers):
    k = len(your_six_numbers)
    n = 49
    possible_outcomes = combinations(n, k)
    successful_outcomes = 1  # only one combination wins the big prize
    probability = successful_outcomes / possible_outcomes * 100
    print('''You have a {:.7f}% chance of winning the big prize with a single ticket when you use the numbers {}!
This means you have 1 in {:,} chances of winning.'''.format(probability, your_six_numbers, int(possible_outcomes)))
Let's test the function with a list of six numbers.
one_ticket_probability([1,3,4,6,49,8])
You have a 0.0000072% chance of winning the big prize with a single ticket when you use the numbers [1, 3, 4, 6, 49, 8]! This means you have 1 in 13,983,816 chances of winning.
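The function above trusts its input. An app would probably want to validate the ticket before computing anything. The helper below, validate_ticket(), is a hypothetical addition: its name and parameters are illustrative, and the rules it enforces (six distinct whole numbers between 1 and 49) are taken from the description of the 6/49 game rather than from the developers' requirements.

```python
def validate_ticket(numbers, pick=6, lowest=1, highest=49):
    """Return True if `numbers` looks like a valid 6/49 ticket (hypothetical helper)."""
    if len(numbers) != pick:
        return False
    if len(set(numbers)) != pick:  # duplicates are not allowed
        return False
    # every entry must be a whole number inside the allowed range
    return all(isinstance(n, int) and lowest <= n <= highest for n in numbers)

print(validate_ticket([1, 3, 4, 6, 49, 8]))   # True
print(validate_ticket([1, 3, 4, 6, 49, 49]))  # False: 49 appears twice
print(validate_ticket([0, 3, 4, 6, 49, 8]))   # False: 0 is out of range
```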
In the previous step, we developed a function that helps users determine the probability of winning the big prize with just one ticket.
We also think users should be able to compare their ticket against historical data from the Canadian lottery, so they can see whether their combination would ever have won.
Let's explore the Canada 6/49 lottery data...
import pandas as pd
canada_lottery = pd.read_csv("649.csv")
canada_lottery.shape
(3665, 11)
canada_lottery.head(3)
| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 649 | 1 | 0 | 6/12/1982 | 3 | 11 | 12 | 14 | 41 | 43 | 13 |
| 1 | 649 | 2 | 0 | 6/19/1982 | 8 | 33 | 36 | 37 | 39 | 41 | 9 |
| 2 | 649 | 3 | 0 | 6/26/1982 | 1 | 6 | 23 | 24 | 27 | 39 | 34 |
canada_lottery.tail(3)
| | PRODUCT | DRAW NUMBER | SEQUENCE NUMBER | DRAW DATE | NUMBER DRAWN 1 | NUMBER DRAWN 2 | NUMBER DRAWN 3 | NUMBER DRAWN 4 | NUMBER DRAWN 5 | NUMBER DRAWN 6 | BONUS NUMBER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3662 | 649 | 3589 | 0 | 6/13/2018 | 6 | 22 | 24 | 31 | 32 | 34 | 16 |
| 3663 | 649 | 3590 | 0 | 6/16/2018 | 2 | 15 | 21 | 31 | 38 | 49 | 8 |
| 3664 | 649 | 3591 | 0 | 6/20/2018 | 14 | 24 | 31 | 35 | 37 | 48 | 17 |
About the Data Set
In this section, we will write a function that helps users compare their ticket with historical data from the Canada lottery.
Here are a few things that the developers want us to consider when we write the function:
# a function that extracts the six winning numbers from each row of canada_lottery
def extract_numbers(row):
    row = row.iloc[4:10]  # columns NUMBER DRAWN 1 through NUMBER DRAWN 6
    row = set(row.values)
    return row
# applying extract_numbers to canada_lottery
winning_numbers = canada_lottery.apply(extract_numbers, axis=1)
winning_numbers.head(5)
0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2    {1, 6, 39, 23, 24, 27}
3    {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object
def check_historical_occurence(your_6_numbers, winning_numbers):
    your_6_numbers = set(your_6_numbers)
    # compare the ticket with every past draw; the result is a Boolean Series
    occurrence = your_6_numbers == winning_numbers
    frequency_of_occurrence = occurrence.sum()
    if frequency_of_occurrence > 0:
        print("This combination of 6 numbers has occurred {} time(s) in the past.".format(frequency_of_occurrence))
        print("You have a 0.0000072% chance of winning the big prize in the next drawing when you use this combination of numbers.")
    else:
        print("This combination of 6 numbers has never occurred in the past.")
        print('''But it doesn't mean it is likely to occur now.
You have a 0.0000072% chance of winning the big prize in the next drawing when you use this combination of numbers.''')
Let's test this function on a combination of 6 numbers...
testing_1 = [1,4,5,6,7,37]
check_historical_occurence(testing_1, winning_numbers)
This combination of 6 numbers has never occurred in the past. But it doesn't mean it is likely to occur now. You have a 0.0000072% chance of winning the big prize in the next drawing when you use this combination of numbers.
testing_2 = [6, 22, 24, 31, 32, 34]
check_historical_occurence(testing_2, winning_numbers)
This combination of 6 numbers has occurred 1 time(s) in the past. You have a 0.0000072% chance of winning the big prize in the next drawing when you use this combination of numbers.
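For comparison, the same lookup can be sketched without pandas by counting each past draw as a frozenset. This is an alternative illustration, not the approach used in this project: count_occurrences() is a hypothetical name, and the toy data is limited to the first three draws shown in the head(3) output above.

```python
from collections import Counter

# The first three draws from the head(3) output, used here as toy data.
past_draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
]

# frozenset is hashable, so each draw can serve as a dictionary key
# regardless of the order in which its numbers are listed.
draw_counts = Counter(frozenset(draw) for draw in past_draws)

def count_occurrences(ticket, draw_counts):
    """Return how many past draws match the ticket exactly (hypothetical helper)."""
    return draw_counts[frozenset(ticket)]

print(count_occurrences([43, 41, 14, 12, 11, 3], draw_counts))  # 1
print(count_occurrences([1, 4, 5, 6, 7, 37], draw_counts))      # 0
```

Building the Counter once makes each later lookup a single dictionary access, which could matter if many tickets are checked against the same history.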
So far, we've built a function that calculates the probability of winning the big prize with a single ticket and another that checks whether a combination of numbers has occurred in the Canada lottery data set.
However, lottery addicts usually don't play just a single ticket. They often play multiple tickets because they think doing so will significantly increase their chances of winning.
In this section, we will help them with better estimates of their chances by writing a function that helps a user calculate their chances of winning with any number of tickets.
Here are a few important details we will be considering when we write the function:
Let's write the function...
def multi_ticket_probability(number_of_tickets):
    tot_possible_outcomes = combinations(49, 6)
    tot_successful_outcomes = number_of_tickets  # each distinct ticket is one winning chance
    probability = tot_successful_outcomes / tot_possible_outcomes * 100
    combinations_rounded = round(tot_possible_outcomes / number_of_tickets)
    print('''You have a {:.7f}% chance of winning the big prize when you play {} ticket(s).
This means you have 1 in {:,} chances of winning.'''.format(probability, number_of_tickets, combinations_rounded))
Let's test the function with the numbers in the list [1, 10, 100, 10000, 1000000, 6991908, 13983816]
for num_tickets in [1, 10, 100, 10000, 1000000, 6991908, 13983816]:
    multi_ticket_probability(num_tickets)
    print("========================")  # separates each output
You have a 0.0000072% chance of winning the big prize when you play 1 ticket(s).
This means you have 1 in 13,983,816 chances of winning.
========================
You have a 0.0000715% chance of winning the big prize when you play 10 ticket(s).
This means you have 1 in 1,398,382 chances of winning.
========================
You have a 0.0007151% chance of winning the big prize when you play 100 ticket(s).
This means you have 1 in 139,838 chances of winning.
========================
You have a 0.0715112% chance of winning the big prize when you play 10000 ticket(s).
This means you have 1 in 1,398 chances of winning.
========================
You have a 7.1511238% chance of winning the big prize when you play 1000000 ticket(s).
This means you have 1 in 14 chances of winning.
========================
You have a 50.0000000% chance of winning the big prize when you play 6991908 ticket(s).
This means you have 1 in 2 chances of winning.
========================
You have a 100.0000000% chance of winning the big prize when you play 13983816 ticket(s).
This means you have 1 in 1 chances of winning.
========================
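The relationship can also be turned around: given a target probability, how many distinct tickets would be needed, and what would they cost? The sketch below is an illustrative addition. The names tickets_needed() and cost_of_tickets() are hypothetical, the $3 price per ticket is an assumption that matches the figure quoted in the conclusion, and math.comb() requires Python 3.8 or later.

```python
import math

TOTAL_OUTCOMES = math.comb(49, 6)  # 13,983,816 possible tickets

def tickets_needed(target_probability):
    """Smallest number of distinct tickets whose win chance reaches the target (0 to 1)."""
    return math.ceil(target_probability * TOTAL_OUTCOMES)

def cost_of_tickets(number_of_tickets, price_per_ticket=3):
    """Total cost, assuming a flat price per ticket (assumption: $3)."""
    return number_of_tickets * price_per_ticket

for target in (0.01, 0.5, 1.0):
    n = tickets_needed(target)
    print(f"{target:.0%} chance: {n:,} tickets, ${cost_of_tickets(n):,}")
```

Under these assumptions, a 50% chance takes 6,991,908 tickets, or $20,975,724.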
In most 6/49 lotteries, players may win smaller prizes if their ticket matches two, three, four, or five of the six numbers drawn.
This means users of the app may also want to know their chances of winning the smaller prizes.
To achieve this, we will write a function that calculates the probabilities of having exactly two, three, four, or five winning numbers.
Here are a few things we will consider while writing the code:
def probability_less_6(num_expected_winning_num):
    # ways to pick the matching numbers out of the 6 drawn
    number_of_combinations = combinations(6, num_expected_winning_num)
    # ways to pick the remaining numbers out of the 43 not drawn
    number_of_combinations_left = combinations(43, 6 - num_expected_winning_num)
    tot_successful_outcomes = number_of_combinations * number_of_combinations_left
    tot_possible_outcomes = combinations(49, 6)
    probability = tot_successful_outcomes / tot_possible_outcomes * 100
    combination_rounded = round(tot_possible_outcomes / tot_successful_outcomes)
    print('''You have a {:.7f}% chance of having exactly {} winning numbers with this ticket.
This means you have 1 in {} chances of winning.'''.format(probability, num_expected_winning_num, combination_rounded))
Let's test the function with all 4 possible inputs...
for winning_num in [2, 3, 4, 5]:
    probability_less_6(winning_num)
    print("============================")  # separates each output
You have a 13.2378029% chance of having exactly 2 winning numbers with this ticket.
This means you have 1 in 8 chances of winning.
============================
You have a 1.7650404% chance of having exactly 3 winning numbers with this ticket.
This means you have 1 in 57 chances of winning.
============================
You have a 0.0968620% chance of having exactly 4 winning numbers with this ticket.
This means you have 1 in 1032 chances of winning.
============================
You have a 0.0018450% chance of having exactly 5 winning numbers with this ticket.
This means you have 1 in 54201 chances of winning.
============================
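One way to gain confidence in the counting logic is to check that the outcomes for exactly 0 through 6 matches add up to the total number of possible tickets. The sketch below is an illustrative check rather than part of the original project: exact_match_outcomes() is a hypothetical name, and math.comb() (Python 3.8 or later) stands in for the combinations() helper so that the block runs on its own.

```python
import math

def exact_match_outcomes(matches, pick=6, pool=49):
    """Number of tickets that share exactly `matches` numbers with the draw."""
    return math.comb(pick, matches) * math.comb(pool - pick, pick - matches)

total = math.comb(49, 6)
counts = {m: exact_match_outcomes(m) for m in range(0, 7)}

# Every ticket matches some number of drawn numbers between 0 and 6,
# so the seven counts must add up to the total number of tickets.
assert sum(counts.values()) == total

for m, c in counts.items():
    print(f"exactly {m}: {c:>10,} outcomes ({c / total:.7%})")
```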
Let's modify the probability_less_6() function to calculate the probability of having at least 2, 3, 4, or 5 winning numbers.
For every input number n, the new function will sum the number of successful outcomes for having exactly n, n+1, ..., 6 winning numbers.
For instance, the number of successful outcomes for having at least 3 winning numbers is the sum of the successful outcomes for having exactly 3, exactly 4, exactly 5, and exactly 6 winning numbers.
def probability_at_least(n):
    tot_successful_outcomes = 0
    for i in range(n, 7):  # exactly n, n+1, ..., 6 winning numbers
        number_of_combinations = combinations(6, i)
        number_of_combinations_left = combinations(43, 6 - i)
        successful_outcomes = number_of_combinations * number_of_combinations_left
        tot_successful_outcomes = tot_successful_outcomes + successful_outcomes
    tot_possible_outcomes = combinations(49, 6)
    probability = tot_successful_outcomes / tot_possible_outcomes * 100
    combination_rounded = round(tot_possible_outcomes / tot_successful_outcomes)
    print('''You have a {:.7f}% chance of having at least {} winning numbers with this ticket.
This means you have 1 in {} chances of winning'''.format(probability, n, combination_rounded))
We will now test the probability_at_least() function with all 4 possible inputs...
for winning_num in [2, 3, 4, 5]:
    probability_at_least(winning_num)
    print("============================")
You have a 15.1015574% chance of having at least 2 winning numbers with this ticket.
This means you have 1 in 7 chances of winning
============================
You have a 1.8637545% chance of having at least 3 winning numbers with this ticket.
This means you have 1 in 54 chances of winning
============================
You have a 0.0987141% chance of having at least 4 winning numbers with this ticket.
This means you have 1 in 1013 chances of winning
============================
You have a 0.0018521% chance of having at least 5 winning numbers with this ticket.
This means you have 1 in 53992 chances of winning
============================
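The more common outcomes can also be checked empirically. The sketch below is an illustrative Monte Carlo simulation, not part of the original project: simulate_at_least() is a hypothetical name, and the trial count and seed are arbitrary choices. It is only practical for frequent events such as at least 2 matches, because rarer events would need far more trials to show up reliably.

```python
import random

def simulate_at_least(n, trials=100_000, seed=0):
    """Estimate P(at least n matches) by simulating random draws (illustrative only)."""
    rng = random.Random(seed)    # fixed seed so the run is repeatable
    ticket = {1, 2, 3, 4, 5, 6}  # any six distinct numbers behave the same
    numbers = range(1, 50)
    hits = 0
    for _ in range(trials):
        draw = rng.sample(numbers, 6)  # six numbers drawn without replacement
        if len(ticket.intersection(draw)) >= n:
            hits += 1
    return hits / trials

estimate = simulate_at_least(2)
print(f"Simulated chance of at least 2 matches: {estimate:.2%}")
```

The estimate should land close to the 15.1% computed above, with small differences due to sampling noise.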
We started out in this project with the goal to write the logic for an app that provides lottery addicts with better estimates of their chances of winning the lottery.
To achieve this, we developed the following functions:
one_ticket_probability() - to calculate the probability of winning the big prize with just one ticket.
check_historical_occurence() - to check if a certain combination has occurred in the Canada lottery data set.
multi_ticket_probability() - to calculate the probability of winning the big prize with any number of tickets between 1 and 13,983,816.
probability_less_6() - to calculate the probability of having exactly two, three, four or five winning numbers to win smaller prizes.
probability_at_least() - to calculate the probability of having at least two, three, four or five winning numbers to win smaller prizes.
Here are the questions we started with and the answers we got:
What is the probability that I will win the big prize with just one ticket?
From our analysis, you have a 0.0000072% chance (1 in 13,983,816) of winning the big prize with a single ticket. By comparison, you are over 400,000 times more likely to become a millionaire from making investments or running a business in America (source).
What is the probability that I will win the big prize if I play multiple tickets?
The chance of winning the big prize increases with the number of tickets played, but it only increases significantly with a very large number of tickets, which will cost you a fortune.
Given that a combination costs $3:
3 million dollars' worth of tickets will only give you a 7.2% chance.
You will need about 21 million dollars' worth of tickets to get a 50% chance of winning.
What is the probability that I will win smaller prizes?
The fewer winning numbers a prize requires, the higher the probability of winning it. You stand a much better chance of having exactly 2 winning numbers (13.238%) than of having exactly 5 winning numbers (0.002%).
What is the probability that I will have at least five winning numbers on just one ticket?
You have a 1 in 53,992 chance of having at least 5 winning numbers on a ticket. This means you are 5 times more likely to win an Oscar award than you are to have at least 5 winning numbers on a 6/49 lottery ticket. So, enrolling in acting classes may be a better investment than buying lottery tickets (source).