What Are Your Chances Of Winning The Lottery?¶

Introduction:¶

It is a common story: we have some spare change, don't know what to do with it, and decide to try our luck at the lottery. After all, we've seen the big winners on TV, people who became millionaires overnight just by selecting the right numbers on their tickets. It's supposed to be a one-time thing. We lose on the first try, but we are determined to hit the jackpot, so we try again, and again, and again. Before we know it, playing has become a habit, then an addiction, and each day we pour more money into it. The big question is: would we still play if we knew our chances of winning?

In this project we work through a real-world scenario to answer the following questions:

What is the probability of winning the big prize with a single ticket?
What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

The dataset we will be working with is a Kaggle dataset on the popular Canadian Lotto 6/49. Lotto 6/49 is one of three national lottery games in Canada. Launched on June 12, 1982, Lotto 6/49 was the first nationwide Canadian lottery game to allow players to choose their own numbers. Previous national games, such as the Olympic Lottery, Loto Canada and Superloto, used pre-printed numbers on tickets. Lotto 6/49 led to the gradual phase-out of that type of lottery game in Canada.

Winning numbers are drawn by the Interprovincial Lottery Corporation every Wednesday and Saturday, executed with a Smartplay Halogen II ball machine.

Summary Conclusion:¶

Playing the lottery isn't worth it. The chance of holding the winning ticket is about 1 in 14 million, and even the chance of having at least 2 winning numbers on a ticket is only about 15%.

Probability Of Winning The Lotto¶

We are going to create 3 functions:

A function that calculates the factorial of any non-negative integer.
A function that calculates the number of combinations when choosing k items out of n.
A function that calculates the probability that a lotto ticket wins.

In [1]:

def factorial(n):
    # calculates the factorial of a non-negative integer n
    result = 1
    for i in range(n, 0, -1):
        result *= i
    return result


def combination(n, k):
    # returns the number of ways to choose k items out of n, ignoring order
    numerator = factorial(n)
    denominator = factorial(k) * factorial(n - k)
    return numerator // denominator  # exact integer division avoids float rounding

# testing our combination function
combination(5, 3)

Out[1]:

10

In [2]:

def one_ticket_probability(ticket):
    ## calculates the probability of winning the lottery for any ticket
    c = combination(49, 6)
    outcome = 1
    probability = outcome/c 
    probability_percentage = probability * 100 #turns the probability to a percentage
    print(
'''You have a 1 in {:,} or {:.7f}% chance of winning with {}.
'''.format(c, probability_percentage, ticket)
    )

#Testing the function
ticket1 = [11, 2, 4, 5, 6, 10]
ticket2 = [4, 11, 12, 1, 13, 3]
one_ticket_probability(ticket1)
print('\n')
one_ticket_probability(ticket2)
    

You have a 1 in 13,983,816 or 0.0000072% chance of winning with [11, 2, 4, 5, 6, 10].



You have a 1 in 13,983,816 or 0.0000072% chance of winning with [4, 11, 12, 1, 13, 3].

To add some context to what this function does: there are 13,983,816 possible outcomes when picking 6 numbers out of 49. Since only one combination of numbers wins the lottery, the number of winning outcomes is just one. To get the probability of winning the lottery, we divide the number of winning outcomes by the number of possible outcomes. That's why we divided 1 (the number of winning outcomes) by 13,983,816 (the number of possible outcomes). Note that the result does not depend on which numbers are on the ticket, so every ticket has the same chance.
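As a cross-check on the figures above, the count of possible outcomes can be reproduced with `math.comb` from the standard library (Python 3.8+). This is not what the notebook uses; it is only a minimal sketch that should agree with `combination(49, 6)`. The variable names are chosen for illustration.

```python
import math

# Number of ways to choose 6 numbers out of 49 when order does not matter.
total_outcomes = math.comb(49, 6)

# Exactly one of those combinations matches the draw.
p_jackpot = 1 / total_outcomes

print(f"{total_outcomes:,}")       # 13,983,816
print(f"{p_jackpot * 100:.7f}%")   # 0.0000072%
```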

Reading In The Dataset Of Historically Winning Tickets¶

This dataset contains 3,665 rows, each representing one winning draw between 1982 and 2018. It can be found on Kaggle.

In [3]:

import pandas as pd
lotto_649 = pd.read_csv('649.csv')

In [4]:

lotto_649.shape #shows the number of rows and columns of the dataset

Out[4]:

(3665, 11)

In [5]:

lotto_649.head() # displays the first 5 rows

Out[5]:

	PRODUCT	DRAW NUMBER	DRAW DATE	NUMBER DRAWN 1	NUMBER DRAWN 2	NUMBER DRAWN 3	NUMBER DRAWN 4	NUMBER DRAWN 5	NUMBER DRAWN 6	BONUS NUMBER
0	649	1	6/12/1982	3	11	12	14	41	43	13
1	649	2	6/19/1982	8	33	36	37	39	41	9
2	649	3	6/26/1982	1	6	23	24	27	39	34
3	649	4	7/3/1982	3	9	10	13	20	43	34
4	649	5	7/10/1982	5	14	21	31	34	47	45

In [6]:

lotto_649.tail() #displays the last 5 rows

Out[6]:

	PRODUCT	DRAW NUMBER	DRAW DATE	NUMBER DRAWN 1	NUMBER DRAWN 2	NUMBER DRAWN 3	NUMBER DRAWN 4	NUMBER DRAWN 5	NUMBER DRAWN 6	BONUS NUMBER
3660	649	3587	6/6/2018	10	15	23	38	40	41	35
3661	649	3588	6/9/2018	19	25	31	36	46	47	26
3662	649	3589	6/13/2018	6	22	24	31	32	34	16
3663	649	3590	6/16/2018	2	15	21	31	38	49	8
3664	649	3591	6/20/2018	14	24	31	35	37	48	17

Comparing Tickets With Historically Winning Ones.¶

We are going to create a function that takes in a list of 6 numbers from 1 to 49 and reports:

the number of times that combination of 6 numbers has occurred in our dataset of historical winning draws.
the probability of winning the big prize in the next draw with that combination.

In [7]:

def extract_numbers(row):
    # converts the values in the selected columns to a set.
    row = row[['NUMBER DRAWN 1', 'NUMBER DRAWN 2', 'NUMBER DRAWN 3',
                'NUMBER DRAWN 4', 'NUMBER DRAWN 5', 'NUMBER DRAWN 6']]
    row = set(row.values)
    return row

winning_numbers = lotto_649.apply(extract_numbers, axis=1)
winning_numbers.head()

Out[7]:

0    {3, 41, 11, 12, 43, 14}
1    {33, 36, 37, 39, 8, 41}
2     {1, 6, 39, 23, 24, 27}
3     {3, 9, 10, 43, 13, 20}
4    {34, 5, 14, 47, 21, 31}
dtype: object

In [8]:

def check_historical_occurence(user_no, winning_no):
    # compares the user's input (Python list) with winning_numbers (pandas Series of sets)
    # and counts how many times that combination has occurred in past draws
    user_no = set(user_no)
    result = user_no == winning_no
    historical_occurence = result.sum()

    if historical_occurence == 0:
        print("""
The combination {} has never occurred before in the past.
However, your chance of winning the next draw with {} is 1 in 13,983,816 or 0.0000072%.
""".format(user_no, user_no)
             )
    else:
        print(
"""The combination {} has occurred {} times in the past.
However this doesn't guarantee that you will win the next draw.
You have a 1 in 13,983,816 or 0.0000072% chance of winning with {}.
""".format(user_no, historical_occurence, user_no)
        )



In [9]:

#testing the check_historical_occurence function
test_ticket1 = [3, 41, 11, 12, 43, 14]
test_ticket2 = [11, 2, 4, 5, 6, 10]
test1 = check_historical_occurence(test_ticket1, winning_numbers)
print('\n')
test2 = check_historical_occurence(test_ticket2, winning_numbers)

The combination {3, 41, 11, 12, 43, 14} has occurred 1 times in the past.
However this doesn't guarantee that you will win the next draw.
You have a 1 in 13,983,816 or 0.0000072% chance of winning with {3, 41, 11, 12, 43, 14}.




The combination {2, 4, 5, 6, 10, 11} has never occurred before in the past.
However, your chance of winning the next draw with {2, 4, 5, 6, 10, 11} is 1 in 13,983,816 or 0.0000072%.

To successfully compare a ticket with the historical winning ones, we had to write two functions. The first, extract_numbers(), extracts the set of six drawn numbers from each row of the lotto_649 DataFrame. This is important because sets ignore ordering, so a user's ticket can be compared with a past draw no matter what order the numbers are entered in. The second, check_historical_occurence(), takes in the user's input and a pandas Series containing all sets of winning numbers. It turns the user's input into a set and compares it to each set in the Series. It then prints how many times the combination has matched a past draw (or that it never has), along with the probability of winning the big prize with that set of numbers, which is the same either way.
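The order-insensitive comparison can be illustrated without pandas. The sketch below uses only the five draws shown in the dataset preview earlier; `past_draws` and `count_matches` are hypothetical names introduced for illustration and are not part of the notebook.

```python
# The first five draws from the dataset preview, as plain lists.
past_draws = [
    [3, 11, 12, 14, 41, 43],
    [8, 33, 36, 37, 39, 41],
    [1, 6, 23, 24, 27, 39],
    [3, 9, 10, 13, 20, 43],
    [5, 14, 21, 31, 34, 47],
]


def count_matches(ticket, draws):
    """Count how many past draws contain exactly the same six numbers.

    Hypothetical helper for illustration. Sets ignore ordering, so
    [3, 41, 11, 12, 43, 14] matches the draw [3, 11, 12, 14, 41, 43].
    """
    ticket_set = set(ticket)
    return sum(ticket_set == set(draw) for draw in draws)


print(count_matches([3, 41, 11, 12, 43, 14], past_draws))  # 1
print(count_matches([11, 2, 4, 5, 6, 10], past_draws))     # 0
```

Comparing lists directly would report no match for the first ticket, because list equality depends on order.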

Probability Of Winning With Multiple Tickets.¶

We are writing a function to calculate the probability of winning with any number of different tickets from 1 to 13,983,816.

In [10]:

def multi_ticket_probability(n_tickets):
    # gives the probability of winning with n number of tickets
    possible_outcomes = combination(49, 6)
    winning_outcomes = n_tickets
    probability = winning_outcomes / possible_outcomes
    probability_percentage = probability * 100 # turns the probability to a percentage
    if n_tickets == 1:
        print(
'''You have a 1 in {:,} or a {:.7f}% chance of winning
if you play with {:,} ticket.
'''.format(possible_outcomes, probability_percentage, n_tickets)
        )

    else:
        new_possible_outcomes = int(possible_outcomes/winning_outcomes)
        print(
'''You have a 1 in {:,} or a {:.7f}% chance of winning
if you play with {:,} tickets.
'''.format(new_possible_outcomes, probability_percentage, n_tickets)
        )
  

In [11]:

# testing the multi_ticket_probability function
n_tickets = [1, 10, 100, 10000, 1000000, 6991908, 13983816]

for i in n_tickets:
    multi_ticket_probability(i)
    print('\n') # prints a new line after each iteration

You have a 1 in 13,983,816 or a 0.0000072% chance of winning
if you play with 1 ticket.



You have a 1 in 1,398,381 or a 0.0000715% chance of winning
if you play with 10 tickets.



You have a 1 in 139,838 or a 0.0007151% chance of winning
if you play with 100 tickets.



You have a 1 in 1,398 or a 0.0715112% chance of winning
if you play with 10,000 tickets.



You have a 1 in 13 or a 7.1511238% chance of winning
if you play with 1,000,000 tickets.



You have a 1 in 2 or a 50.0000000% chance of winning
if you play with 6,991,908 tickets.



You have a 1 in 1 or a 100.0000000% chance of winning
if you play with 13,983,816 tickets.

People try to increase their chances of winning by playing multiple tickets. While it is true that playing multiple tickets increases your chances of winning, you need to play a ridiculously high number of tickets to get any significant chance. To get just a 7% chance of winning, you need to play 1,000,000 different tickets.

The above function takes in the number of tickets played and tells you the probability of winning with that many tickets. To achieve this we did the following:

Computed the number of possible outcomes, which is the number of combinations of 6 numbers out of 49.
Calculated the percentage chance of winning. The probability of winning is the number of tickets played divided by the number of possible outcomes, and multiplying the result by 100 gives the percentage.
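The two steps above can also be sketched with exact fractions, which avoids the truncation in the "1 in N" figure. The example applies the same idea to the 40-ticket case asked about in the introduction. `multi_ticket_odds` is a hypothetical helper, and the sketch assumes every ticket played is a different combination.

```python
from fractions import Fraction
from math import comb


def multi_ticket_odds(n_tickets):
    """Exact probability of hitting the jackpot with n distinct tickets.

    Hypothetical helper; assumes every ticket is a different combination.
    """
    total = comb(49, 6)
    if not 1 <= n_tickets <= total:
        raise ValueError(f"n_tickets must be between 1 and {total:,}")
    return Fraction(n_tickets, total)


p40 = multi_ticket_odds(40)
print(p40)                          # exact odds as a reduced fraction
print(f"{float(p40) * 100:.7f}%")   # the same value as a percentage
```

With 40 tickets the chance is still only about 0.0003%, or roughly 1 in 350,000.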

Probability Of Having 2 to 5 Winning Numbers On A Ticket.¶

This function takes in an integer n between 2 and 5 and computes the probability of having exactly n winning numbers on a ticket.

In [12]:

def probability_less_6(n):
    # finds the probability of having exactly n (2 to 5) winning numbers on a ticket
    if n not in (2, 3, 4, 5):
        raise ValueError('n must be an integer between 2 and 5')

    possible_outcomes = combination(49, 6)

    # ways to pick n of the 6 winning numbers
    c = combination(6, n)
    # ways to pick the other 6 - n numbers from the 43 non-winning numbers
    remainder = combination(43, 6 - n)
    successful_outcome = c * remainder

    probability = successful_outcome / possible_outcomes
    probability_percentage = probability * 100  # multiplying by 100 converts to a percentage
    new_possible_outcome = int(possible_outcomes / successful_outcome)

    print('''
You have a 1 in {:,} chance, 
or a {:.4f}% chance of having {} winning numbers on a ticket.
'''.format(new_possible_outcome, probability_percentage, n)
         )

In [13]:

# testing the probability_less_6 function
n_winning = [2, 3, 4, 5]
for i in n_winning:
    probability_less_6(i)
    print('\n') # prints a newline after each iteration

You have a 1 in 7 chance, 
or a 13.2378% chance of having 2 winning numbers on a ticket.




You have a 1 in 56 chance, 
or a 1.7650% chance of having 3 winning numbers on a ticket.




You have a 1 in 1,032 chance, 
or a 0.0969% chance of having 4 winning numbers on a ticket.




You have a 1 in 54,200 chance, 
or a 0.0018% chance of having 5 winning numbers on a ticket.

Above we created a function that calculates the probability of having exactly n winning numbers on a ticket, for n between 2 and 5.

First we counted the successful outcomes. A ticket with n winning numbers contains n of the 6 drawn numbers and 6 - n of the 43 numbers that were not drawn, so we multiplied the two combinations. E.g., for 5 winning numbers we compute (6 choose 5) times (43 choose 1), which is 6 × 43 = 258.
We then divided the number of successful outcomes by the total number of possible outcomes and multiplied by 100 to get the percentage.
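The counting argument is the hypergeometric formula C(6, n) × C(43, 6 − n) / C(49, 6). A minimal sketch with `math.comb` and exact fractions follows; `p_exactly` is a hypothetical helper, extended to n = 0 through 6 so that the seven cases can be checked to sum to 1.

```python
from fractions import Fraction
from math import comb


def p_exactly(n):
    """Probability that a ticket has exactly n of the 6 drawn numbers.

    Hypothetical helper: choose n of the 6 winning numbers and the
    remaining 6 - n from the 43 non-winning numbers.
    """
    if not 0 <= n <= 6:
        raise ValueError("n must be between 0 and 6")
    return Fraction(comb(6, n) * comb(43, 6 - n), comb(49, 6))


# The seven mutually exclusive cases cover every possible ticket.
assert sum(p_exactly(n) for n in range(7)) == 1

print(f"{float(p_exactly(5)) * 100:.4f}%")  # 0.0018%
```

The sum check is a quick way to catch a mistake in the counting, such as forgetting the factor for the non-winning numbers.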

Of course everyone wants to win the big prize, and people continue to play even when they know the chance of winning it is really slim. One reason is that they feel having at least 2 winning numbers is some kind of compensation if they miss out on the big prize. But from what we can see above, there is only a meagre 13.24% chance of having exactly 2 winning numbers on a ticket.

Probability of Having At Least n Winning Numbers.¶

We are going to create a function that calculates the probability of having at least n winning numbers on a ticket, for n between 2 and 5.

In [14]:

def probability_at_least_n(n):
    # finds the probability of having at least n winning numbers on a ticket
    possible_outcomes = combination(49, 6)
    probabilities = {}
    probabilities[6] = 1 / possible_outcomes * 100  # probability (%) of matching all 6 numbers

    for i in range(2, 6):
        c = combination(6, i)  # ways to pick i of the 6 winning numbers
        remainder = combination(43, 6 - i)  # ways to pick the rest from the 43 non-winning numbers
        outcome = c * remainder
        probabilities[i] = outcome / possible_outcomes * 100  # probability (%) of exactly i matches

    if n not in probabilities:
        raise ValueError('n must be an integer between 2 and 6')

    # sums the probabilities of having n, n + 1, ..., 6 winning numbers
    total_probability = 0
    for i in range(n, 7):
        total_probability += probabilities[i]

    print('''
You have a {:.6f}% chance of having at least {} winning numbers 
on a ticket.
    '''.format(total_probability, n)
         )
    
    

In [15]:

# testing the probability_at_least_n function
for i in n_winning: 
    probability_at_least_n(i)
    print('\n') # prints a newline after each iteration

You have a 15.101557% chance of having at least 2 winning numbers 
on a ticket.
    



You have a 1.863755% chance of having at least 3 winning numbers 
on a ticket.
    



You have a 0.098714% chance of having at least 4 winning numbers 
on a ticket.
    



You have a 0.001852% chance of having at least 5 winning numbers 
on a ticket.

We created a function that calculates the probability of having at least n winning numbers, for n between 2 and 6. To achieve this we:

Calculated the probability of having exactly 2, 3, 4, 5, and 6 winning numbers.
Summed the probability for n and the probabilities for the numbers above n. E.g., if n = 3, we sum the probabilities of having 3, 4, 5, and 6 winning numbers. Because these outcomes are mutually exclusive, the sum is the probability of having at least 3 winning numbers.
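The same quantity can be reached from the other direction, as 1 minus the probability of having fewer than n winning numbers. The sketch below compares both routes with exact fractions; `p_exactly`, `p_at_least`, and `p_at_least_complement` are hypothetical helpers and not part of the notebook.

```python
from fractions import Fraction
from math import comb

TOTAL = comb(49, 6)


def p_exactly(n):
    """Probability of exactly n matches (hypothetical helper)."""
    return Fraction(comb(6, n) * comb(43, 6 - n), TOTAL)


def p_at_least(n):
    """Probability of n or more matches, summed directly."""
    if not 0 <= n <= 6:
        raise ValueError("n must be between 0 and 6")
    return sum(p_exactly(k) for k in range(n, 7))


def p_at_least_complement(n):
    """Same quantity via the complement: 1 - P(fewer than n matches)."""
    if not 0 <= n <= 6:
        raise ValueError("n must be between 0 and 6")
    return 1 - sum(p_exactly(k) for k in range(n))


# Both routes must agree exactly for every n the notebook tests.
for n in range(2, 6):
    assert p_at_least(n) == p_at_least_complement(n)

print(f"{float(p_at_least(3)) * 100:.6f}%")  # 1.863755%
```

For small n the complement needs fewer terms, while for large n the direct sum does; either way the two agree.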

Conclusion¶

We asked three important questions at the beginning of this project:

What is the probability of winning the big prize with a single ticket?
What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?

We have now found answers to these questions:

There is a 1 in 13,983,816 or 0.0000072% chance of winning with a single ticket.
There is a 7.15% chance of winning with 1,000,000 tickets (about 1 in 14; the function above truncates this to 1 in 13).
There is a 15.1% chance of having at least 2 winning numbers on a single ticket.