A planned prison escape with the help of a helicopter would make up for a great movie script. But, wait, this has happened in the real life in several countries. Some have succeded in escaping and some have failed. So, the main goal of this project is to answer two question regarding helicopter escapes:
Now, lets get the data from the List of helicopter prison escapes Wikipedia article.
# Importing a few helper functions
from helper import *
url = 'https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes'
data = data_from_url(url)
# Printing the first three rows of the list of lists
for row in data[:3]:
print(row)
['August 19, 1971', 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro', "Joel David Kaplan was a New York businessman who had been arrested for murder in 1962 in Mexico City and was incarcerated at the Santa Martha Acatitla prison in the Iztapalapa borough of Mexico City. Joel's sister, Judy Kaplan, arranged the means to help Kaplan escape, and on August 19, 1971, a helicopter landed in the prison yard. The guards mistakenly thought this was an official visit. In two minutes, Kaplan and his cellmate Carlos Antonio Contreras, a Venezuelan counterfeiter, were able to board the craft and were piloted away, before any shots were fired.[9] Both men were flown to Texas and then different planes flew Kaplan to California and Castro to Guatemala.[3] The Mexican government never initiated extradition proceedings against Kaplan.[9] The escape is told in a book, The 10-Second Jailbreak: The Helicopter Escape of Joel David Kaplan.[4] It also inspired the 1975 action movie Breakout, which starred Charles Bronson and Robert Duvall.[9]"] ['October 31, 1973', 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon", 'On October 31, 1973 an IRA member hijacked a helicopter and forced the pilot to land in the exercise yard of Dublin\'s Mountjoy Jail\'s D Wing at 3:40\xa0p.m., October 31, 1973. Three members of the IRA were able to escape: JB O\'Hagan, Seamus Twomey and Kevin Mallon. Another prisoner who also was in the prison was quoted as saying, "One shamefaced screw apologised to the governor and said he thought it was the new Minister for Defence (Paddy Donegan) arriving. I told him it was our Minister of Defence leaving." The Mountjoy helicopter escape became Republican lore and was immortalized by "The Helicopter Song", which contains the lines "It\'s up like a bird and over the city. There\'s three men a\'missing I heard the warder say".[1]'] ['May 24, 1978', 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson', "43-year-old Barbara Ann Oswald hijacked a Saint Louis-based charter helicopter and forced the pilot to land in the yard at USP Marion. While landing the aircraft, the pilot, Allen Barklage, who was a Vietnam War veteran, struggled with Oswald and managed to wrestle the gun away from her. Barklage then shot and killed Oswald, thwarting the escape.[10] A few months later Oswald's daughter hijacked TWA Flight 541 in an effort to free Trapnell."]
Since, we only want to know the year in which most escapes happened and the country in which happened. The 'details' column for the analysis is pointless and makes the data a bit difficult to read. So, let's get rid of the details column.
# Modifying the data to remove last element of each row by using del statement
for row in data:
del row[-1]
# Printing the first few rows to confirm the deails column has been deleted
print(data[:3])
[['August 19, 1971', 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'], ['October 31, 1973', 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"], ['May 24, 1978', 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']]
We can see that the first element of each list within the list are dates of the format Month/Day/Year. We need only the year. So, we are going to modify the date column to only include the year.
# Using the helper function, modifying the first element
for row in data:
row[0] = fetch_year(row[0])
#printing few rows to see if the function has been executed
print(data[:3])
[[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'], [1973, 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"], [1978, 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']]
Now that we have only the year as the first element of each list, let's create a list of lists, where it will have two elements: A year and attempts occured in the corresponding year. We will cover year from the first registered attempt until the last
min_year = min(data, key=lambda x: x[0])[0]
max_year = max(data, key=lambda x: x[0])[0]
# Making a list of all the years from min_year to max_year
years = []
for y in range(min_year, max_year+1):
years.append(y)
# Having a look at the years list to see if it's the same as we expected
print(years)
[1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]
Now that we have all the years in which a prison break has occured. Let's make a list of attempts per year, which will have the year and the number of occurences a escape has been made.
# Creating a list of attempts per year
attempts_per_year = []
for year in years:
attempts_per_year.append([year, 0])
Now, let's have a look at the attempts per year to see our result and make sure there are no repeated years.
attempts_per_year
[[1971, 0], [1972, 0], [1973, 0], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 0], [1979, 0], [1980, 0], [1981, 0], [1982, 0], [1983, 0], [1984, 0], [1985, 0], [1986, 0], [1987, 0], [1988, 0], [1989, 0], [1990, 0], [1991, 0], [1992, 0], [1993, 0], [1994, 0], [1995, 0], [1996, 0], [1997, 0], [1998, 0], [1999, 0], [2000, 0], [2001, 0], [2002, 0], [2003, 0], [2004, 0], [2005, 0], [2006, 0], [2007, 0], [2008, 0], [2009, 0], [2010, 0], [2011, 0], [2012, 0], [2013, 0], [2014, 0], [2015, 0], [2016, 0], [2017, 0], [2018, 0], [2019, 0], [2020, 0]]
We have the list of year, number of occurences happened in that year. Now, we will be able to answer our first question: In which year did the most attempts at breaking out of prison with a helicopter occur.
# Using a nested loop to go through both data and attempts per year list, if the year accurs in data, the occurences in attempts_per_year will be incremented.
for row in data:
for ya in attempts_per_year:
if row[0] == ya[0]:
ya[1] += 1
print(attempts_per_year)
[[1971, 1], [1972, 0], [1973, 1], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 1], [1979, 0], [1980, 0], [1981, 2], [1982, 0], [1983, 1], [1984, 0], [1985, 2], [1986, 3], [1987, 1], [1988, 1], [1989, 2], [1990, 1], [1991, 1], [1992, 2], [1993, 1], [1994, 0], [1995, 0], [1996, 1], [1997, 1], [1998, 0], [1999, 1], [2000, 2], [2001, 3], [2002, 2], [2003, 1], [2004, 0], [2005, 2], [2006, 1], [2007, 3], [2008, 0], [2009, 3], [2010, 1], [2011, 0], [2012, 1], [2013, 2], [2014, 1], [2015, 0], [2016, 1], [2017, 0], [2018, 1], [2019, 0], [2020, 1]]
We have got the list of occurences of helicopter escapes in the form of list. But, it's a bit tedious and doesn't convey the answer straightforward. Let's visualise, so that we can make sense of it in a more visual way.
%matplotlib inline
barplot(attempts_per_year)
Now that we have visulaised the data and can make some conclusions about it, we can answer our first question.
The year in which the most helicopter prison breaks happened are: 1986, 2001, 2007 and 2009.
Let's try to answer our second question now: In which countries do the most attempted helicopter prison escapes occur?
# Counting the number of occurences for each country
countries_frequency = df["Country"].value_counts()
# Using an helper function to display our table summarising helicopter escapes for each country.
print_pretty_table(countries_frequency)
Country | Number of Occurrences |
---|---|
France | 15 |
United States | 8 |
Canada | 4 |
Greece | 4 |
Belgium | 4 |
United Kingdom | 2 |
Australia | 2 |
Brazil | 2 |
Chile | 1 |
Mexico | 1 |
Netherlands | 1 |
Puerto Rico | 1 |
Italy | 1 |
Russia | 1 |
Ireland | 1 |
Turns out the country, France, has the most number of attempted helicopter prison escapes occur.
A few more questions we could answer by the help of this data can be: