We begin by importing some helper functions.
from helper import *
url = 'https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes'
Now, let's get the data from the [list of helicopter prison escapes](https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes) Wikipedia article.
data = data_from_url(url)
for row in data[0:3]:
    print(row)
['August 19, 1971', 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro', "Joel David Kaplan was a New York businessman who had been arrested for murder in 1962 in Mexico City and was incarcerated at the Santa Martha Acatitla prison in the Iztapalapa borough of Mexico City. Joel's sister, Judy Kaplan, arranged the means to help Kaplan escape, and on August 19, 1971, a helicopter landed in the prison yard. The guards mistakenly thought this was an official visit. In two minutes, Kaplan and his cellmate Carlos Antonio Contreras, a Venezuelan counterfeiter, were able to board the craft and were piloted away, before any shots were fired.[9] Both men were flown to Texas and then different planes flew Kaplan to California and Castro to Guatemala.[3] The Mexican government never initiated extradition proceedings against Kaplan.[9] The escape is told in a book, The 10-Second Jailbreak: The Helicopter Escape of Joel David Kaplan.[4] It also inspired the 1975 action movie Breakout, which starred Charles Bronson and Robert Duvall.[9]"] ['October 31, 1973', 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon", 'On October 31, 1973 an IRA member hijacked a helicopter and forced the pilot to land in the exercise yard of Dublin\'s Mountjoy Jail\'s D Wing at 3:40\xa0p.m., October 31, 1973. Three members of the IRA were able to escape: JB O\'Hagan, Seamus Twomey and Kevin Mallon. Another prisoner who also was in the prison was quoted as saying, "One shamefaced screw apologised to the governor and said he thought it was the new Minister for Defence (Paddy Donegan) arriving. I told him it was our Minister of Defence leaving." The Mountjoy helicopter escape became Republican lore and was immortalized by "The Helicopter Song", which contains the lines "It\'s up like a bird and over the city. 
There\'s three men a\'missing I heard the warder say".[1]'] ['May 24, 1978', 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson', "43-year-old Barbara Ann Oswald hijacked a Saint Louis-based charter helicopter and forced the pilot to land in the yard at USP Marion. While landing the aircraft, the pilot, Allen Barklage, who was a Vietnam War veteran, struggled with Oswald and managed to wrestle the gun away from her. Barklage then shot and killed Oswald, thwarting the escape.[10] A few months later Oswald's daughter hijacked TWA Flight 541 in an effort to free Trapnell."]
We initialize an index variable with the value of 0. The purpose of this variable is to help us track which row we're modifying.
index = 0
for row in data:
    data[index] = row[:-1]
    index += 1
print(data[:3])
[['August 19, 1971', 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'], ['October 31, 1973', 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"], ['May 24, 1978', 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']]
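The index-tracking loop above can also be expressed as a list comprehension. The following is a minimal sketch on a hypothetical two-row `sample` (an illustration, not the real dataset); unlike the loop, it builds a new list instead of overwriting entries one by one.

```python
# Hypothetical sample rows shaped like `data`: the last entry is the long description.
sample = [
    ["August 19, 1971", "Santa Martha Acatitla", "Mexico", "Yes",
     "Joel David Kaplan", "Long description..."],
    ["October 31, 1973", "Mountjoy Jail", "Ireland", "Yes",
     "JB O'Hagan", "Another description..."],
]

# Keep every column except the last one.
trimmed = [row[:-1] for row in sample]
print(trimmed)
```

Either form produces rows with five entries; the comprehension simply avoids the separate index variable.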
In the code cell below, we iterate over data using the loop variable row:
* With every occurrence of row[0], we refer to the first entry of row, i.e., the date.
* With fetch_year(row[0]), we extract the year from the date in row[0].
* We then replace the value of row[0] with the year that we just extracted.
for row in data:
    row[0] = fetch_year(row[0])
print(data[:3])
[[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'], [1973, 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"], [1978, 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']]
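The implementation of `fetch_year` lives in the helper module and is not shown here. As a rough illustration of what such a function might do, here is a sketch under the assumption that the year is the first four-digit number in the date string; `fetch_year_sketch` is a hypothetical name, not the helper's actual code.

```python
import re


def fetch_year_sketch(date_string):
    """Return the first four-digit number in date_string as an int, or None."""
    match = re.search(r"\d{4}", date_string)
    return int(match.group()) if match else None


print(fetch_year_sketch("August 19, 1971"))
```

Returning an int rather than a string matters for the later steps, where the years are compared against values produced by `range`.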
min_year = min(data, key=lambda x: x[0])[0]
max_year = max(data, key=lambda x: x[0])[0]
Before we move on, let's check the earliest and latest years in our dataset.
print(min_year)
print(max_year)
1971 2020
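An equivalent way to find the bounds, sketched here on hypothetical `sample` rows, is to pull out the year column first and call `min` and `max` on it directly, which avoids the `key` function and the trailing `[0]`.

```python
# Hypothetical rows whose first entry is already an integer year.
sample = [[1986, "France"], [1971, "Mexico"], [2020, "Belgium"]]

# Collect the year column, then take its smallest and largest values.
years_only = [row[0] for row in sample]
min_year_sketch = min(years_only)
max_year_sketch = max(years_only)
print(min_year_sketch, max_year_sketch)
```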
Now we'll create a list of all the years ranging from min_year to max_year. Our goal is to then determine how many prison break attempts there were for each year. Since years in which there weren't any prison breaks aren't present in the dataset, this will make sure we capture them.
years = []
for y in range(min_year, max_year + 1):
    years.append(y)
Let's take a look at years to see if it looks like we expected.
print(years)
[1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]
Looks good!
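For comparison, the same list can be built in one step by converting the `range` object directly. This sketch hard-codes the bounds printed earlier and uses the hypothetical name `years_sketch` so it does not overwrite `years`.

```python
# Bounds taken from the output printed earlier in the notebook.
min_year_value, max_year_value = 1971, 2020

# range() excludes its upper bound, hence the + 1.
years_sketch = list(range(min_year_value, max_year_value + 1))
print(len(years_sketch), years_sketch[0], years_sketch[-1])
```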
Now we create a list where each element looks like [year, 0].
attempts_per_year = []
for attempts in years:
    attempts_per_year.append([attempts, 0])
print(attempts_per_year)
[[1971, 0], [1972, 0], [1973, 0], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 0], [1979, 0], [1980, 0], [1981, 0], [1982, 0], [1983, 0], [1984, 0], [1985, 0], [1986, 0], [1987, 0], [1988, 0], [1989, 0], [1990, 0], [1991, 0], [1992, 0], [1993, 0], [1994, 0], [1995, 0], [1996, 0], [1997, 0], [1998, 0], [1999, 0], [2000, 0], [2001, 0], [2002, 0], [2003, 0], [2004, 0], [2005, 0], [2006, 0], [2007, 0], [2008, 0], [2009, 0], [2010, 0], [2011, 0], [2012, 0], [2013, 0], [2014, 0], [2015, 0], [2016, 0], [2017, 0], [2018, 0], [2019, 0], [2020, 0]]
Finally, we increment the second entry of each pair (the counter at index 1, which starts at 0) by 1 each time its year appears in the data.
for row in data:
    # Instruction 1 - for each row in data
    for ya in attempts_per_year:  # Instruction 2 - nothing to do here
        # Instruction 3 - assign the year value in ya to y
        y = ya[0]
        if row[0] == y:
            ya[1] += 1
# Instruction 4 - print the results
print(attempts_per_year)
[[1971, 1], [1972, 0], [1973, 1], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 1], [1979, 0], [1980, 0], [1981, 2], [1982, 0], [1983, 1], [1984, 0], [1985, 2], [1986, 3], [1987, 1], [1988, 1], [1989, 2], [1990, 1], [1991, 1], [1992, 2], [1993, 1], [1994, 0], [1995, 0], [1996, 1], [1997, 1], [1998, 0], [1999, 1], [2000, 2], [2001, 3], [2002, 2], [2003, 1], [2004, 0], [2005, 2], [2006, 1], [2007, 3], [2008, 0], [2009, 3], [2010, 1], [2011, 0], [2012, 1], [2013, 2], [2014, 1], [2015, 0], [2016, 1], [2017, 0], [2018, 1], [2019, 0], [2020, 1]]
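The nested loop compares every row against every year. A possible alternative, sketched below on a hypothetical list of years, is to count with `collections.Counter` from the standard library and then fill in the gaps; a `Counter` returns 0 for keys it has never seen, which covers the years without attempts.

```python
from collections import Counter

# Hypothetical year column, standing in for row[0] of each row in `data`.
sample_years = [1971, 1973, 1981, 1981, 1986, 1986, 1986]

# Count how often each year occurs.
counts = Counter(sample_years)

# Build [year, count] pairs for the full range, including years with no attempts.
attempts_sketch = [
    [year, counts[year]]
    for year in range(min(sample_years), max(sample_years) + 1)
]
print(attempts_sketch)
```

This makes a single pass over the data instead of one pass per year, though for a dataset of this size the difference is negligible.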
%matplotlib inline
barplot(attempts_per_year)
countries_frequency = df["Country"].value_counts()
print_pretty_table(countries_frequency)
Country | Number of Occurrences |
---|---|
France | 15 |
United States | 8 |
Belgium | 4 |
Canada | 4 |
Greece | 4 |
United Kingdom | 2 |
Brazil | 2 |
Australia | 2 |
Chile | 1 |
Puerto Rico | 1 |
Russia | 1 |
Ireland | 1 |
Mexico | 1 |
Italy | 1 |
Netherlands | 1 |
countries = []
for country_a in data:
    country = country_a[2]
    if country not in countries:
        countries.append(country)
print(countries)
['Mexico', 'Ireland', 'United States', 'France', 'Canada', 'Australia', 'Brazil', 'Italy', 'United Kingdom', 'Puerto Rico', 'Chile', 'Netherlands', 'Greece', 'Belgium', 'Russia']
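The membership check above keeps the countries in order of first appearance. The same result can be sketched with `dict.fromkeys`, since dictionaries preserve insertion order and discard duplicate keys; `sample_countries` is a hypothetical stand-in for the country column.

```python
# Hypothetical country column, standing in for row[2] of each row in `data`.
sample_countries = ["Mexico", "Ireland", "United States", "France",
                    "United States", "France"]

# dict keys are unique and keep insertion order, so this de-duplicates in place.
unique_countries = list(dict.fromkeys(sample_countries))
print(unique_countries)
```

A plain `set` would also remove duplicates, but it would not guarantee the order of first appearance.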
prison_breaks_success = []
for rows in countries:
    prison_breaks_success.append([rows, 0, 0])
for row in data:
    for attempts in prison_breaks_success:
        attempt = attempts[0]
        if row[2] == attempt:
            attempts[1] += 1
            if row[3] == "Yes":
                attempts[2] += 1
print(prison_breaks_success)
[['Mexico', 1, 1], ['Ireland', 1, 1], ['United States', 8, 6], ['France', 15, 11], ['Canada', 4, 3], ['Australia', 2, 1], ['Brazil', 2, 2], ['Italy', 1, 1], ['United Kingdom', 2, 1], ['Puerto Rico', 1, 1], ['Chile', 1, 1], ['Netherlands', 1, 0], ['Greece', 4, 2], ['Belgium', 4, 2], ['Russia', 1, 1]]
for row in prison_breaks_success:
    success_percent = row[2] / row[1] * 100
    row.append(success_percent)
print(prison_breaks_success)
[['Mexico', 1, 1, 100.0], ['Ireland', 1, 1, 100.0], ['United States', 8, 6, 75.0], ['France', 15, 11, 73.33333333333333], ['Canada', 4, 3, 75.0], ['Australia', 2, 1, 50.0], ['Brazil', 2, 2, 100.0], ['Italy', 1, 1, 100.0], ['United Kingdom', 2, 1, 50.0], ['Puerto Rico', 1, 1, 100.0], ['Chile', 1, 1, 100.0], ['Netherlands', 1, 0, 0.0], ['Greece', 4, 2, 50.0], ['Belgium', 4, 2, 50.0], ['Russia', 1, 1, 100.0]]
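The two steps above (tallying attempts and successes, then computing a percentage) can also be sketched with a dictionary keyed by country, which removes the inner loop over countries. The rows in `sample` are hypothetical and hold only the country and the "Yes"/"No" outcome.

```python
# Hypothetical (country, succeeded) pairs, standing in for row[2] and row[3].
sample = [
    ["France", "Yes"],
    ["France", "No"],
    ["France", "Yes"],
    ["Netherlands", "No"],
]

# Map each country to (attempts, successes).
tally = {}
for country, succeeded in sample:
    attempts, successes = tally.get(country, (0, 0))
    tally[country] = (attempts + 1, successes + (succeeded == "Yes"))

# Success rate in percent; every country in tally has at least one attempt.
rates = {country: s / a * 100 for country, (a, s) in tally.items()}
print(tally)
print(rates)
```

Because a country only enters `tally` once it has an attempt, the division can never hit a zero denominator.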
Since France has the most attempts (15), I limited the comparison to countries with at least half that number of attempts. That leaves France and the United States, since every other country had 4 or fewer escape attempts. Although France has almost twice as many attempts (15) as the United States (8), the success rate in the United States is slightly higher (75% vs. 73.3% for France).
The dataset does not contain enough information to answer this question definitively. My conjecture is that rescuing a single escapee from a prison is easier than rescuing several, since an escape plan for more escapees tends to be more complex.
escapees = []
for escapee_a in data:
    escapee = escapee_a[4]
    if escapee not in escapees:
        escapees.append(escapee)
escapees_attempts = []
for attempts in escapees:
    escapees_attempts.append([attempts, 0])
for row in data:
    for attempts in escapees_attempts:
        escapee = attempts[0]
        if row[4] == escapee:
            attempts[1] += 1
print(escapees_attempts)
[['Joel David Kaplan Carlos Antonio Contreras Castro', 1], ["JB O'Hagan Seamus TwomeyKevin Mallon", 1], ['Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson', 1], ['Gérard DupréDaniel Beaumont', 1], ['Marina Paquet (hijacker)Giles Arseneault (prisoner)', 1], ['David McMillan', 1], ['James Rodney LeonardWilliam Douglas BallewJesse Glenn Smith', 1], ['José Carlos dos Reis Encina, a.k.a. "Escadinha"', 1], ['Michel Vaujour', 2], ['Samantha Lopez', 1], ['André BellaïcheGianluigi EspositoLuciano Cipollari', 1], ['Sydney DraperJohn Kendall', 1], ['Mahoney Danny Francis MitchellRandy Lackey', 1], ['Ben Kramer', 1], ['Ralph BrownFreddie Gonzales', 1], ['Robert FordDavid Thomas', 1], ['William Lane', 1], ['—', 7], ['Four members of the Manuel Rodriguez Patriotic Front', 1], ['John Killick', 1], ['Steven Whitsett', 1], ['Pascal Payet', 2], ['Abdelhamid CarnousEmile Forma-SariJean-Philippe Lecase', 1], ['Orlando Cartagena Jose Rodriguez Victor Diaz Hector Diaz Jose Tapia', 1], ['Eric AlboreoFranck PerlettoMichel Valero', 1], ['Hubert SellesJean-Claude MorettiMohamed Bessame', 1], ['Vassilis Paleokostas', 1], ['Eric Ferdinand', 1], ['Nordin Benallal', 1], ['Vasilis PaleokostasAlket Rizai', 1], ['Alexin JismyFabrice Michel', 1], ['Ashraf Sekkaki plus three other criminals', 1], ['Brian Lawrence', 1], ['Alexey Shestakov', 1], ['Panagiotis Vlastos', 1], ['Benjamin Hudon-BarbeauDanny Provençal', 1], ['Yves DenisDenis LefebvreSerge Pomerleau', 1], ['Pola RoupaNikos Maziotis', 1], ['Rédoine Faïd', 1], ['Kristel A.', 1]]
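To avoid errors in this count, the data would first need to be processed to isolate individual fugitives: the names of escapees who took part in the same attempt are run together in a single string, and 7 rows list only '—' in place of a name. Taking the counts above at face value, Michel Vaujour and Pascal Payet each have 2 escape attempts.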
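As a small step in that direction, the sketch below counts escapee entries while skipping the '—' placeholder and keeps only names that occur more than once. `sample_escapees` is a hypothetical stand-in for the escapee column, and the sketch does not attempt to split entries that contain several names.

```python
from collections import Counter

# Hypothetical escapee column, standing in for row[4] of each row in `data`.
sample_escapees = ["Michel Vaujour", "—", "Pascal Payet", "Michel Vaujour",
                   "—", "Pascal Payet", "John Killick"]

# Count entries, ignoring the placeholder used in the escapee column.
counts = Counter(name for name in sample_escapees if name != "—")

# Keep only escapees with more than one recorded attempt.
repeat_escapees = {name: n for name, n in counts.items() if n > 1}
print(repeat_escapees)
```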