Prison break! In movies and on television, one of the most exciting ways to escape from prison is by helicopter, but does it actually happen? According to this Wikipedia page, it does. But how common is it and in what countries does it happen the most? In the following analysis, we'll explore the number of attempts made per year since 1971 as well as which countries have the most attempts.
We begin by importing some helper functions from the file helper.py. The functions we will be using include:
data_from_url(url): downloads prison data from Wikipedia into a list of lists
fetch_year(date_string): returns the year from the date string in the first cell of a row
barplot(list_of_2_element_list): generates a barplot from a list of date/count lists
unique_countries(countries): returns a list of unique country names in the data
print_pretty_table(countries_frequency): displays a formatted table from a dataframe of country names and escape counts
from helper import *
Now, let's get the data from the Wikipedia page using the helper function data_from_url().
url = 'https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes'
data = data_from_url(url)
Next, we print the first three rows to verify that the data downloaded properly and to get an idea of what it looks like.
for i in data[:3]:
    print(i, "\n")  # Add a newline for better readability
['August 19, 1971', 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro', "Joel David Kaplan was a New York businessman who had been arrested for murder in 1962 in Mexico City and was incarcerated at the Santa Martha Acatitla prison in the Iztapalapa borough of Mexico City. Joel's sister, Judy Kaplan, arranged the means to help Kaplan escape, and on August 19, 1971, a helicopter landed in the prison yard. The guards mistakenly thought this was an official visit. In two minutes, Kaplan and his cellmate Carlos Antonio Contreras, a Venezuelan counterfeiter, were able to board the craft and were piloted away, before any shots were fired.[9] Both men were flown to Texas and then different planes flew Kaplan to California and Castro to Guatemala.[3] The Mexican government never initiated extradition proceedings against Kaplan.[9] The escape is told in a book, The 10-Second Jailbreak: The Helicopter Escape of Joel David Kaplan.[4] It also inspired the 1975 action movie Breakout, which starred Charles Bronson and Robert Duvall.[9]"] ['October 31, 1973', 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon", 'On October 31, 1973 an IRA member hijacked a helicopter and forced the pilot to land in the exercise yard of Dublin\'s Mountjoy Jail\'s D Wing at 3:40\xa0p.m., October 31, 1973. Three members of the IRA were able to escape: JB O\'Hagan, Seamus Twomey and Kevin Mallon. Another prisoner who also was in the prison was quoted as saying, "One shamefaced screw apologised to the governor and said he thought it was the new Minister for Defence (Paddy Donegan) arriving. I told him it was our Minister of Defence leaving." The Mountjoy helicopter escape became Republican lore and was immortalized by "The Helicopter Song", which contains the lines "It\'s up like a bird and over the city. 
There\'s three men a\'missing I heard the warder say".[1]'] ['May 24, 1978', 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson', "43-year-old Barbara Ann Oswald hijacked a Saint Louis-based charter helicopter and forced the pilot to land in the yard at USP Marion. While landing the aircraft, the pilot, Allen Barklage, who was a Vietnam War veteran, struggled with Oswald and managed to wrestle the gun away from her. Barklage then shot and killed Oswald, thwarting the escape.[10] A few months later Oswald's daughter hijacked TWA Flight 541 in an effort to free Trapnell."]
As we can see, each row contains 6 fields. The last one, "Details," takes up a significant amount of space. As we won't be using it for this analysis, we'll remove it by iterating over each row and removing the last element.
index = 0
for row in data:
    data[index] = row[0:-1]
    index += 1
for row in data[:3]:
    print(row)
['August 19, 1971', 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'] ['October 31, 1973', 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"] ['May 24, 1978', 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']
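The same trimming can also be written as a list comprehension, which avoids tracking an index by hand. This is a minimal sketch, not the notebook's own code; sample_rows is a hypothetical stand-in for the downloaded data, with the long "Details" text shortened to a placeholder.

```python
# Hypothetical stand-in for the downloaded data: the last field plays the
# role of the long "Details" column.
sample_rows = [
    ["August 19, 1971", "Santa Martha Acatitla", "Mexico", "Yes",
     "Joel David Kaplan Carlos Antonio Contreras Castro", "details placeholder"],
    ["October 31, 1973", "Mountjoy Jail", "Ireland", "Yes",
     "JB O'Hagan Seamus TwomeyKevin Mallon", "details placeholder"],
]

# Build a new list in which every row keeps all fields except the last one.
trimmed_rows = [row[:-1] for row in sample_rows]

for row in trimmed_rows:
    print(row)
```

Unlike the index-based loop, this builds a new list and leaves the original rows untouched, which can make it easier to re-run a cell without trimming a second column by accident.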
Our first analysis task is to find the number of escape attempts per year. Currently the date contains the month and day, but we only need the year. Our helper function fetch_year() can help us with this. It extracts the year from the end of the date string.
for row in data:
    row[0] = fetch_year(row[0])
for row in data[:3]:
    print(row)
[1971, 'Santa Martha Acatitla', 'Mexico', 'Yes', 'Joel David Kaplan Carlos Antonio Contreras Castro'] [1973, 'Mountjoy Jail', 'Ireland', 'Yes', "JB O'Hagan Seamus TwomeyKevin Mallon"] [1978, 'United States Penitentiary, Marion', 'United States', 'No', 'Garrett Brock TrapnellMartin Joseph McNallyJames Kenneth Johnson']
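The contents of helper.py are not shown here, so the following is only a guess at how a function like fetch_year() could work. The name fetch_year_sketch and the regular-expression approach are assumptions made for illustration: it looks for standalone four-digit numbers and returns the last one as an integer.

```python
import re


def fetch_year_sketch(date_string):
    """Hypothetical stand-in for fetch_year(): return the last four-digit
    number in date_string as an int."""
    matches = re.findall(r"\b\d{4}\b", date_string)
    if not matches:
        raise ValueError(f"no four-digit year found in {date_string!r}")
    return int(matches[-1])


print(fetch_year_sketch("August 19, 1971"))
```

Returning an integer rather than a string matters for the next step, because range() and the year comparisons both rely on numeric years.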
Now that we've extracted the year, let's analyze the number of attempts per year. To do this, we'll count the number of attempts from each year from the earliest attempt to the latest. First we'll need to determine the first and last years using the min() and max() functions.
# Find the earliest and latest year in the dataset
min_year = min(data, key=lambda x: x[0])[0]
max_year = max(data, key=lambda x: x[0])[0]
# Create a list of all years (to make sure we include the years when there were no attempts)
years = []
for y in range(min_year, max_year + 1):
    years.append(y)
print(years)
[1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]
For a counter, we'll create a new list, attempts_per_year, which will consist of [year, count] pairs, setting each count to zero.
attempts_per_year = []
for year in years:
    attempts_per_year.append([year, 0])
print(attempts_per_year)
[[1971, 0], [1972, 0], [1973, 0], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 0], [1979, 0], [1980, 0], [1981, 0], [1982, 0], [1983, 0], [1984, 0], [1985, 0], [1986, 0], [1987, 0], [1988, 0], [1989, 0], [1990, 0], [1991, 0], [1992, 0], [1993, 0], [1994, 0], [1995, 0], [1996, 0], [1997, 0], [1998, 0], [1999, 0], [2000, 0], [2001, 0], [2002, 0], [2003, 0], [2004, 0], [2005, 0], [2006, 0], [2007, 0], [2008, 0], [2009, 0], [2010, 0], [2011, 0], [2012, 0], [2013, 0], [2014, 0], [2015, 0], [2016, 0], [2017, 0], [2018, 0], [2019, 0], [2020, 0]]
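The two loops above, one building years and one building the zeroed counter, can be condensed with range() and a list comprehension. A short sketch, with the 1971 and 2020 bounds hard-coded from the output above rather than computed from the data:

```python
# Bounds copied from the printed list of years; in the notebook they come
# from min() and max() over the data.
min_year, max_year = 1971, 2020

# range() excludes its stop value, hence the + 1.
years = list(range(min_year, max_year + 1))

# One [year, count] pair per year, each count starting at zero.
attempts_per_year = [[year, 0] for year in years]

print(attempts_per_year[:3])
```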
To count the number of attempts per year, we'll iterate over each row in the data. For each row, we loop through the [year, count] pairs in attempts_per_year. If the year in the data row matches the year in the pair, we increment that pair's count by one.
for row in data:
    for ya in attempts_per_year:
        y = ya[0]
        if row[0] == y:
            ya[1] += 1
# Check the results
print(attempts_per_year)
[[1971, 1], [1972, 0], [1973, 1], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 1], [1979, 0], [1980, 0], [1981, 2], [1982, 0], [1983, 1], [1984, 0], [1985, 2], [1986, 3], [1987, 1], [1988, 1], [1989, 2], [1990, 1], [1991, 1], [1992, 2], [1993, 1], [1994, 0], [1995, 0], [1996, 1], [1997, 1], [1998, 0], [1999, 1], [2000, 2], [2001, 3], [2002, 2], [2003, 1], [2004, 0], [2005, 2], [2006, 1], [2007, 3], [2008, 0], [2009, 3], [2010, 1], [2011, 0], [2012, 1], [2013, 2], [2014, 1], [2015, 0], [2016, 1], [2017, 0], [2018, 1], [2019, 0], [2020, 1]]
%matplotlib inline
barplot(attempts_per_year)
Based on the above chart, four years tie for the most attempts: 1986, 2001, 2007, and 2009, each with 3. Most years have at least one attempt, but 18 of the 50 years have none.
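As an alternative to the nested counting loop, the standard library's collections.Counter can tally the years in one pass, and the busiest years can be found in code instead of being read off the chart. This is a sketch under assumptions: sample_rows is a small hypothetical sample in which only the year column matters, not the real dataset.

```python
from collections import Counter

# Hypothetical sample rows: the first field is the year, the rest is omitted.
sample_rows = [[1971, "..."], [1986, "..."], [1986, "..."], [1986, "..."], [2001, "..."]]

# Tally how many rows fall in each year.
year_counts = Counter(row[0] for row in sample_rows)

# A Counter returns 0 for missing keys, so years without attempts are kept.
attempts = [[year, year_counts[year]] for year in range(1971, 2002)]

# Find every year that shares the highest count (handles ties).
top_count = max(year_counts.values())
busiest_years = sorted(year for year, n in year_counts.items() if n == top_count)

print(busiest_years)
```

The nested loop compares every row against every year, while the Counter approach reads each row once. With about 50 rows the difference is negligible, so this is mainly a readability choice.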
Our next task is to determine which countries have had the most escape attempts. This time we'll use a dataframe, df, which was already created using code from the helper.py file. We can use Series.value_counts() to find the number of attempts per country. Since the output of that method is plain, we'll use our helper function print_pretty_table() to make it look better.
countries_frequency = df["Country"].value_counts()
print_pretty_table(countries_frequency)
Country | Number of Occurrences |
---|---|
France | 15 |
United States | 8 |
Canada | 4 |
Greece | 4 |
Belgium | 4 |
Australia | 2 |
Brazil | 2 |
United Kingdom | 2 |
Chile | 1 |
Italy | 1 |
Mexico | 1 |
Puerto Rico | 1 |
Netherlands | 1 |
Russia | 1 |
Ireland | 1 |
France is the clear leader with 15 attempts, followed by the US with 8. No other country in the list has more than 4 attempts.
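For readers working from the list of lists rather than the dataframe, a similar tally can be sketched with collections.Counter. The sample_countries list below is a hypothetical sample of the country column, not the real data, and most_common() plays roughly the role that value_counts() plays for a pandas Series.

```python
from collections import Counter

# Hypothetical sample of the country column (the third field of each row).
sample_countries = ["Mexico", "Ireland", "United States", "United States"]

country_counts = Counter(sample_countries)

# most_common() lists countries from highest to lowest count.
for country, n in country_counts.most_common():
    print(f"{country}: {n}")
```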
Although helicopter prison escapes are not common, they do happen occasionally. According to Wikipedia, there were 48 attempts between 1971 and 2020, with anywhere from zero to three attempts per year. While France and the US have seen the most attempts, a smattering of other countries have had at least one in the past 50 years.