There have been multiple prison escapes where an inmate escapes by means of a helicopter. One of the earliest instances was the escape of Joel David Kaplan, nicknamed "Man Fan", on August 19, 1971, from the Santa Martha Acatitla in Mexico.
The following questions are to be answered
1: In which year did the most helicopter prison break attempts occur?
2: In which countries do the most attempted helicopter prison breaks occur?
3: In which countries do helicopter prison breaks have a higher chance of success?
4: How does the number of escapees affect the success?
5: Which escapees have done it more than once?
To answer the above questions, data from this Wikipedia page was scraped and analyzed. It covers the details of attempted helicopter prison escapes over a period of 50 years (1971-2020)
Exploring the dataset
The dataset contains six data fields:
1- Date: the date of the attempted prison break
2- Prison name: the name of the prison
3- Country: where it the prison break happened
4- Status: Whether the attempt was successful or not
5- Names: of the escapees
6- More details: other details of the prison break
# import the needed helper function
from helper import *
# get the needed data from the data source
url = "https://en.wikipedia.org/wiki/List_of_helicopter_prison_escapes"
data = data_from_url(url)
# Loop through the data to print the first 3 records
for row in data:
print(row[:3])
['August 19, 1971', 'Santa Martha Acatitla', 'Mexico'] ['October 31, 1973', 'Mountjoy Jail', 'Ireland'] ['May 24, 1978', 'United States Penitentiary, Marion', 'United States'] ['February 27, 1981', 'Fleury-Mérogis, Essonne, Ile de France', 'France'] ['May 7, 1981', 'Orsainville Prison, Quebec City', 'Canada'] ['January, 1983', 'Pentridge (HM Prison)', 'Australia'] ['December 19, 1985', 'Perry Correctional Institution, Pelzer, South Carolina', 'United States'] ['December 31, 1985', 'Cândido Mendes penitentiary, Ilha Grande, Rio de Janeiro', 'Brazil'] ['May 26, 1986', 'Prison de la Santé', 'France'] ['November 5, 1986', 'Federal Correctional Institution, Dublin', 'United States'] ['November 23, 1986', 'Prigione di Rebibbia, Roma', 'Italy'] ['December 10, 1987', 'Gartree (HM Prison)', 'United Kingdom'] ['July 11, 1988', 'Santa Fe prison', 'United States'] ['April 17, 1989', 'Federal Holding Facility, Miami, FL', 'United States'] ['August 19, 1989', 'Arkansas Valley Correctional Facility', 'United States'] ['June 19, 1990', 'Kent Penitentiary, British Columbia', 'Canada'] ['1991', 'Rio Piedras State Penitentiary, Puerto Rico', 'Puerto Rico'] ['1992', 'Lyon Prison', 'France'] ['December 1992', 'Touraine Central Prison, Tours', 'France'] ['June 17, 1993', 'Touraine Central Prison, Tours', 'France'] ['December 30, 1996', 'High Security Prison, Santiago', 'Chile'] ['September 18, 1997', 'De Geerhorst jail', 'Netherlands'] ['March 25, 1999', 'Metropolitan Remand and Reception Centre', 'Australia'] ['June 5, 2000', 'Martin Treatment Center for Sexually Violent Predators, Martin County Florida', 'United States'] ['2000', 'Lyon prison', 'France'] ['2001', 'Luynes prison', 'France'] ['March 24, 2001', 'Draguignan prison', 'France'] ['May 28, 2001', 'Fresnes prison', 'France'] ['January 17, 2002', 'Parada Neto Penitentiary', 'Brazil'] ['December 30, 2002', 'Las Cucharas prison, Puerto Rico', 'United States'] ['2003', 'Luynes prison', 'France'] ['July 2005', 'France', 'France'] ['December 10, 2005', 'Aiton Prison', 'France'] ['June 6, 2006', 'Korydallos Prison', 'Greece'] ['April 15, 2007', 'Lantin Prison, Liège', 'Belgium'] ['July 15, 2007', 'Grasse prison', 'France'] ['October 28, 2007', 'Ittre prison', 'Belgium'] ['February 22, 2009', 'Korydallos Prison', 'Greece'] ['April 27, 2009', 'Domenjod Prison, Réunion', 'France'] ['July 23, 2009', 'Bruges', 'Belgium'] ['June 25, 2010', 'HM Prison Isle of Wight, Isle of Wight', 'United Kingdom'] ['March 22, 2012', 'Sheksna, Penal colony N17', 'Russia'] ['February 24, 2013', 'Trikala Prison, Trikala', 'Greece'] ['March 17, 2013', 'Saint-Jérôme Detention Facility, Quebec', 'Canada'] ['June 7, 2014', 'Orsainville Detention Facility, Quebec', 'Canada'] ['February 22, 2016', 'Thiva', 'Greece'] ['July 1, 2018', 'Réau, near Paris', 'France'] ['September 25, 2020', 'Forest prison, Brussels', 'Belgium']
We initialize an index variable with the value of 0. The purpose of this variable is to help us track which row we're modifying.
# Remove the details Column as it not needed in the analysis
index = 0
for row in data:
data[index] = row[:-1]
index += 1
print(row[:3])
['September 25, 2020', 'Forest prison, Brussels', 'Belgium']
In the code cell below, we iterate over data using the iterable variable row and:
row[0]
, we refer to the first entry of row
, i.e., the date.date = fetch_year(row[0])
, we're extracting the year out of the date in row[0]
and assiging it to the variable date
.row[0]
with the year that we just extracted.# Since we don’t need the month and day, thus, extract the year
for row in data:
(row[0]) = fetch_year(row[0])
# Print to confirm that the year has been extracted
print(row)
[2020, 'Forest prison, Brussels', 'Belgium', 'No', 'Kristel A.']
min_year = min(data, key=lambda x: x[0])[0]
max_year = max(data, key=lambda x: x[0])[0]
Before we continuing , let's check what are the earliest and latest dates we have in our dataset.
print(min_year) # the minimum year
print(max_year) # the maximum year
1971 2020
Now we'll create a list of all the years ranging from min_year to max_year. Our goal is to then determine how many prison break attempts were there for each year. Since years in which there weren't any prison breaks aren't present in the dataset, this will make sure we capture them.
years = []
for y in range(min_year, max_year + 1):
years.append(y)
Check the years
to see if it gives what we want
print(years)
[1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]
The years look as expected
Now we create a list where each element looks like [<year>, 0]
.
attempts_per_year = []
for y in years:
attempts_per_year.append([y, 0])
for row in data:
for ya in attempts_per_year:
y = ya[0]
if row[0] == y:
ya[1] += 1
print(attempts_per_year)
[[1971, 1], [1972, 0], [1973, 1], [1974, 0], [1975, 0], [1976, 0], [1977, 0], [1978, 1], [1979, 0], [1980, 0], [1981, 2], [1982, 0], [1983, 1], [1984, 0], [1985, 2], [1986, 3], [1987, 1], [1988, 1], [1989, 2], [1990, 1], [1991, 1], [1992, 2], [1993, 1], [1994, 0], [1995, 0], [1996, 1], [1997, 1], [1998, 0], [1999, 1], [2000, 2], [2001, 3], [2002, 2], [2003, 1], [2004, 0], [2005, 2], [2006, 1], [2007, 3], [2008, 0], [2009, 3], [2010, 1], [2011, 0], [2012, 1], [2013, 2], [2014, 1], [2015, 0], [2016, 1], [2017, 0], [2018, 1], [2019, 0], [2020, 1]]
%matplotlib inline
barplot(attempts_per_year)
The years in which the most helicopter prison break attempts occurred were 1986, 2001, 2007 and 2009, with a total of three attempts each.
# countries_frequency = df["Country"].value_counts()
# print_pretty_table(countries_frequency)
frequency_by_countries = df["Country"].value_counts()
print_pretty_table(frequency_by_countries)
Country | Number of Occurrences |
---|---|
France | 15 |
United States | 8 |
Greece | 4 |
Canada | 4 |
Belgium | 4 |
United Kingdom | 2 |
Australia | 2 |
Brazil | 2 |
Russia | 1 |
Italy | 1 |
Puerto Rico | 1 |
Chile | 1 |
Netherlands | 1 |
Mexico | 1 |
Ireland | 1 |
df['Country'].value_counts().plot(kind='bar' , x = 'Country',y = 'No. of Escapes',title = 'Prison Breaks per Country',figsize = (12,6))
<matplotlib.axes._subplots.AxesSubplot at 0x7f9c3ed87700>
France has the highest number attempted helicopter prison escapes, follow by United States
df["Succeeded"].value_counts()
Yes 34 No 14 Name: Succeeded, dtype: int64
# Counting the number of successful and failed attempts
success_count = df.pivot_table(index ="Succeeded", values = "Country", aggfunc = "count")
success_count.columns = ["Counts"]
success_count.sort_index(inplace = True, ascending = False)
success_count
Counts | |
---|---|
Succeeded | |
Yes | 34 |
No | 14 |
plt.pie(data =success_count, x ="Counts", labels = ("Successful", "Not Successful"), autopct='%1.f%%')
plt.title("Percentages of Successful and Unsuccessful Helicopter Prison Breaks")
Text(0.5, 1.0, 'Percentages of Successful and Unsuccessful Helicopter Prison Breaks')
As depicted in the chart above, 71% (34) of the helicopter prison breaks recorded to be successful and the prisoners were able to escape with the helicopter, while 29% were recorded to be unsuccessful.
This project analysed the data of helicopter prison escapes from the year 1971 - 2020.
1: I can be deduced that the year 1986, 2001, 2007 and 2009 has the highest number of helicopter prison escapes with a total of three attempts each.
2: France has the highest number attempted helicopter prison escapes with 15 attempts, follow by United States with 8 attempst
3: The helicopter prison breaks have a higher chance of success in France
4: 71% (34) of the helicopter prison breaks recorded to be successful and the prisoners were able to escape with the helicopter, while 29% were recorded to be unsuccessful.
QUESTION: Which of these is correct, using We. or I while narting steps taken, e.g; Now we will create a list of all the years ranging from min_year to max_year
Thank you for reading and feel free to let me know your observations, corrections or suggestions.