One of the most controversial issues in the U.S. educational system is the effectiveness of standardized tests, and whether they're unfair to certain groups.
One of such exam is SAT. The SAT, or Scholastic Aptitude Test, is an exam that U.S. high school students take before applying to college. Colleges take the test scores into account when deciding who to admit, so it's fairly important to perform well on it.
In this project, we will collect data from various sources and determine whether SAT scores are unfair to certain groups.
New York City has a significant immigrant population and is very diverse, so comparing demographic factors such as race, income, and gender with SAT scores is a good way to determine whether the SAT is a fair test. For example, if certain racial groups consistently perform better on the SAT, we would have some evidence that the SAT is unfair.
For the purposes of this project, we'll be using data about New York City public schools, which can be found here.
Below are the datasets that we will be using:
SAT scores by school - SAT scores for each high school in New York City
School attendance - Attendance information for each school in New York City
Class size - Information on class size for each school
AP test results - Advanced Placement (AP) exam results for each high school (passing an optional AP exam in a particular subject can earn a student college credit in that subject)
Graduation outcomes - The percentage of students who graduated, and other outcome information
Demographics - Demographic information for each school
School survey - Surveys of parents, teachers, and students at each school
All of these data sets are interrelated. We'll combine all of them into a single data set combined which will be used in our analysis, to find correlations.
import pandas as pd
import numpy
import re
data_files = [
"ap_2010.csv",
"class_size.csv",
"demographics.csv",
"graduation.csv",
"hs_directory.csv",
"sat_results.csv"
]
data = {}
for f in data_files:
d = pd.read_csv("schools/{0}".format(f))
data[f.replace(".csv", "")] = d
for item in data:
print(item)
ap_2010 class_size demographics graduation hs_directory sat_results
data['ap_2010'].head()
DBN | SchoolName | AP Test Takers | Total Exams Taken | Number of Exams with scores 3 4 or 5 | |
---|---|---|---|---|---|
0 | 01M448 | UNIVERSITY NEIGHBORHOOD H.S. | 39 | 49 | 10 |
1 | 01M450 | EAST SIDE COMMUNITY HS | 19 | 21 | s |
2 | 01M515 | LOWER EASTSIDE PREP | 24 | 26 | 24 |
3 | 01M539 | NEW EXPLORATIONS SCI,TECH,MATH | 255 | 377 | 191 |
4 | 02M296 | High School of Hospitality Management | s | s | s |
data['class_size'].head()
CSD | BOROUGH | SCHOOL CODE | SCHOOL NAME | GRADE | PROGRAM TYPE | CORE SUBJECT (MS CORE and 9-12 ONLY) | CORE COURSE (MS CORE and 9-12 ONLY) | SERVICE CATEGORY(K-9* ONLY) | NUMBER OF STUDENTS / SEATS FILLED | NUMBER OF SECTIONS | AVERAGE CLASS SIZE | SIZE OF SMALLEST CLASS | SIZE OF LARGEST CLASS | DATA SOURCE | SCHOOLWIDE PUPIL-TEACHER RATIO | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | M | M015 | P.S. 015 Roberto Clemente | 0K | GEN ED | - | - | - | 19.0 | 1.0 | 19.0 | 19.0 | 19.0 | ATS | NaN |
1 | 1 | M | M015 | P.S. 015 Roberto Clemente | 0K | CTT | - | - | - | 21.0 | 1.0 | 21.0 | 21.0 | 21.0 | ATS | NaN |
2 | 1 | M | M015 | P.S. 015 Roberto Clemente | 01 | GEN ED | - | - | - | 17.0 | 1.0 | 17.0 | 17.0 | 17.0 | ATS | NaN |
3 | 1 | M | M015 | P.S. 015 Roberto Clemente | 01 | CTT | - | - | - | 17.0 | 1.0 | 17.0 | 17.0 | 17.0 | ATS | NaN |
4 | 1 | M | M015 | P.S. 015 Roberto Clemente | 02 | GEN ED | - | - | - | 15.0 | 1.0 | 15.0 | 15.0 | 15.0 | ATS | NaN |
data['demographics'].head()
DBN | Name | schoolyear | fl_percent | frl_percent | total_enrollment | prek | k | grade1 | grade2 | ... | black_num | black_per | hispanic_num | hispanic_per | white_num | white_per | male_num | male_per | female_num | female_per | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 01M015 | P.S. 015 ROBERTO CLEMENTE | 20052006 | 89.4 | NaN | 281 | 15 | 36 | 40 | 33 | ... | 74 | 26.3 | 189 | 67.3 | 5 | 1.8 | 158.0 | 56.2 | 123.0 | 43.8 |
1 | 01M015 | P.S. 015 ROBERTO CLEMENTE | 20062007 | 89.4 | NaN | 243 | 15 | 29 | 39 | 38 | ... | 68 | 28.0 | 153 | 63.0 | 4 | 1.6 | 140.0 | 57.6 | 103.0 | 42.4 |
2 | 01M015 | P.S. 015 ROBERTO CLEMENTE | 20072008 | 89.4 | NaN | 261 | 18 | 43 | 39 | 36 | ... | 77 | 29.5 | 157 | 60.2 | 7 | 2.7 | 143.0 | 54.8 | 118.0 | 45.2 |
3 | 01M015 | P.S. 015 ROBERTO CLEMENTE | 20082009 | 89.4 | NaN | 252 | 17 | 37 | 44 | 32 | ... | 75 | 29.8 | 149 | 59.1 | 7 | 2.8 | 149.0 | 59.1 | 103.0 | 40.9 |
4 | 01M015 | P.S. 015 ROBERTO CLEMENTE | 20092010 | 96.5 | 208 | 16 | 40 | 28 | 32 | ... | 67 | 32.2 | 118 | 56.7 | 6 | 2.9 | 124.0 | 59.6 | 84.0 | 40.4 |
5 rows × 38 columns
data['graduation'].head()
Demographic | DBN | School Name | Cohort | Total Cohort | Total Grads - n | Total Grads - % of cohort | Total Regents - n | Total Regents - % of cohort | Total Regents - % of grads | ... | Regents w/o Advanced - n | Regents w/o Advanced - % of cohort | Regents w/o Advanced - % of grads | Local - n | Local - % of cohort | Local - % of grads | Still Enrolled - n | Still Enrolled - % of cohort | Dropped Out - n | Dropped Out - % of cohort | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Total Cohort | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL | 2003 | 5 | s | s | s | s | s | ... | s | s | s | s | s | s | s | s | s | s |
1 | Total Cohort | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL | 2004 | 55 | 37 | 67.3% | 17 | 30.9% | 45.9% | ... | 17 | 30.9% | 45.9% | 20 | 36.4% | 54.1% | 15 | 27.3% | 3 | 5.5% |
2 | Total Cohort | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL | 2005 | 64 | 43 | 67.2% | 27 | 42.2% | 62.8% | ... | 27 | 42.2% | 62.8% | 16 | 25% | 37.200000000000003% | 9 | 14.1% | 9 | 14.1% |
3 | Total Cohort | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL | 2006 | 78 | 43 | 55.1% | 36 | 46.2% | 83.7% | ... | 36 | 46.2% | 83.7% | 7 | 9% | 16.3% | 16 | 20.5% | 11 | 14.1% |
4 | Total Cohort | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL | 2006 Aug | 78 | 44 | 56.4% | 37 | 47.4% | 84.1% | ... | 37 | 47.4% | 84.1% | 7 | 9% | 15.9% | 15 | 19.2% | 11 | 14.1% |
5 rows × 23 columns
data['hs_directory'].head()
dbn | school_name | boro | building_code | phone_number | fax_number | grade_span_min | grade_span_max | expgrade_span_min | expgrade_span_max | ... | priority02 | priority03 | priority04 | priority05 | priority06 | priority07 | priority08 | priority09 | priority10 | Location 1 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 17K548 | Brooklyn School for Music & Theatre | Brooklyn | K440 | 718-230-6250 | 718-230-6262 | 9 | 12 | NaN | NaN | ... | Then to New York City residents | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 883 Classon Avenue\nBrooklyn, NY 11225\n(40.67... |
1 | 09X543 | High School for Violin and Dance | Bronx | X400 | 718-842-0687 | 718-589-9849 | 9 | 12 | NaN | NaN | ... | Then to New York City residents who attend an ... | Then to Bronx students or residents | Then to New York City residents | NaN | NaN | NaN | NaN | NaN | NaN | 1110 Boston Road\nBronx, NY 10456\n(40.8276026... |
2 | 09X327 | Comprehensive Model School Project M.S. 327 | Bronx | X240 | 718-294-8111 | 718-294-8109 | 6 | 12 | NaN | NaN | ... | Then to Bronx students or residents who attend... | Then to New York City residents who attend an ... | Then to Bronx students or residents | Then to New York City residents | NaN | NaN | NaN | NaN | NaN | 1501 Jerome Avenue\nBronx, NY 10452\n(40.84241... |
3 | 02M280 | Manhattan Early College School for Advertising | Manhattan | M520 | 718-935-3477 | NaN | 9 | 10 | 9 | 14.0 | ... | Then to New York City residents who attend an ... | Then to Manhattan students or residents | Then to New York City residents | NaN | NaN | NaN | NaN | NaN | NaN | 411 Pearl Street\nNew York, NY 10038\n(40.7106... |
4 | 28Q680 | Queens Gateway to Health Sciences Secondary Sc... | Queens | Q695 | 718-969-3155 | 718-969-3552 | 6 | 12 | NaN | NaN | ... | Then to Districts 28 and 29 students or residents | Then to Queens students or residents | Then to New York City residents | NaN | NaN | NaN | NaN | NaN | NaN | 160-20 Goethals Avenue\nJamaica, NY 11432\n(40... |
5 rows × 58 columns
data['sat_results'].head()
DBN | SCHOOL NAME | Num of SAT Test Takers | SAT Critical Reading Avg. Score | SAT Math Avg. Score | SAT Writing Avg. Score | |
---|---|---|---|---|---|---|
0 | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES | 29 | 355 | 404 | 363 |
1 | 01M448 | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | 91 | 383 | 423 | 366 |
2 | 01M450 | EAST SIDE COMMUNITY SCHOOL | 70 | 377 | 402 | 370 |
3 | 01M458 | FORSYTH SATELLITE ACADEMY | 7 | 414 | 401 | 359 |
4 | 01M509 | MARTA VALLE HIGH SCHOOL | 44 | 390 | 433 | 384 |
Each data set appears to either have a DBN column, or the information we need to create one. After some processing and making DBN
column in each dataset, we can combine columns of each dataset together in a single data set.
all_survey = pd.read_csv("schools/survey_all.txt", delimiter="\t", encoding='windows-1252')
d75_survey = pd.read_csv("schools/survey_d75.txt", delimiter="\t", encoding='windows-1252')
survey = pd.concat([all_survey, d75_survey], axis=0, sort=True)
survey.head()
N_p | N_s | N_t | aca_p_11 | aca_s_11 | aca_t_11 | aca_tot_11 | bn | com_p_11 | com_s_11 | ... | t_q8c_1 | t_q8c_2 | t_q8c_3 | t_q8c_4 | t_q9 | t_q9_1 | t_q9_2 | t_q9_3 | t_q9_4 | t_q9_5 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 90.0 | NaN | 22.0 | 7.8 | NaN | 7.9 | 7.9 | M015 | 7.6 | NaN | ... | 29.0 | 67.0 | 5.0 | 0.0 | NaN | 5.0 | 14.0 | 52.0 | 24.0 | 5.0 |
1 | 161.0 | NaN | 34.0 | 7.8 | NaN | 9.1 | 8.4 | M019 | 7.6 | NaN | ... | 74.0 | 21.0 | 6.0 | 0.0 | NaN | 3.0 | 6.0 | 3.0 | 78.0 | 9.0 |
2 | 367.0 | NaN | 42.0 | 8.6 | NaN | 7.5 | 8.0 | M020 | 8.3 | NaN | ... | 33.0 | 35.0 | 20.0 | 13.0 | NaN | 3.0 | 5.0 | 16.0 | 70.0 | 5.0 |
3 | 151.0 | 145.0 | 29.0 | 8.5 | 7.4 | 7.8 | 7.9 | M034 | 8.2 | 5.9 | ... | 21.0 | 45.0 | 28.0 | 7.0 | NaN | 0.0 | 18.0 | 32.0 | 39.0 | 11.0 |
4 | 90.0 | NaN | 23.0 | 7.9 | NaN | 8.1 | 8.0 | M063 | 7.9 | NaN | ... | 59.0 | 36.0 | 5.0 | 0.0 | NaN | 10.0 | 5.0 | 10.0 | 60.0 | 15.0 |
5 rows × 2773 columns
survey['dbn'].head()
0 01M015 1 01M019 2 01M020 3 01M034 4 01M063 Name: dbn, dtype: object
There are two immediate facts that we can see in the data:
We'll need to filter the columns to remove the ones we don't need. Luckily, there's a data dictionary at the original data download location. The dictionary tells us what each column represents. We'll pick columns having aggregate survey data about how parents, teachers, and students feel about school safety, academic performance, and more.
survey["DBN"] = survey["dbn"]
survey_fields = [
"DBN",
"rr_s",
"rr_t",
"rr_p",
"N_s",
"N_t",
"N_p",
"saf_p_11",
"com_p_11",
"eng_p_11",
"aca_p_11",
"saf_t_11",
"com_t_11",
"eng_t_11",
"aca_t_11",
"saf_s_11",
"com_s_11",
"eng_s_11",
"aca_s_11",
"saf_tot_11",
"com_tot_11",
"eng_tot_11",
"aca_tot_11",
]
survey = survey.loc[:,survey_fields]
data["survey"] = survey
data['survey'].head()
DBN | rr_s | rr_t | rr_p | N_s | N_t | N_p | saf_p_11 | com_p_11 | eng_p_11 | ... | eng_t_11 | aca_t_11 | saf_s_11 | com_s_11 | eng_s_11 | aca_s_11 | saf_tot_11 | com_tot_11 | eng_tot_11 | aca_tot_11 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 01M015 | NaN | 88 | 60 | NaN | 22.0 | 90.0 | 8.5 | 7.6 | 7.5 | ... | 7.6 | 7.9 | NaN | NaN | NaN | NaN | 8.0 | 7.7 | 7.5 | 7.9 |
1 | 01M019 | NaN | 100 | 60 | NaN | 34.0 | 161.0 | 8.4 | 7.6 | 7.6 | ... | 8.9 | 9.1 | NaN | NaN | NaN | NaN | 8.5 | 8.1 | 8.2 | 8.4 |
2 | 01M020 | NaN | 88 | 73 | NaN | 42.0 | 367.0 | 8.9 | 8.3 | 8.3 | ... | 6.8 | 7.5 | NaN | NaN | NaN | NaN | 8.2 | 7.3 | 7.5 | 8.0 |
3 | 01M034 | 89.0 | 73 | 50 | 145.0 | 29.0 | 151.0 | 8.8 | 8.2 | 8.0 | ... | 6.8 | 7.8 | 6.2 | 5.9 | 6.5 | 7.4 | 7.3 | 6.7 | 7.1 | 7.9 |
4 | 01M063 | NaN | 100 | 60 | NaN | 23.0 | 90.0 | 8.7 | 7.9 | 8.1 | ... | 7.8 | 8.1 | NaN | NaN | NaN | NaN | 8.5 | 7.6 | 7.9 | 8.0 |
5 rows × 23 columns
When we explored all of the data sets, we noticed that some of them, like class_size
and hs_directory
, don't have a DBN
column.
hs_directory
does have a dbn
column, though, so we can just rename it.class_size
doesn't appear to have the column at all. For making up DBN
column, we will add CSD
and SCHOOL CODE
columns and add a leading 0 to the CSD
if the CSD
is less than two digits longdata["hs_directory"]["DBN"] = data["hs_directory"]["dbn"]
def pad_csd(num):
string_representation = str(num)
if len(string_representation) > 1:
return string_representation
else:
return "0" + string_representation
data["class_size"]["padded_csd"] = data["class_size"]["CSD"].apply(pad_csd)
data["class_size"]["DBN"] = data["class_size"]["padded_csd"] + data["class_size"]["SCHOOL CODE"]
print(data['hs_directory']['DBN'].head())
data["class_size"]['DBN'].head()
0 17K548 1 09X543 2 09X327 3 02M280 4 28Q680 Name: DBN, dtype: object
0 01M015 1 01M015 2 01M015 3 01M015 4 01M015 Name: DBN, dtype: object
ap_2010
from object (string) data type to numeric data type.sat_results
. And then make up a new column that totals up the SAT scores.Location 1
in the hs_directory
data set. We will need to use some regex alongwith string manipulation to extract them in two new columns. And then convert them to numeric format.cols = ['SAT Math Avg. Score', 'SAT Critical Reading Avg. Score', 'SAT Writing Avg. Score']
for c in cols:
data["sat_results"][c] = pd.to_numeric(data["sat_results"][c], errors="coerce")
data['sat_results']['sat_score'] = data['sat_results'][cols[0]] + data['sat_results'][cols[1]] + data['sat_results'][cols[2]]
cols = ['AP Test Takers ', 'Total Exams Taken', 'Number of Exams with scores 3 4 or 5']
for col in cols:
data["ap_2010"][col] = pd.to_numeric(data["ap_2010"][col], errors="coerce")
def find_lat(loc):
coords = re.findall("\(.+, .+\)", loc)
lat = coords[0].split(",")[0].replace("(", "")
return lat
def find_lon(loc):
coords = re.findall("\(.+, .+\)", loc)
lon = coords[0].split(",")[1].replace(")", "").strip()
return lon
data["hs_directory"]["lat"] = data["hs_directory"]["Location 1"].apply(find_lat)
data["hs_directory"]["lon"] = data["hs_directory"]["Location 1"].apply(find_lon)
data["hs_directory"]["lat"] = pd.to_numeric(data["hs_directory"]["lat"], errors="coerce")
data["hs_directory"]["lon"] = pd.to_numeric(data["hs_directory"]["lon"], errors="coerce")
DBN
value per dataset. This will help us when while combining the data sets.
class_size
: each school has multiple values forGRADE, PROGRAM TYPE
,CORE SUBJECT (MS CORE and 9-12 ONLY)
, andCORE COURSE (MS CORE and 9-12 ONLY)
.
- We will filter on `GRADE(09-12)` and `PROGRAM TYPE('GEN ED')` to get information about only classes where SAT has been taken.
- And, then we will use groupby method on `DBN` to get unique groups per school. Taking average on the resulting groupby object using agg() method will give unique values per `DBN`.
demographics
: Here, only column that prevents a given DBN from being unique isschoolyear
. We only want to select rows whereschoolyear
is20112012
. This will give us the most recent year of data, and also match our SAT results data.
graduation
: TheDemographic
andCohort
columns are what preventDBN
from being unique in the graduation data. ACohort
appears to refer to the year the data represents, and theDemographic
appears to refer to a specific demographic group. In this case, we want to pick data from the most recentCohort
available, which is2006
. We also want data from the full cohort, so we'll only pick rows whereDemographic
isTotal Cohort
.
class_size = data["class_size"]
class_size = class_size[class_size["GRADE "] == "09-12"]
class_size = class_size[class_size["PROGRAM TYPE"] == "GEN ED"]
class_size = class_size.groupby("DBN").agg(numpy.mean)
class_size.reset_index(inplace=True)
data["class_size"] = class_size
data["demographics"] = data["demographics"][data["demographics"]["schoolyear"] == 20112012]
data["graduation"] = data["graduation"][data["graduation"]["Cohort"] == "2006"]
data["graduation"] = data["graduation"][data["graduation"]["Demographic"] == "Total Cohort"]
sat_results
with one dataset, then with another, then the result of that with another dataset. We'll continue combining data sets in this way until we've merged all of them. Afterwards, we'll have roughly the same number of rows, but each row will have columns from all of the data sets.sat_results
while minimizing null values. Therefore, we will use different merge strategies.demographics
, for example.combined = data["sat_results"]
combined = combined.merge(data["ap_2010"], on="DBN", how="left")
combined = combined.merge(data["graduation"], on="DBN", how="left")
to_merge = ["class_size", "demographics", "survey", "hs_directory"]
for m in to_merge:
combined = combined.merge(data[m], on="DBN", how="inner")
combined = combined.fillna(combined.mean())
combined = combined.fillna(0)
pd.set_option('display.max_columns', 500)
combined.head()
DBN | SCHOOL NAME | Num of SAT Test Takers | SAT Critical Reading Avg. Score | SAT Math Avg. Score | SAT Writing Avg. Score | sat_score | SchoolName | AP Test Takers | Total Exams Taken | Number of Exams with scores 3 4 or 5 | Demographic | School Name | Cohort | Total Cohort | Total Grads - n | Total Grads - % of cohort | Total Regents - n | Total Regents - % of cohort | Total Regents - % of grads | Advanced Regents - n | Advanced Regents - % of cohort | Advanced Regents - % of grads | Regents w/o Advanced - n | Regents w/o Advanced - % of cohort | Regents w/o Advanced - % of grads | Local - n | Local - % of cohort | Local - % of grads | Still Enrolled - n | Still Enrolled - % of cohort | Dropped Out - n | Dropped Out - % of cohort | CSD | NUMBER OF STUDENTS / SEATS FILLED | NUMBER OF SECTIONS | AVERAGE CLASS SIZE | SIZE OF SMALLEST CLASS | SIZE OF LARGEST CLASS | SCHOOLWIDE PUPIL-TEACHER RATIO | Name | schoolyear | fl_percent | frl_percent | total_enrollment | prek | k | grade1 | grade2 | grade3 | grade4 | grade5 | grade6 | grade7 | grade8 | grade9 | grade10 | grade11 | grade12 | ell_num | ell_percent | sped_num | sped_percent | ctt_num | selfcontained_num | asian_num | asian_per | black_num | black_per | hispanic_num | hispanic_per | white_num | white_per | male_num | male_per | female_num | female_per | rr_s | rr_t | rr_p | N_s | N_t | N_p | saf_p_11 | com_p_11 | eng_p_11 | aca_p_11 | saf_t_11 | com_t_11 | eng_t_11 | aca_t_11 | saf_s_11 | com_s_11 | eng_s_11 | aca_s_11 | saf_tot_11 | com_tot_11 | eng_tot_11 | aca_tot_11 | dbn | school_name | boro | building_code | phone_number | fax_number | grade_span_min | grade_span_max | expgrade_span_min | expgrade_span_max | bus | subway | primary_address_line_1 | city | state_code | zip | website | total_students | campus_name | school_type | overview_paragraph | program_highlights | language_classes | advancedplacement_courses | online_ap_courses | online_language_courses | extracurricular_activities | psal_sports_boys | psal_sports_girls | psal_sports_coed | school_sports | partner_cbo | partner_hospital | partner_highered | partner_cultural | partner_nonprofit | partner_corporate | partner_financial | partner_other | addtl_info1 | addtl_info2 | start_time | end_time | se_services | ell_programs | school_accessibility_description | number_programs | priority01 | priority02 | priority03 | priority04 | priority05 | priority06 | priority07 | priority08 | priority09 | priority10 | Location 1 | lat | lon | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES | 29 | 355.0 | 404.0 | 363.0 | 1122.0 | 0 | 129.028846 | 197.038462 | 153.45 | Total Cohort | HENRY STREET SCHOOL FOR INTERNATIONAL | 2006 | 78.0 | 43 | 55.1% | 36 | 46.2% | 83.7% | 0 | 0% | 0% | 36 | 46.2% | 83.7% | 7 | 9% | 16.3% | 16 | 20.5% | 11 | 14.1% | 1 | 88.000000 | 4.000000 | 22.564286 | 18.500000 | 26.571429 | 0.0 | HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES | 20112012 | 0 | 88.6 | 422 | 32 | 33 | 50 | 98 | 79 | 80 | 50 | 94.0 | 22.3 | 105.0 | 24.9 | 34 | 35 | 59 | 14.0 | 123 | 29.1 | 227 | 53.8 | 7 | 1.7 | 259.0 | 61.4 | 163.0 | 38.6 | 89.0 | 70 | 39 | 379.000000 | 26.0 | 151.0 | 7.8 | 7.7 | 7.4 | 7.6 | 6.3 | 5.3 | 6.1 | 6.5 | 6.000000 | 5.600000 | 6.100000 | 6.700000 | 6.7 | 6.2 | 6.6 | 7.0 | 01M292 | Henry Street School for International Studies | Manhattan | M056 | 212-406-9411 | 212-406-9417 | 6 | 12 | 0 | 12.0 | B39, M14A, M14D, M15, M15-SBS, M21, M22, M9 | B, D to Grand St ; F to East Broadway ; J, M, ... | 220 Henry Street | New York | NY | 10002 | http://schools.nyc.gov/schoolportals/01/M292 | 323.0 | 0 | 0 | Henry Street School for International Studies ... | Global/International Studies in core subjects,... | Chinese (Mandarin), Spanish | Psychology | Chinese Language and Culture, Spanish Literatu... | Chinese (Mandarin), Spanish | Math through Card Play; Art, Poetry/Spoken Wor... | Basketball | Softball | Soccer | Boxing, Track, CHAMPS, Tennis, Flag Football, ... | The Henry Street Settlement; Asia Society; Ame... | Gouverneur Hospital (Turning Points) | New York University | Asia Society | Heart of America Foundation | 0 | 0 | United Nations | 0 | 0 | 8:30 AM | 3:30 PM | This school will provide students with disabil... | ESL | Functionally Accessible | 1 | Priority to continuing 8th graders | Then to Manhattan students or residents who at... | Then to New York City residents who attend an ... | Then to Manhattan students or residents | Then to New York City residents | 0 | 0 | 0 | 0 | 0 | 220 Henry Street\nNew York, NY 10002\n(40.7137... | 40.713764 | -73.985260 | |||||||
1 | 01M448 | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | 91 | 383.0 | 423.0 | 366.0 | 1172.0 | UNIVERSITY NEIGHBORHOOD H.S. | 39.000000 | 49.000000 | 10.00 | Total Cohort | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | 2006 | 124.0 | 53 | 42.7% | 42 | 33.9% | 79.2% | 8 | 6.5% | 15.1% | 34 | 27.4% | 64.2% | 11 | 8.9% | 20.8% | 46 | 37.1% | 20 | 16.100000000000001% | 1 | 105.687500 | 4.750000 | 22.231250 | 18.250000 | 27.062500 | 0.0 | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | 20112012 | 0 | 71.8 | 394 | 109 | 97 | 93 | 95 | 83.0 | 21.1 | 86.0 | 21.8 | 55 | 10 | 115 | 29.2 | 89 | 22.6 | 181 | 45.9 | 9 | 2.3 | 226.0 | 57.4 | 168.0 | 42.6 | 84.0 | 95 | 10 | 385.000000 | 37.0 | 46.0 | 7.9 | 7.4 | 7.2 | 7.3 | 6.6 | 5.8 | 6.6 | 7.3 | 6.000000 | 5.700000 | 6.300000 | 7.000000 | 6.8 | 6.3 | 6.7 | 7.2 | 01M448 | University Neighborhood High School | Manhattan | M446 | 212-962-4341 | 212-267-5611 | 9 | 12 | 0 | 12.0 | M14A, M14D, M15, M21, M22, M9 | F to East Broadway ; J, M, Z to Delancey St-Es... | 200 Monroe Street | New York | NY | 10002 | www.universityneighborhoodhs.com | 299.0 | 0 | 0 | University Neighborhood High School (UNHS) is ... | While attending UNHS, students can earn up to ... | Chinese, Spanish | Calculus AB, Chinese Language and Culture, Eng... | 0 | Chinese (Cantonese), Chinese (Mandarin), Spanish | Basketball, Badminton, Handball, Glee, Dance, ... | Baseball, Basketball, Bowling, Cross Country, ... | Basketball, Bowling, Cross Country, Softball, ... | 0 | 0 | Grand Street Settlement, Henry Street Settleme... | Gouverneur Hospital, The Door, The Mount Sinai... | New York University, CUNY Baruch College, Pars... | Dance Film Association, Dance Makers Film Work... | W!SE, Big Brothers Big Sisters, Peer Health Ex... | Deloitte LLP Consulting and Financial Services... | 0 | Movement Research | Incoming students are expected to attend schoo... | Community Service Requirement, Dress Code Requ... | 8:15 AM | 3:15 PM | This school will provide students with disabil... | ESL | Not Functionally Accessible | 3 | Open to New York City residents | For M35B only: Open only to students whose hom... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 200 Monroe Street\nNew York, NY 10002\n(40.712... | 40.712332 | -73.984797 | ||||||||||
2 | 01M450 | EAST SIDE COMMUNITY SCHOOL | 70 | 377.0 | 402.0 | 370.0 | 1149.0 | EAST SIDE COMMUNITY HS | 19.000000 | 21.000000 | 153.45 | Total Cohort | EAST SIDE COMMUNITY SCHOOL | 2006 | 90.0 | 70 | 77.8% | 67 | 74.400000000000006% | 95.7% | 0 | 0% | 0% | 67 | 74.400000000000006% | 95.7% | 3 | 3.3% | 4.3% | 15 | 16.7% | 5 | 5.6% | 1 | 57.600000 | 2.733333 | 21.200000 | 19.400000 | 22.866667 | 0.0 | EAST SIDE COMMUNITY HIGH SCHOOL | 20112012 | 0 | 71.8 | 598 | 92 | 73 | 76 | 101 | 93 | 77 | 86 | 30.0 | 5.0 | 158.0 | 26.4 | 91 | 19 | 58 | 9.7 | 143 | 23.9 | 331 | 55.4 | 62 | 10.4 | 327.0 | 54.7 | 271.0 | 45.3 | 0.0 | 98 | 28 | 598.208333 | 42.0 | 150.0 | 8.7 | 8.2 | 8.1 | 8.4 | 7.3 | 8.0 | 8.0 | 8.8 | 6.611667 | 6.094722 | 6.620278 | 7.381389 | 7.9 | 7.9 | 7.9 | 8.4 | 01M450 | East Side Community School | Manhattan | M060 | 212-460-8467 | 212-260-9657 | 6 | 12 | 0 | 12.0 | M101, M102, M103, M14A, M14D, M15, M15-SBS, M2... | 6 to Astor Place ; L to 1st Ave | 420 East 12 Street | New York | NY | 10009 | www.eschs.org | 649.0 | 0 | Consortium School | We are a small 6-12 secondary school that prep... | Our Advisory System ensures that we can effect... | 0 | Calculus AB, English Literature and Composition | 0 | American Sign Language, Arabic, Chinese (Manda... | After-School Tutoring, Art Portfolio Classes, ... | Baseball, Basketball, Soccer | Basketball, Soccer, Softball | 0 | Basketball, Bicycling, Fitness, Flag Football,... | University Settlement, Big Brothers Big Sister... | 0 | Columbia Teachers College, New York University... | , Internship Program, Loisaida Art Gallery loc... | College Bound Initiative, Center for Collabora... | Prudential Securities, Moore Capital, Morgan S... | 0 | Brooklyn Boulders (Rock Climbing) | Students present and defend their work to comm... | Our school requires an Academic Portfolio for ... | 8:30 AM | 3:30 PM | This school will provide students with disabil... | ESL | Not Functionally Accessible | 1 | Priority to continuing 8th graders | Then to New York City residents | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 420 East 12 Street\nNew York, NY 10009\n(40.72... | 40.729783 | -73.983041 | |||||||
3 | 01M509 | MARTA VALLE HIGH SCHOOL | 44 | 390.0 | 433.0 | 384.0 | 1207.0 | 0 | 129.028846 | 197.038462 | 153.45 | Total Cohort | MARTA VALLE HIGH SCHOOL | 2006 | 84.0 | 47 | 56% | 40 | 47.6% | 85.1% | 17 | 20.2% | 36.200000000000003% | 23 | 27.4% | 48.9% | 7 | 8.300000000000001% | 14.9% | 25 | 29.8% | 5 | 6% | 1 | 69.642857 | 3.000000 | 23.571429 | 20.000000 | 27.357143 | 0.0 | MARTA VALLE SECONDARY SCHOOL | 20112012 | 0 | 80.7 | 367 | 143 | 100 | 51 | 73 | 41.0 | 11.2 | 95.0 | 25.9 | 28 | 36 | 34 | 9.3 | 116 | 31.6 | 209 | 56.9 | 6 | 1.6 | 170.0 | 46.3 | 197.0 | 53.7 | 90.0 | 100 | 21 | 306.000000 | 29.0 | 69.0 | 7.7 | 7.4 | 7.2 | 7.3 | 6.4 | 5.3 | 6.1 | 6.8 | 6.400000 | 5.900000 | 6.400000 | 7.000000 | 6.9 | 6.2 | 6.6 | 7.0 | 01M509 | Marta Valle High School | Manhattan | M025 | 212-473-8152 | 212-475-7588 | 9 | 12 | 0 | 12.0 | B39, M103, M14A, M14D, M15, M15-SBS, M21, M22,... | B, D to Grand St ; F, J, M, Z to Delancey St-E... | 145 Stanton Street | New York | NY | 10002 | www.martavalle.org | 401.0 | 0 | 0 | Marta Valle High School (MVHS) offers a strong... | Advanced Regents Diploma, Early Graduation, up... | French, Spanish | English Literature and Composition, Studio Art... | 0 | Spanish | Model Peer Leadership Program, 'The Vine' Stud... | Basketball, Rugby | Rugby, Volleyball | Rugby | Volleyball, Zumba | NYCDOE Innovation Zone Lab Site, Grand Street ... | Gouvenuer's Hospital | New York University (NYU), Sarah Lawrence Coll... | Young Audiences, The National Arts Club, Educa... | College for Every Student (CFES), Morningside ... | Estée Lauder | Bank of America | CASALEAP, Beacon | Students Dress for Success, Summer Bridge to S... | Community Service Requirement, Extended Day Pr... | 8:00 AM | 3:30 PM | This school will provide students with disabil... | ESL | Functionally Accessible | 1 | Priority to District 1 students or residents | Then to Manhattan students or residents | Then to New York City residents | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 145 Stanton Street\nNew York, NY 10002\n(40.72... | 40.720569 | -73.985673 | ||||||||||
4 | 01M539 | NEW EXPLORATIONS INTO SCIENCE, TECHNOLOGY AND ... | 159 | 522.0 | 574.0 | 525.0 | 1621.0 | NEW EXPLORATIONS SCI,TECH,MATH | 255.000000 | 377.000000 | 191.00 | Total Cohort | NEW EXPLORATIONS INTO SCIENCE TECHNO | 2006 | 46.0 | 46 | 100% | 46 | 100% | 100% | 31 | 67.400000000000006% | 67.400000000000006% | 15 | 32.6% | 32.6% | 0 | 0% | 0% | 0 | 0% | 0 | 0% | 1 | 156.368421 | 6.157895 | 25.510526 | 19.473684 | 31.210526 | 0.0 | NEW EXPLORATIONS INTO SCIENCE TECH AND MATH | 20112012 | 0 | 23.0 | 1613 | 100 | 107 | 139 | 110 | 114 | 107 | 149 | 126 | 117 | 117 | 123 | 147 | 157 | 4.0 | 0.2 | 43.0 | 2.7 | 2 | 0 | 448 | 27.8 | 189 | 11.7 | 229 | 14.2 | 725 | 44.9 | 794.0 | 49.2 | 819.0 | 50.8 | 98.0 | 68 | 51 | 923.000000 | 67.0 | 736.0 | 8.5 | 7.9 | 7.9 | 8.4 | 7.6 | 5.6 | 5.9 | 7.3 | 7.300000 | 6.400000 | 7.000000 | 7.700000 | 7.8 | 6.7 | 6.9 | 7.8 | 01M539 | New Explorations into Science, Technology and ... | Manhattan | M022 | 212-677-5190 | 212-260-8124 | K | 12 | 0 | 12.0 | B39, M14A, M14D, M21, M22, M8, M9 | F, J, M, Z to Delancey St-Essex St | 111 Columbia Street | New York | NY | 10002 | www.nestmk12.net | 1725.0 | 0 | 0 | New Explorations into Science, Technology and ... | 1st level science sequence - 9th grade: Regent... | Chinese (Mandarin), French, Italian, Latin, Sp... | Biology, Calculus AB, Calculus BC, Chemistry, ... | 0 | 0 | After-school Jazz Band, Annual Coffee House Co... | Basketball, Fencing, Indoor Track | Basketball, Fencing, Indoor Track | 0 | Badminton, Baseball, Cross-Country, Dance, Out... | 7th Precinct Community Affairs, NYCWastele$$, ... | 0 | Hunter College, New York University, Cornell U... | VH1, Dancing Classrooms, Center for Arts Educa... | After 3 | Time Warner Cable, Google, IBM, MET Project, S... | 0 | 0 | Dress Code Required: Business Casual - shirt/b... | 0 | 8:15 AM | 4:00 PM | This school will provide students with disabil... | ESL | Not Functionally Accessible | 1 | Priority to continuing 8th graders | Then to New York City residents | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 111 Columbia Street\nNew York, NY 10002\n(40.7... | 40.718725 | -73.979426 |
Now, we have a clean data set on which we can base our analysis. Mapping the statistics out on a school district level might be an interesting way to analyze them. Adding a column to the data set that specifies the school district will help us accomplish this.
def get_first_two_chars(dbn):
return dbn[0:2]
combined["school_dist"] = combined["DBN"].apply(get_first_two_chars)
combined.head(3)
DBN | SCHOOL NAME | Num of SAT Test Takers | SAT Critical Reading Avg. Score | SAT Math Avg. Score | SAT Writing Avg. Score | sat_score | SchoolName | AP Test Takers | Total Exams Taken | Number of Exams with scores 3 4 or 5 | Demographic | School Name | Cohort | Total Cohort | Total Grads - n | Total Grads - % of cohort | Total Regents - n | Total Regents - % of cohort | Total Regents - % of grads | Advanced Regents - n | Advanced Regents - % of cohort | Advanced Regents - % of grads | Regents w/o Advanced - n | Regents w/o Advanced - % of cohort | Regents w/o Advanced - % of grads | Local - n | Local - % of cohort | Local - % of grads | Still Enrolled - n | Still Enrolled - % of cohort | Dropped Out - n | Dropped Out - % of cohort | CSD | NUMBER OF STUDENTS / SEATS FILLED | NUMBER OF SECTIONS | AVERAGE CLASS SIZE | SIZE OF SMALLEST CLASS | SIZE OF LARGEST CLASS | SCHOOLWIDE PUPIL-TEACHER RATIO | Name | schoolyear | fl_percent | frl_percent | total_enrollment | prek | k | grade1 | grade2 | grade3 | grade4 | grade5 | grade6 | grade7 | grade8 | grade9 | grade10 | grade11 | grade12 | ell_num | ell_percent | sped_num | sped_percent | ctt_num | selfcontained_num | asian_num | asian_per | black_num | black_per | hispanic_num | hispanic_per | white_num | white_per | male_num | male_per | female_num | female_per | rr_s | rr_t | rr_p | N_s | N_t | N_p | saf_p_11 | com_p_11 | eng_p_11 | aca_p_11 | saf_t_11 | com_t_11 | eng_t_11 | aca_t_11 | saf_s_11 | com_s_11 | eng_s_11 | aca_s_11 | saf_tot_11 | com_tot_11 | eng_tot_11 | aca_tot_11 | dbn | school_name | boro | building_code | phone_number | fax_number | grade_span_min | grade_span_max | expgrade_span_min | expgrade_span_max | bus | subway | primary_address_line_1 | city | state_code | zip | website | total_students | campus_name | school_type | overview_paragraph | program_highlights | language_classes | advancedplacement_courses | online_ap_courses | online_language_courses | extracurricular_activities | psal_sports_boys | psal_sports_girls | psal_sports_coed | school_sports | partner_cbo | partner_hospital | partner_highered | partner_cultural | partner_nonprofit | partner_corporate | partner_financial | partner_other | addtl_info1 | addtl_info2 | start_time | end_time | se_services | ell_programs | school_accessibility_description | number_programs | priority01 | priority02 | priority03 | priority04 | priority05 | priority06 | priority07 | priority08 | priority09 | priority10 | Location 1 | lat | lon | school_dist | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES | 29 | 355.0 | 404.0 | 363.0 | 1122.0 | 0 | 129.028846 | 197.038462 | 153.45 | Total Cohort | HENRY STREET SCHOOL FOR INTERNATIONAL | 2006 | 78.0 | 43 | 55.1% | 36 | 46.2% | 83.7% | 0 | 0% | 0% | 36 | 46.2% | 83.7% | 7 | 9% | 16.3% | 16 | 20.5% | 11 | 14.1% | 1 | 88.0000 | 4.000000 | 22.564286 | 18.50 | 26.571429 | 0.0 | HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES | 20112012 | 0 | 88.6 | 422 | 32 | 33 | 50 | 98 | 79 | 80 | 50 | 94.0 | 22.3 | 105.0 | 24.9 | 34 | 35 | 59 | 14.0 | 123 | 29.1 | 227 | 53.8 | 7 | 1.7 | 259.0 | 61.4 | 163.0 | 38.6 | 89.0 | 70 | 39 | 379.000000 | 26.0 | 151.0 | 7.8 | 7.7 | 7.4 | 7.6 | 6.3 | 5.3 | 6.1 | 6.5 | 6.000000 | 5.600000 | 6.100000 | 6.700000 | 6.7 | 6.2 | 6.6 | 7.0 | 01M292 | Henry Street School for International Studies | Manhattan | M056 | 212-406-9411 | 212-406-9417 | 6 | 12 | 0 | 12.0 | B39, M14A, M14D, M15, M15-SBS, M21, M22, M9 | B, D to Grand St ; F to East Broadway ; J, M, ... | 220 Henry Street | New York | NY | 10002 | http://schools.nyc.gov/schoolportals/01/M292 | 323.0 | 0 | 0 | Henry Street School for International Studies ... | Global/International Studies in core subjects,... | Chinese (Mandarin), Spanish | Psychology | Chinese Language and Culture, Spanish Literatu... | Chinese (Mandarin), Spanish | Math through Card Play; Art, Poetry/Spoken Wor... | Basketball | Softball | Soccer | Boxing, Track, CHAMPS, Tennis, Flag Football, ... | The Henry Street Settlement; Asia Society; Ame... | Gouverneur Hospital (Turning Points) | New York University | Asia Society | Heart of America Foundation | 0 | 0 | United Nations | 0 | 0 | 8:30 AM | 3:30 PM | This school will provide students with disabil... | ESL | Functionally Accessible | 1 | Priority to continuing 8th graders | Then to Manhattan students or residents who at... | Then to New York City residents who attend an ... | Then to Manhattan students or residents | Then to New York City residents | 0 | 0 | 0 | 0 | 0 | 220 Henry Street\nNew York, NY 10002\n(40.7137... | 40.713764 | -73.985260 | 01 | |||||||
1 | 01M448 | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | 91 | 383.0 | 423.0 | 366.0 | 1172.0 | UNIVERSITY NEIGHBORHOOD H.S. | 39.000000 | 49.000000 | 10.00 | Total Cohort | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | 2006 | 124.0 | 53 | 42.7% | 42 | 33.9% | 79.2% | 8 | 6.5% | 15.1% | 34 | 27.4% | 64.2% | 11 | 8.9% | 20.8% | 46 | 37.1% | 20 | 16.100000000000001% | 1 | 105.6875 | 4.750000 | 22.231250 | 18.25 | 27.062500 | 0.0 | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | 20112012 | 0 | 71.8 | 394 | 109 | 97 | 93 | 95 | 83.0 | 21.1 | 86.0 | 21.8 | 55 | 10 | 115 | 29.2 | 89 | 22.6 | 181 | 45.9 | 9 | 2.3 | 226.0 | 57.4 | 168.0 | 42.6 | 84.0 | 95 | 10 | 385.000000 | 37.0 | 46.0 | 7.9 | 7.4 | 7.2 | 7.3 | 6.6 | 5.8 | 6.6 | 7.3 | 6.000000 | 5.700000 | 6.300000 | 7.000000 | 6.8 | 6.3 | 6.7 | 7.2 | 01M448 | University Neighborhood High School | Manhattan | M446 | 212-962-4341 | 212-267-5611 | 9 | 12 | 0 | 12.0 | M14A, M14D, M15, M21, M22, M9 | F to East Broadway ; J, M, Z to Delancey St-Es... | 200 Monroe Street | New York | NY | 10002 | www.universityneighborhoodhs.com | 299.0 | 0 | 0 | University Neighborhood High School (UNHS) is ... | While attending UNHS, students can earn up to ... | Chinese, Spanish | Calculus AB, Chinese Language and Culture, Eng... | 0 | Chinese (Cantonese), Chinese (Mandarin), Spanish | Basketball, Badminton, Handball, Glee, Dance, ... | Baseball, Basketball, Bowling, Cross Country, ... | Basketball, Bowling, Cross Country, Softball, ... | 0 | 0 | Grand Street Settlement, Henry Street Settleme... | Gouverneur Hospital, The Door, The Mount Sinai... | New York University, CUNY Baruch College, Pars... | Dance Film Association, Dance Makers Film Work... | W!SE, Big Brothers Big Sisters, Peer Health Ex... | Deloitte LLP Consulting and Financial Services... | 0 | Movement Research | Incoming students are expected to attend schoo... | Community Service Requirement, Dress Code Requ... | 8:15 AM | 3:15 PM | This school will provide students with disabil... | ESL | Not Functionally Accessible | 3 | Open to New York City residents | For M35B only: Open only to students whose hom... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 200 Monroe Street\nNew York, NY 10002\n(40.712... | 40.712332 | -73.984797 | 01 | ||||||||||
2 | 01M450 | EAST SIDE COMMUNITY SCHOOL | 70 | 377.0 | 402.0 | 370.0 | 1149.0 | EAST SIDE COMMUNITY HS | 19.000000 | 21.000000 | 153.45 | Total Cohort | EAST SIDE COMMUNITY SCHOOL | 2006 | 90.0 | 70 | 77.8% | 67 | 74.400000000000006% | 95.7% | 0 | 0% | 0% | 67 | 74.400000000000006% | 95.7% | 3 | 3.3% | 4.3% | 15 | 16.7% | 5 | 5.6% | 1 | 57.6000 | 2.733333 | 21.200000 | 19.40 | 22.866667 | 0.0 | EAST SIDE COMMUNITY HIGH SCHOOL | 20112012 | 0 | 71.8 | 598 | 92 | 73 | 76 | 101 | 93 | 77 | 86 | 30.0 | 5.0 | 158.0 | 26.4 | 91 | 19 | 58 | 9.7 | 143 | 23.9 | 331 | 55.4 | 62 | 10.4 | 327.0 | 54.7 | 271.0 | 45.3 | 0.0 | 98 | 28 | 598.208333 | 42.0 | 150.0 | 8.7 | 8.2 | 8.1 | 8.4 | 7.3 | 8.0 | 8.0 | 8.8 | 6.611667 | 6.094722 | 6.620278 | 7.381389 | 7.9 | 7.9 | 7.9 | 8.4 | 01M450 | East Side Community School | Manhattan | M060 | 212-460-8467 | 212-260-9657 | 6 | 12 | 0 | 12.0 | M101, M102, M103, M14A, M14D, M15, M15-SBS, M2... | 6 to Astor Place ; L to 1st Ave | 420 East 12 Street | New York | NY | 10009 | www.eschs.org | 649.0 | 0 | Consortium School | We are a small 6-12 secondary school that prep... | Our Advisory System ensures that we can effect... | 0 | Calculus AB, English Literature and Composition | 0 | American Sign Language, Arabic, Chinese (Manda... | After-School Tutoring, Art Portfolio Classes, ... | Baseball, Basketball, Soccer | Basketball, Soccer, Softball | 0 | Basketball, Bicycling, Fitness, Flag Football,... | University Settlement, Big Brothers Big Sister... | 0 | Columbia Teachers College, New York University... | , Internship Program, Loisaida Art Gallery loc... | College Bound Initiative, Center for Collabora... | Prudential Securities, Moore Capital, Morgan S... | 0 | Brooklyn Boulders (Rock Climbing) | Students present and defend their work to comm... | Our school requires an Academic Portfolio for ... | 8:30 AM | 3:30 PM | This school will provide students with disabil... | ESL | Not Functionally Accessible | 1 | Priority to continuing 8th graders | Then to New York City residents | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 420 East 12 Street\nNew York, NY 10009\n(40.72... | 40.729783 | -73.983041 | 01 |
correlations = combined.corr()
correlations = correlations["sat_score"]
print(correlations)
SAT Critical Reading Avg. Score 0.986820 SAT Math Avg. Score 0.972643 SAT Writing Avg. Score 0.987771 sat_score 1.000000 AP Test Takers 0.523140 Total Exams Taken 0.514333 Number of Exams with scores 3 4 or 5 0.463245 Total Cohort 0.325144 CSD 0.042948 NUMBER OF STUDENTS / SEATS FILLED 0.394626 NUMBER OF SECTIONS 0.362673 AVERAGE CLASS SIZE 0.381014 SIZE OF SMALLEST CLASS 0.249949 SIZE OF LARGEST CLASS 0.314434 SCHOOLWIDE PUPIL-TEACHER RATIO NaN schoolyear NaN fl_percent NaN frl_percent -0.722225 total_enrollment 0.367857 ell_num -0.153778 ell_percent -0.398750 sped_num 0.034933 sped_percent -0.448170 asian_num 0.475445 asian_per 0.570730 black_num 0.027979 black_per -0.284139 hispanic_num 0.025744 hispanic_per -0.396985 white_num 0.449559 ... rr_p 0.047925 N_s 0.423463 N_t 0.291463 N_p 0.421530 saf_p_11 0.122913 com_p_11 -0.115073 eng_p_11 0.020254 aca_p_11 0.035155 saf_t_11 0.313810 com_t_11 0.082419 eng_t_11 0.036906 aca_t_11 0.132348 saf_s_11 0.337639 com_s_11 0.187370 eng_s_11 0.213822 aca_s_11 0.339435 saf_tot_11 0.318753 com_tot_11 0.077310 eng_tot_11 0.100102 aca_tot_11 0.190966 grade_span_max NaN expgrade_span_max NaN zip -0.063977 total_students 0.407827 number_programs 0.117012 priority08 NaN priority09 NaN priority10 NaN lat -0.121029 lon -0.132222 Name: sat_score, Length: 67, dtype: float64
# Remove DBN since it's a unique identifier, not a useful numerical value for correlation.
survey_fields.remove("DBN")
import matplotlib.pyplot as plt
combined.corr()['sat_score'][survey_fields].plot.bar()
plt.show()
<Figure size 640x480 with 1 Axes>
Key Observations:
N_s
, N_t
, N_p
correlate highly with sat_score
. Since these fields are directly related to total enrollment, this can be understood.rr_s
(Student Response Rate) however is the more interesting point. Since, students who excel academically are more likely to respond to a survey regarding sat_score
.saf_t_11
, saf_s_11
, saf_tot_11
gives us another good perspective about the relationship between a safe environment and academic brilliance. Schools where Students and Teachers feel safe are more likely to have good sat_score
.aca_s_11
i.e. how students percieve academic standards correlates highly for sat_score
. However, same is not true for aca_p_11
and aca_t_11
, i.e. how Parents and Teachers perceive academic standards.combined.plot.scatter('saf_s_11', 'sat_score')
plt.show()
From the Scatter Plot, we can observe that there seems to be a positive correlation between the sat_score
and saf_s_11
, although its not that strong.
districts = combined.groupby('school_dist').agg(numpy.mean)
districts.reset_index(inplace=True)
districts
school_dist | SAT Critical Reading Avg. Score | SAT Math Avg. Score | SAT Writing Avg. Score | sat_score | AP Test Takers | Total Exams Taken | Number of Exams with scores 3 4 or 5 | Total Cohort | CSD | NUMBER OF STUDENTS / SEATS FILLED | NUMBER OF SECTIONS | AVERAGE CLASS SIZE | SIZE OF SMALLEST CLASS | SIZE OF LARGEST CLASS | SCHOOLWIDE PUPIL-TEACHER RATIO | schoolyear | fl_percent | frl_percent | total_enrollment | ell_num | ell_percent | sped_num | sped_percent | asian_num | asian_per | black_num | black_per | hispanic_num | hispanic_per | white_num | white_per | male_num | male_per | female_num | female_per | rr_s | rr_t | rr_p | N_s | N_t | N_p | saf_p_11 | com_p_11 | eng_p_11 | aca_p_11 | saf_t_11 | com_t_11 | eng_t_11 | aca_t_11 | saf_s_11 | com_s_11 | eng_s_11 | aca_s_11 | saf_tot_11 | com_tot_11 | eng_tot_11 | aca_tot_11 | grade_span_max | expgrade_span_max | zip | total_students | number_programs | priority08 | priority09 | priority10 | lat | lon | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 01 | 441.833333 | 473.333333 | 439.333333 | 1354.500000 | 116.681090 | 173.019231 | 135.800000 | 93.500000 | 1.0 | 115.244241 | 5.148538 | 22.675415 | 18.798392 | 26.553044 | 0.0 | 20112012.0 | 0.0 | 58.983333 | 668.500000 | 42.166667 | 10.000000 | 82.000000 | 17.083333 | 134.500000 | 17.516667 | 125.500000 | 22.333333 | 214.833333 | 40.733333 | 186.000000 | 18.450000 | 328.166667 | 50.050000 | 340.333333 | 49.950000 | 76.500000 | 85.333333 | 33.166667 | 525.368056 | 38.500000 | 239.166667 | 8.233333 | 7.800000 | 7.683333 | 8.016667 | 7.066667 | 6.233333 | 6.700000 | 7.500000 | 6.768611 | 6.165787 | 6.736713 | 7.446898 | 7.433333 | 6.816667 | 7.116667 | 7.683333 | 12.0 | 12.0 | 10003.166667 | 659.500000 | 1.333333 | 0.0 | 0.0 | 0.0 | 40.719022 | -73.982377 |
1 | 02 | 426.619092 | 444.186256 | 424.832836 | 1295.638184 | 128.908454 | 201.516827 | 157.495833 | 158.647849 | 2.0 | 149.818949 | 5.686360 | 25.038118 | 20.662667 | 28.593861 | 0.0 | 20112012.0 | 0.0 | 63.164583 | 605.604167 | 51.625000 | 13.006250 | 70.770833 | 13.212500 | 118.395833 | 14.479167 | 141.145833 | 24.733333 | 271.458333 | 49.554167 | 68.541667 | 10.056250 | 266.500000 | 44.718750 | 339.104167 | 55.281250 | 83.312500 | 86.416667 | 38.333333 | 495.171007 | 31.687500 | 190.000000 | 8.254167 | 7.495833 | 7.379167 | 7.704167 | 7.404167 | 6.520833 | 7.120833 | 7.564583 | 6.910660 | 6.204057 | 6.650422 | 7.385029 | 7.520833 | 6.720833 | 7.037500 | 7.541667 | 12.0 | 12.0 | 10023.770833 | 621.395833 | 1.416667 | 0.0 | 0.0 | 0.0 | 40.739699 | -73.991386 |
2 | 03 | 428.529851 | 437.997512 | 426.915672 | 1293.443035 | 156.183494 | 244.522436 | 193.087500 | 183.384409 | 3.0 | 156.005994 | 5.839200 | 23.716311 | 19.737593 | 27.122831 | 0.0 | 20112012.0 | 0.0 | 58.050000 | 661.416667 | 36.916667 | 9.025000 | 63.583333 | 14.500000 | 68.000000 | 6.408333 | 181.833333 | 34.000000 | 228.250000 | 44.533333 | 172.000000 | 13.725000 | 272.166667 | 47.950000 | 389.250000 | 52.050000 | 83.166667 | 80.833333 | 36.166667 | 519.250000 | 28.166667 | 206.416667 | 8.316667 | 7.541667 | 7.508333 | 7.758333 | 6.675000 | 5.991667 | 6.533333 | 6.991667 | 6.716667 | 6.258333 | 6.633333 | 7.375000 | 7.233333 | 6.616667 | 6.891667 | 7.391667 | 12.0 | 12.0 | 10023.750000 | 717.916667 | 2.000000 | 0.0 | 0.0 | 0.0 | 40.781574 | -73.977370 |
3 | 04 | 402.142857 | 416.285714 | 405.714286 | 1224.142857 | 129.016484 | 183.879121 | 151.035714 | 113.857143 | 4.0 | 132.362265 | 5.192610 | 24.101048 | 19.740816 | 27.460291 | 0.0 | 20112012.0 | 0.0 | 71.000000 | 569.285714 | 21.571429 | 4.557143 | 58.571429 | 13.214286 | 61.142857 | 6.128571 | 151.714286 | 30.028571 | 341.571429 | 61.400000 | 9.285714 | 1.314286 | 193.285714 | 30.514286 | 376.000000 | 69.485714 | 87.571429 | 92.714286 | 41.285714 | 504.857143 | 32.428571 | 193.142857 | 8.385714 | 7.657143 | 7.500000 | 7.928571 | 7.871429 | 7.042857 | 7.428571 | 7.885714 | 6.885714 | 6.114286 | 6.685714 | 7.571429 | 7.714286 | 6.942857 | 7.185714 | 7.785714 | 12.0 | 12.0 | 10029.857143 | 580.857143 | 1.142857 | 0.0 | 0.0 | 0.0 | 40.793449 | -73.943215 |
4 | 05 | 427.159915 | 438.236674 | 419.666098 | 1285.062687 | 85.722527 | 115.725275 | 142.464286 | 143.677419 | 5.0 | 120.623901 | 4.845101 | 24.290247 | 20.447527 | 27.746703 | 0.0 | 20112012.0 | 0.0 | 58.671429 | 605.857143 | 21.571429 | 4.157143 | 59.714286 | 10.428571 | 32.714286 | 7.300000 | 342.571429 | 49.000000 | 195.571429 | 36.214286 | 31.285714 | 6.714286 | 300.000000 | 48.471429 | 305.857143 | 51.528571 | 82.857143 | 79.285714 | 40.428571 | 442.428571 | 27.714286 | 184.857143 | 8.085714 | 7.342857 | 7.357143 | 7.685714 | 6.857143 | 5.885714 | 6.242857 | 6.957143 | 6.314286 | 6.000000 | 6.428571 | 7.285714 | 7.085714 | 6.414286 | 6.671429 | 7.314286 | 12.0 | 12.0 | 10030.142857 | 609.857143 | 1.142857 | 0.0 | 0.0 | 0.0 | 40.817077 | -73.949251 |
5 | 06 | 382.011940 | 400.565672 | 382.066269 | 1164.643881 | 108.711538 | 159.715385 | 105.425000 | 180.848387 | 6.0 | 139.041709 | 5.285860 | 25.324199 | 20.556126 | 29.217872 | 0.0 | 20112012.0 | 0.0 | 84.250000 | 621.800000 | 165.700000 | 30.040000 | 80.100000 | 14.950000 | 7.800000 | 0.810000 | 82.000000 | 11.470000 | 523.900000 | 86.380000 | 5.000000 | 0.720000 | 333.600000 | 53.790000 | 288.200000 | 46.210000 | 85.000000 | 83.500000 | 54.600000 | 482.300000 | 33.000000 | 261.500000 | 8.530000 | 7.990000 | 7.750000 | 8.050000 | 7.490000 | 6.610000 | 7.060000 | 7.660000 | 6.950000 | 6.150000 | 6.700000 | 7.510000 | 7.660000 | 6.920000 | 7.170000 | 7.760000 | 12.0 | 12.0 | 10036.200000 | 628.900000 | 1.300000 | 0.0 | 0.0 | 0.0 | 40.848970 | -73.932502 |
6 | 07 | 376.461538 | 380.461538 | 371.923077 | 1128.846154 | 73.703402 | 112.476331 | 105.276923 | 105.605459 | 7.0 | 97.597416 | 4.146610 | 23.560037 | 19.394465 | 27.004052 | 0.0 | 20112012.0 | 0.0 | 79.630769 | 481.000000 | 72.000000 | 16.661538 | 84.846154 | 17.446154 | 8.000000 | 1.707692 | 144.461538 | 29.253846 | 322.000000 | 67.607692 | 3.769231 | 0.800000 | 261.769231 | 53.138462 | 219.230769 | 46.861538 | 80.538462 | 77.153846 | 42.923077 | 376.230769 | 25.846154 | 187.615385 | 8.423077 | 7.761538 | 7.600000 | 7.907692 | 6.969231 | 6.315385 | 6.784615 | 7.284615 | 6.800000 | 6.184615 | 6.646154 | 7.453846 | 7.392308 | 6.746154 | 7.007692 | 7.553846 | 12.0 | 12.0 | 10452.692308 | 465.846154 | 1.461538 | 0.0 | 0.0 | 0.0 | 40.816815 | -73.919971 |
7 | 08 | 386.214383 | 395.542741 | 377.908005 | 1159.665129 | 118.379371 | 168.020979 | 144.731818 | 215.510264 | 8.0 | 129.765099 | 5.165710 | 23.813192 | 19.385355 | 27.362141 | 0.0 | 20112012.0 | 0.0 | 76.663636 | 684.545455 | 72.909091 | 15.790909 | 140.545455 | 19.609091 | 30.272727 | 2.809091 | 179.818182 | 27.809091 | 433.545455 | 66.109091 | 36.181818 | 2.554545 | 355.000000 | 45.454545 | 329.545455 | 54.545455 | 69.454545 | 77.636364 | 36.636364 | 459.837121 | 34.000000 | 148.545455 | 8.172727 | 7.636364 | 7.454545 | 7.718182 | 7.109091 | 6.254545 | 6.781818 | 7.390909 | 6.364697 | 5.954066 | 6.401843 | 7.143763 | 7.218182 | 6.645455 | 6.863636 | 7.409091 | 12.0 | 12.0 | 10467.000000 | 547.636364 | 1.272727 | 0.0 | 0.0 | 0.0 | 40.823803 | -73.866087 |
8 | 09 | 373.755970 | 383.582836 | 374.633134 | 1131.971940 | 71.411538 | 104.265385 | 98.470000 | 113.330645 | 9.0 | 100.118588 | 4.335504 | 23.986379 | 19.543005 | 28.447128 | 0.0 | 20112012.0 | 0.0 | 76.800000 | 442.800000 | 78.900000 | 19.945000 | 74.150000 | 16.655000 | 7.000000 | 1.480000 | 151.900000 | 33.110000 | 278.250000 | 64.135000 | 3.000000 | 0.665000 | 220.700000 | 48.995000 | 222.050000 | 50.995000 | 84.200000 | 84.650000 | 40.650000 | 349.150000 | 25.450000 | 162.300000 | 8.510000 | 7.930000 | 7.755000 | 8.080000 | 6.935000 | 6.305000 | 6.905000 | 7.355000 | 6.655000 | 6.145000 | 6.705000 | 7.490000 | 7.365000 | 6.795000 | 7.110000 | 7.665000 | 12.0 | 12.0 | 10456.100000 | 449.700000 | 1.150000 | 0.0 | 0.0 | 0.0 | 40.836349 | -73.906240 |
9 | 10 | 403.363636 | 418.000000 | 400.863636 | 1222.227273 | 132.231206 | 226.914336 | 191.618182 | 161.318182 | 10.0 | 168.876526 | 6.512691 | 24.244350 | 18.897297 | 28.583860 | 0.0 | 20112012.0 | 0.0 | 72.268182 | 818.545455 | 129.090909 | 18.177273 | 99.045455 | 13.622727 | 119.681818 | 6.631818 | 169.090909 | 23.027273 | 453.545455 | 63.272727 | 71.409091 | 6.527273 | 413.181818 | 48.586364 | 405.363636 | 51.413636 | 82.727273 | 82.136364 | 43.000000 | 648.772727 | 43.500000 | 303.590909 | 8.272727 | 7.768182 | 7.622727 | 7.927273 | 6.972727 | 6.154545 | 6.700000 | 7.309091 | 6.577273 | 6.040909 | 6.654545 | 7.409091 | 7.268182 | 6.645455 | 6.986364 | 7.536364 | 12.0 | 12.0 | 10463.181818 | 757.863636 | 1.500000 | 0.0 | 0.0 | 0.0 | 40.870345 | -73.898360 |
10 | 11 | 389.866667 | 394.533333 | 380.600000 | 1165.000000 | 83.813462 | 122.484615 | 108.833333 | 122.866667 | 11.0 | 129.031031 | 5.037052 | 24.956523 | 19.784469 | 29.523549 | 0.0 | 20112012.0 | 0.0 | 66.960000 | 562.466667 | 61.666667 | 12.133333 | 105.600000 | 17.380000 | 19.800000 | 3.846667 | 243.000000 | 41.346667 | 278.266667 | 50.846667 | 16.400000 | 3.113333 | 319.866667 | 55.900000 | 242.600000 | 44.100000 | 80.400000 | 79.933333 | 42.466667 | 400.733333 | 27.466667 | 190.800000 | 8.053333 | 7.693333 | 7.546667 | 7.826667 | 7.026667 | 6.766667 | 7.000000 | 7.540000 | 6.186667 | 5.746667 | 6.260000 | 7.213333 | 7.086667 | 6.740000 | 6.940000 | 7.526667 | 12.0 | 12.0 | 10467.933333 | 563.666667 | 1.533333 | 0.0 | 0.0 | 0.0 | 40.873138 | -73.856120 |
11 | 12 | 364.769900 | 379.109453 | 357.943781 | 1101.823134 | 93.102564 | 139.442308 | 153.450000 | 110.467742 | 12.0 | 91.684504 | 4.151865 | 22.212078 | 18.487312 | 25.804834 | 0.0 | 20112012.0 | 0.0 | 81.733333 | 391.000000 | 105.500000 | 27.141667 | 70.666667 | 17.216667 | 5.500000 | 1.591667 | 109.750000 | 28.075000 | 271.750000 | 69.275000 | 3.000000 | 0.825000 | 199.916667 | 50.133333 | 191.083333 | 49.866667 | 80.000000 | 79.750000 | 49.166667 | 277.333333 | 21.166667 | 163.083333 | 8.650000 | 8.008333 | 7.808333 | 8.166667 | 7.266667 | 6.758333 | 7.250000 | 7.816667 | 7.116667 | 6.566667 | 7.116667 | 7.783333 | 7.666667 | 7.100000 | 7.391667 | 7.916667 | 12.0 | 12.0 | 10463.166667 | 409.000000 | 1.083333 | 0.0 | 0.0 | 0.0 | 40.831412 | -73.886946 |
12 | 13 | 409.393800 | 424.127440 | 403.666361 | 1237.187600 | 232.931953 | 382.704142 | 320.773077 | 224.595533 | 13.0 | 218.306055 | 8.003842 | 24.621784 | 19.625241 | 28.672265 | 0.0 | 20112012.0 | 0.0 | 63.861538 | 902.000000 | 35.461538 | 8.723077 | 65.692308 | 12.084615 | 272.461538 | 10.453846 | 416.000000 | 68.476923 | 110.000000 | 16.276923 | 96.230769 | 3.623077 | 482.769231 | 48.707692 | 419.230769 | 51.292308 | 84.153846 | 77.615385 | 35.769231 | 759.846154 | 40.000000 | 318.384615 | 8.330769 | 7.700000 | 7.692308 | 7.884615 | 7.092308 | 6.830769 | 7.069231 | 7.600000 | 6.407692 | 6.069231 | 6.576923 | 7.376923 | 7.269231 | 6.869231 | 7.123077 | 7.607692 | 12.0 | 12.0 | 11207.153846 | 895.153846 | 2.076923 | 0.0 | 0.0 | 0.0 | 40.692865 | -73.977016 |
13 | 14 | 395.937100 | 398.189765 | 385.333049 | 1179.459915 | 77.798077 | 114.873626 | 123.282143 | 112.347926 | 14.0 | 123.643728 | 4.905127 | 24.311291 | 19.554945 | 28.511594 | 0.0 | 20112012.0 | 0.0 | 75.928571 | 556.214286 | 57.785714 | 9.485714 | 97.714286 | 18.600000 | 19.500000 | 3.871429 | 222.428571 | 42.228571 | 291.928571 | 49.664286 | 18.714286 | 3.507143 | 324.714286 | 58.050000 | 231.500000 | 41.942857 | 77.714286 | 85.500000 | 35.071429 | 395.642857 | 32.428571 | 158.214286 | 8.357143 | 7.864286 | 7.735714 | 7.971429 | 7.271429 | 6.650000 | 7.257143 | 7.735714 | 6.685714 | 6.207143 | 6.742857 | 7.507143 | 7.435714 | 6.914286 | 7.250000 | 7.750000 | 12.0 | 12.0 | 11210.785714 | 545.357143 | 2.000000 | 0.0 | 0.0 | 0.0 | 40.711599 | -73.948360 |
14 | 15 | 395.679934 | 404.628524 | 390.295854 | 1190.604312 | 94.574786 | 141.581197 | 153.450000 | 104.207885 | 15.0 | 135.707319 | 5.236588 | 25.423643 | 20.900953 | 29.287015 | 0.0 | 20112012.0 | 0.0 | 70.800000 | 562.666667 | 48.888889 | 8.844444 | 103.333333 | 17.866667 | 26.333333 | 4.966667 | 251.777778 | 47.000000 | 246.111111 | 41.266667 | 36.444444 | 6.377778 | 270.666667 | 48.211111 | 292.000000 | 51.788889 | 81.444444 | 82.888889 | 32.333333 | 433.000000 | 34.777778 | 170.000000 | 8.077778 | 7.577778 | 7.477778 | 7.844444 | 6.566667 | 5.877778 | 6.288889 | 7.011111 | 6.177778 | 5.900000 | 6.333333 | 7.222222 | 6.933333 | 6.444444 | 6.700000 | 7.366667 | 12.0 | 12.0 | 11214.222222 | 573.111111 | 1.666667 | 0.0 | 0.0 | 0.0 | 40.675972 | -73.989255 |
15 | 16 | 371.529851 | 379.164179 | 369.415672 | 1120.109701 | 82.264423 | 126.519231 | 153.450000 | 247.185484 | 16.0 | 177.501282 | 6.970513 | 24.234006 | 18.070192 | 29.308333 | 0.0 | 20112012.0 | 0.0 | 66.150000 | 658.500000 | 18.750000 | 3.725000 | 131.750000 | 19.775000 | 4.750000 | 1.075000 | 562.000000 | 80.150000 | 80.750000 | 16.900000 | 4.500000 | 0.925000 | 377.750000 | 53.400000 | 280.750000 | 46.600000 | 68.500000 | 75.250000 | 19.500000 | 458.000000 | 31.500000 | 122.000000 | 7.375000 | 6.900000 | 6.875000 | 7.075000 | 5.750000 | 5.250000 | 6.025000 | 6.275000 | 5.875000 | 5.775000 | 6.225000 | 7.100000 | 6.325000 | 5.975000 | 6.375000 | 6.825000 | 12.0 | 12.0 | 11219.000000 | 440.250000 | 1.750000 | 0.0 | 0.0 | 0.0 | 40.688008 | -73.929686 |
16 | 17 | 386.571429 | 394.071429 | 380.785714 | 1161.428571 | 105.583791 | 163.087912 | 111.360714 | 121.357143 | 17.0 | 130.246192 | 5.135694 | 25.408854 | 20.284341 | 29.620046 | 0.0 | 20112012.0 | 0.0 | 67.635714 | 554.285714 | 52.357143 | 11.285714 | 54.571429 | 10.621429 | 16.857143 | 3.607143 | 474.285714 | 83.071429 | 50.642857 | 10.664286 | 9.000000 | 2.050000 | 242.857143 | 48.000000 | 311.428571 | 52.000000 | 78.785714 | 77.714286 | 41.214286 | 426.428571 | 27.357143 | 209.857143 | 7.835714 | 7.642857 | 7.550000 | 7.828571 | 6.957143 | 6.600000 | 7.192857 | 7.642857 | 6.007143 | 5.885714 | 6.400000 | 7.228571 | 6.935714 | 6.707143 | 7.035714 | 7.571429 | 12.0 | 12.0 | 11220.642857 | 547.071429 | 1.642857 | 0.0 | 0.0 | 0.0 | 40.660313 | -73.955636 |
17 | 18 | 373.454545 | 373.090909 | 371.454545 | 1118.000000 | 129.028846 | 197.038462 | 153.450000 | 72.771261 | 18.0 | 72.209438 | 3.317955 | 22.102538 | 18.814864 | 25.448084 | 0.0 | 20112012.0 | 0.0 | 65.900000 | 332.272727 | 29.545455 | 8.754545 | 53.363636 | 16.363636 | 4.181818 | 1.236364 | 290.272727 | 87.327273 | 32.363636 | 9.718182 | 3.727273 | 1.163636 | 164.909091 | 51.227273 | 167.363636 | 48.772727 | 78.181818 | 83.363636 | 31.454545 | 222.181818 | 18.454545 | 89.181818 | 8.272727 | 7.718182 | 7.572727 | 7.963636 | 6.890909 | 6.400000 | 6.936364 | 7.390909 | 6.190909 | 6.018182 | 6.409091 | 7.209091 | 7.109091 | 6.718182 | 6.963636 | 7.527273 | 12.0 | 12.0 | 11224.000000 | 344.000000 | 1.090909 | 0.0 | 0.0 | 0.0 | 40.641863 | -73.914726 |
18 | 19 | 367.083333 | 377.583333 | 359.166667 | 1103.833333 | 88.097756 | 124.769231 | 120.670833 | 114.322581 | 19.0 | 105.752625 | 4.247924 | 24.314442 | 19.921011 | 28.148996 | 0.0 | 20112012.0 | 0.0 | 71.833333 | 492.500000 | 62.833333 | 14.941667 | 75.666667 | 15.291667 | 15.750000 | 3.191667 | 285.000000 | 55.066667 | 182.666667 | 39.916667 | 5.833333 | 1.150000 | 284.500000 | 53.725000 | 208.000000 | 46.275000 | 70.833333 | 89.166667 | 34.083333 | 311.916667 | 28.500000 | 141.333333 | 7.591667 | 7.575000 | 7.375000 | 7.658333 | 6.516667 | 6.400000 | 6.766667 | 7.308333 | 6.150000 | 5.933333 | 6.458333 | 7.141667 | 6.758333 | 6.641667 | 6.866667 | 7.350000 | 12.0 | 12.0 | 11207.500000 | 440.416667 | 1.916667 | 0.0 | 0.0 | 0.0 | 40.676547 | -73.882158 |
19 | 20 | 406.223881 | 465.731343 | 401.732537 | 1273.687761 | 227.805769 | 359.407692 | 177.690000 | 591.374194 | 20.0 | 420.029766 | 14.721019 | 25.423366 | 17.681562 | 29.745831 | 0.0 | 20112012.0 | 0.0 | 65.840000 | 2462.600000 | 601.400000 | 20.800000 | 341.200000 | 15.140000 | 844.600000 | 32.920000 | 186.800000 | 12.620000 | 766.000000 | 32.300000 | 658.200000 | 21.840000 | 1345.200000 | 44.920000 | 1117.400000 | 55.080000 | 64.400000 | 78.200000 | 25.000000 | 1360.200000 | 107.000000 | 602.400000 | 7.920000 | 7.180000 | 7.240000 | 7.460000 | 7.820000 | 7.660000 | 7.800000 | 8.280000 | 7.120000 | 6.340000 | 7.040000 | 7.640000 | 7.620000 | 7.080000 | 7.380000 | 7.800000 | 12.0 | 12.0 | 11210.200000 | 2521.400000 | 3.800000 | 0.0 | 0.0 | 0.0 | 40.626751 | -74.006191 |
20 | 21 | 395.283582 | 421.786974 | 389.242062 | 1206.312619 | 135.467657 | 203.835664 | 142.377273 | 275.351906 | 21.0 | 224.702989 | 8.191627 | 25.284294 | 20.167110 | 28.885220 | 0.0 | 20112012.0 | 0.0 | 62.227273 | 1160.000000 | 160.909091 | 17.163636 | 157.363636 | 14.454545 | 262.090909 | 17.272727 | 364.909091 | 33.409091 | 230.454545 | 23.409091 | 297.454545 | 25.527273 | 616.090909 | 58.172727 | 543.909091 | 41.827273 | 79.272727 | 89.181818 | 34.545455 | 930.545455 | 64.454545 | 336.181818 | 7.818182 | 7.336364 | 7.263636 | 7.527273 | 7.072727 | 6.400000 | 6.963636 | 7.472727 | 6.390909 | 6.081818 | 6.500000 | 7.163636 | 7.090909 | 6.609091 | 6.909091 | 7.390909 | 12.0 | 12.0 | 11221.000000 | 1098.272727 | 3.272727 | 0.0 | 0.0 | 0.0 | 40.593596 | -73.978465 |
21 | 22 | 473.500000 | 502.750000 | 474.250000 | 1450.500000 | 391.007212 | 614.509615 | 370.362500 | 580.250000 | 22.0 | 495.279369 | 15.777391 | 30.077636 | 22.509675 | 33.464740 | 0.0 | 20112012.0 | 0.0 | 44.225000 | 2142.500000 | 138.250000 | 4.400000 | 166.250000 | 6.650000 | 507.750000 | 17.975000 | 622.250000 | 35.100000 | 271.000000 | 12.350000 | 732.000000 | 33.975000 | 1015.750000 | 44.300000 | 1126.750000 | 55.700000 | 90.250000 | 87.500000 | 48.750000 | 1887.250000 | 89.500000 | 1106.250000 | 8.300000 | 7.325000 | 7.525000 | 7.775000 | 8.175000 | 7.400000 | 7.600000 | 8.200000 | 7.025000 | 6.050000 | 6.775000 | 7.500000 | 7.850000 | 6.900000 | 7.300000 | 7.850000 | 12.0 | 12.0 | 11223.000000 | 2149.000000 | 2.250000 | 0.0 | 0.0 | 0.0 | 40.618285 | -73.952288 |
22 | 23 | 380.666667 | 398.666667 | 378.000000 | 1157.333333 | 29.000000 | 31.000000 | 153.450000 | 87.000000 | 23.0 | 120.113095 | 4.985119 | 24.299702 | 18.654762 | 28.994048 | 0.0 | 20112012.0 | 0.0 | 70.400000 | 503.666667 | 8.000000 | 1.600000 | 68.666667 | 14.466667 | 6.000000 | 1.133333 | 438.000000 | 87.133333 | 53.333333 | 10.600000 | 3.333333 | 0.566667 | 221.333333 | 44.433333 | 282.333333 | 55.566667 | 84.000000 | 73.666667 | 40.666667 | 420.333333 | 25.333333 | 175.000000 | 7.900000 | 7.533333 | 7.566667 | 7.700000 | 7.033333 | 6.433333 | 6.666667 | 7.366667 | 6.233333 | 6.066667 | 6.766667 | 7.566667 | 7.066667 | 6.700000 | 7.000000 | 7.566667 | 12.0 | 12.0 | 11219.000000 | 391.000000 | 1.333333 | 0.0 | 0.0 | 0.0 | 40.668586 | -73.912298 |
23 | 24 | 405.846154 | 434.000000 | 402.153846 | 1242.000000 | 126.474852 | 179.094675 | 115.165385 | 234.682382 | 24.0 | 213.471903 | 8.613529 | 23.865936 | 18.565040 | 28.251471 | 0.0 | 20112012.0 | 0.0 | 62.584615 | 991.000000 | 179.769231 | 21.069231 | 97.076923 | 9.607692 | 182.692308 | 17.176923 | 74.769231 | 7.800000 | 616.615385 | 63.415385 | 113.461538 | 11.284615 | 572.769231 | 51.646154 | 418.230769 | 48.353846 | 83.846154 | 83.230769 | 35.692308 | 765.461538 | 47.769231 | 247.692308 | 8.461538 | 7.684615 | 7.569231 | 7.869231 | 7.669231 | 6.569231 | 7.230769 | 7.723077 | 7.107692 | 6.169231 | 6.784615 | 7.500000 | 7.753846 | 6.807692 | 7.192308 | 7.707692 | 12.0 | 12.0 | 11206.153846 | 962.461538 | 2.230769 | 0.0 | 0.0 | 0.0 | 40.740621 | -73.911518 |
24 | 25 | 437.250000 | 483.500000 | 436.250000 | 1357.000000 | 205.260817 | 279.889423 | 174.793750 | 268.733871 | 25.0 | 280.576007 | 9.990385 | 27.106270 | 21.213189 | 31.371220 | 0.0 | 20112012.0 | 0.0 | 52.950000 | 1356.750000 | 257.500000 | 19.737500 | 136.750000 | 9.975000 | 465.250000 | 40.600000 | 240.375000 | 13.062500 | 494.000000 | 29.237500 | 148.875000 | 16.537500 | 669.500000 | 48.500000 | 687.250000 | 51.500000 | 87.000000 | 78.500000 | 37.625000 | 1028.250000 | 56.000000 | 370.875000 | 8.175000 | 7.537500 | 7.412500 | 7.687500 | 7.487500 | 6.825000 | 7.200000 | 7.712500 | 6.912500 | 6.150000 | 6.712500 | 7.400000 | 7.525000 | 6.837500 | 7.112500 | 7.612500 | 12.0 | 12.0 | 11361.000000 | 1288.875000 | 1.875000 | 0.0 | 0.0 | 0.0 | 40.745414 | -73.815558 |
25 | 26 | 445.200000 | 487.600000 | 444.800000 | 1377.600000 | 410.605769 | 632.407692 | 392.090000 | 825.600000 | 26.0 | 595.953216 | 19.240117 | 29.971140 | 20.315789 | 33.554327 | 0.0 | 20112012.0 | 0.0 | 43.800000 | 2991.600000 | 248.600000 | 7.480000 | 350.000000 | 11.960000 | 1259.200000 | 38.140000 | 676.400000 | 28.420000 | 604.800000 | 19.380000 | 432.400000 | 13.420000 | 1475.000000 | 49.220000 | 1516.600000 | 50.780000 | 67.000000 | 83.400000 | 23.800000 | 1930.800000 | 129.800000 | 684.200000 | 7.720000 | 6.980000 | 7.220000 | 7.260000 | 7.000000 | 6.580000 | 6.840000 | 7.260000 | 6.760000 | 6.060000 | 6.660000 | 7.380000 | 7.140000 | 6.540000 | 6.900000 | 7.300000 | 12.0 | 12.0 | 11388.600000 | 2837.400000 | 4.600000 | 0.0 | 0.0 | 0.0 | 40.748507 | -73.759176 |
26 | 27 | 407.800000 | 422.200000 | 394.300000 | 1224.300000 | 100.611538 | 145.315385 | 95.125000 | 288.961290 | 27.0 | 249.324536 | 9.019043 | 26.649659 | 20.757535 | 30.938081 | 0.0 | 20112012.0 | 0.0 | 61.340000 | 1148.200000 | 117.800000 | 6.820000 | 134.600000 | 11.950000 | 263.100000 | 16.890000 | 365.700000 | 39.760000 | 394.100000 | 31.380000 | 103.900000 | 10.750000 | 610.700000 | 52.400000 | 537.500000 | 47.600000 | 76.900000 | 86.500000 | 40.900000 | 775.100000 | 59.200000 | 276.800000 | 7.830000 | 7.400000 | 7.370000 | 7.640000 | 6.930000 | 6.780000 | 7.060000 | 7.630000 | 6.390000 | 6.000000 | 6.500000 | 7.310000 | 7.010000 | 6.710000 | 6.950000 | 7.510000 | 12.0 | 12.0 | 11556.300000 | 1072.000000 | 2.500000 | 0.0 | 0.0 | 0.0 | 40.638828 | -73.807823 |
27 | 28 | 445.941655 | 465.997286 | 435.908005 | 1347.846947 | 182.010490 | 273.559441 | 175.336364 | 351.214076 | 28.0 | 255.381164 | 8.719058 | 26.770807 | 21.349816 | 30.172038 | 0.0 | 20112012.0 | 0.0 | 53.790909 | 1210.636364 | 87.363636 | 4.963636 | 123.454545 | 9.827273 | 403.727273 | 31.345455 | 343.727273 | 37.509091 | 283.454545 | 19.954545 | 165.636364 | 9.400000 | 590.363636 | 45.863636 | 620.272727 | 54.136364 | 85.090909 | 85.545455 | 40.545455 | 872.545455 | 54.727273 | 362.636364 | 8.172727 | 7.590909 | 7.472727 | 7.781818 | 7.600000 | 7.145455 | 7.363636 | 7.972727 | 6.627273 | 6.081818 | 6.618182 | 7.454545 | 7.463636 | 6.936364 | 7.145455 | 7.745455 | 12.0 | 12.0 | 11422.000000 | 1304.272727 | 2.545455 | 0.0 | 0.0 | 0.0 | 40.709344 | -73.806367 |
28 | 29 | 395.764925 | 399.457090 | 386.707836 | 1181.929851 | 63.385817 | 96.514423 | 135.268750 | 98.108871 | 29.0 | 88.372155 | 3.454193 | 25.355632 | 21.409201 | 28.672730 | 0.0 | 20112012.0 | 0.0 | 52.362500 | 428.250000 | 12.625000 | 2.912500 | 56.125000 | 13.037500 | 20.750000 | 4.825000 | 359.875000 | 83.987500 | 39.000000 | 9.150000 | 4.125000 | 1.050000 | 219.875000 | 51.862500 | 208.375000 | 48.137500 | 77.625000 | 90.625000 | 44.750000 | 294.500000 | 21.375000 | 146.500000 | 7.762500 | 7.362500 | 7.325000 | 7.562500 | 7.050000 | 6.912500 | 7.162500 | 7.575000 | 6.075000 | 5.962500 | 6.487500 | 7.250000 | 6.962500 | 6.737500 | 7.000000 | 7.475000 | 12.0 | 12.0 | 11413.625000 | 474.125000 | 1.250000 | 0.0 | 0.0 | 0.0 | 40.685276 | -73.752740 |
29 | 30 | 430.679934 | 465.961857 | 429.740299 | 1326.382090 | 157.231838 | 252.123932 | 115.150000 | 310.526882 | 30.0 | 251.803744 | 9.260486 | 25.715644 | 19.312273 | 29.760726 | 0.0 | 20112012.0 | 0.0 | 54.211111 | 1221.555556 | 220.555556 | 16.133333 | 125.111111 | 7.988889 | 281.777778 | 24.866667 | 124.222222 | 10.444444 | 607.222222 | 43.522222 | 204.444444 | 20.788889 | 604.111111 | 44.777778 | 617.444444 | 55.222222 | 89.000000 | 85.444444 | 42.333333 | 963.444444 | 63.111111 | 318.777778 | 8.222222 | 7.444444 | 7.433333 | 7.700000 | 7.500000 | 6.655556 | 6.911111 | 7.533333 | 7.033333 | 6.166667 | 6.844444 | 7.511111 | 7.577778 | 6.744444 | 7.077778 | 7.588889 | 12.0 | 12.0 | 11103.000000 | 1123.333333 | 2.555556 | 0.0 | 0.0 | 0.0 | 40.755398 | -73.932306 |
30 | 31 | 457.500000 | 472.500000 | 452.500000 | 1382.500000 | 228.908654 | 355.111538 | 194.435000 | 450.787097 | 31.0 | 380.528319 | 13.251284 | 28.119729 | 19.995494 | 32.785641 | 0.0 | 20112012.0 | 0.0 | 38.310000 | 1850.600000 | 64.200000 | 2.860000 | 301.500000 | 16.710000 | 177.700000 | 10.300000 | 307.300000 | 18.270000 | 418.700000 | 22.170000 | 938.700000 | 48.810000 | 948.500000 | 52.090000 | 902.100000 | 47.910000 | 88.900000 | 91.300000 | 43.500000 | 1476.600000 | 88.900000 | 593.300000 | 7.800000 | 7.390000 | 7.460000 | 7.620000 | 7.210000 | 7.140000 | 7.390000 | 7.860000 | 6.530000 | 6.070000 | 6.780000 | 7.330000 | 7.200000 | 6.870000 | 7.200000 | 7.610000 | 12.0 | 12.0 | 10307.100000 | 1847.500000 | 5.000000 | 0.0 | 0.0 | 0.0 | 40.595680 | -74.125726 |
31 | 32 | 371.500000 | 385.833333 | 362.166667 | 1119.500000 | 70.342949 | 100.179487 | 83.558333 | 105.333333 | 32.0 | 100.525613 | 4.450572 | 22.479804 | 17.936520 | 26.816013 | 0.0 | 20112012.0 | 0.0 | 82.866667 | 420.833333 | 79.500000 | 18.533333 | 67.500000 | 15.650000 | 6.333333 | 1.750000 | 82.000000 | 19.833333 | 326.166667 | 76.850000 | 3.833333 | 0.966667 | 217.833333 | 51.716667 | 203.000000 | 48.283333 | 75.666667 | 72.333333 | 34.666667 | 312.833333 | 22.333333 | 137.833333 | 8.350000 | 8.050000 | 7.883333 | 8.216667 | 7.050000 | 6.166667 | 6.900000 | 7.483333 | 6.766667 | 6.033333 | 6.783333 | 7.483333 | 7.383333 | 6.750000 | 7.200000 | 7.733333 | 12.0 | 12.0 | 11231.666667 | 381.500000 | 1.000000 | 0.0 | 0.0 | 0.0 | 40.696295 | -73.917124 |
import numpy as np
from mpl_toolkits.basemap import Basemap
m = Basemap(
projection='merc',
llcrnrlat=40.496044,
urcrnrlat=40.915256,
llcrnrlon=-74.255735,
urcrnrlon=-73.700272,
resolution='f'
)
m.drawmapboundary(fill_color='#ABD1FF')
m.drawcoastlines(color='black', linewidth=.4)
m.drawrivers(color='#ABD1FF', linewidth=.4)
m.fillcontinents(color= '#F2F1EF',lake_color='#ABD1FF')
longitudes = districts['lon'].tolist()
latitudes = districts['lat'].tolist()
m.scatter(longitudes, latitudes, s=50, zorder=2, latlon=True, c=districts['saf_tot_11'], cmap='viridis')
plt.axis(aspect='equal')
plt.xlabel('longitude')
plt.ylabel('latitude')
plt.colorbar(label='Safety Score')
plt.title('New York City: Schools Safety Score')
plt.savefig('new_york_schools.png')
plt.show()
From the Map, we can observe that parts of Manhattan, Bronx and Queens have relatively higher safety scores. While Brooklyn has mostly lower safety scores.
There are a few columns that indicate the percentage of each race at a given school:
white_per
asian_per
black_per
hispanic_per
By plotting out the correlations between these columns and sat_score, we can determine whether there are any racial differences in SAT performance. We will investigate racial differences in SAT scores by making a bar plot of the correlations between the columns above and sat_score
.
races_per = ['white_per', 'asian_per', 'black_per', 'hispanic_per']
combined.corr()['sat_score'][races_per].plot.bar()
plt.show()
From the bar plot above we can notice that higher percentage of White or Asian students at a school correlates with higher SAT Scores. And vice versa for Black and Hispanic Students. It can be due to various external factors, like economic background of students, funding to Schools, etc.
We will continue exploration regarding race and SAT scores by makign a scatter plot of hispanic_per vs. sat_score.
combined.plot.scatter(x='hispanic_per', y='sat_score')
plt.show()
There seems to be a negative correlation between hispanic students percentage and SAT Score. This may be due to a lack of funding for schools in certain areas, which are more likely to have a higher percentage of black or hispanic students.
Filtering down on hispanic population by researching schools with a hispanic_per greater than 95%.
combined[combined['hispanic_per']>95]['SCHOOL NAME']
44 MANHATTAN BRIDGES HIGH SCHOOL 82 WASHINGTON HEIGHTS EXPEDITIONARY LEARNING SCHOOL 89 GREGORIO LUPERON HIGH SCHOOL FOR SCIENCE AND M... 125 ACADEMY FOR LANGUAGE AND TECHNOLOGY 141 INTERNATIONAL SCHOOL FOR LIBERAL ARTS 176 PAN AMERICAN INTERNATIONAL HIGH SCHOOL AT MONROE 253 MULTICULTURAL HIGH SCHOOL 286 PAN AMERICAN INTERNATIONAL HIGH SCHOOL Name: SCHOOL NAME, dtype: object
We can observe from our findings that the above schools are catering mostly to the immigrant community, who have recently migrated to USA from neighboring hispanic countries. And are learning English, which can be correlated to lower SAT Scores.
Researching schools with a hispanic_per less than 10% and an average SAT score greater than 1800.
combined[(combined['hispanic_per'] < 10) & (combined['sat_score'] > 1800)]['SCHOOL NAME']
37 STUYVESANT HIGH SCHOOL 151 BRONX HIGH SCHOOL OF SCIENCE 187 BROOKLYN TECHNICAL HIGH SCHOOL 327 QUEENS HIGH SCHOOL FOR THE SCIENCES AT YORK CO... 356 STATEN ISLAND TECHNICAL HIGH SCHOOL Name: SCHOOL NAME, dtype: object
From our findings, we observe that the above schools are sepcialized in Science & Technology. Plus, have a criteria (clearing specific entrance exams) for admitting students, which explaining the High SAT Scores. But, there is no specific reason for low Hispanic proportion of students.
There are two columns that indicate the percentage of each gender at a school:
We can plot out the correlations between each percentage and sat_score by making a bar plot.
gender_per = ['male_per', 'female_per']
combined.corr()['sat_score'][gender_per].plot.bar()
plt.show()
Here, from the bar plot we can observe that higher percentage of Female Students correlates positively with SAT Scores and vice versa for Male Students. However neither of the correlations are strong.
Investigating schools with high SAT scores and a high female_per by making a scatter plot of female_per vs. sat_score.
combined.plot.scatter(x='female_per', y='sat_score')
plt.show()
From the scatter plot above, there doesn't seem to be any strong correlation among the SAT Scores and Female Percentage.
Researching schools with a female_per greater than 60% and an average SAT score greater than 1700.
combined[(combined['sat_score'] > 1700) & (combined['female_per'] > 60)]['SCHOOL NAME']
5 BARD HIGH SCHOOL EARLY COLLEGE 26 ELEANOR ROOSEVELT HIGH SCHOOL 60 BEACON HIGH SCHOOL 61 FIORELLO H. LAGUARDIA HIGH SCHOOL OF MUSIC & A... 302 TOWNSEND HARRIS HIGH SCHOOL Name: SCHOOL NAME, dtype: object
From our findings, we notice that the above schools have their major focus primarily on two things:
- Arts
- Preparing students for college
These appear to be very selective liberal arts schools that have high academic standards.
There isn't enough evidence to claim that the SAT scores are unfair to certain groups. On surface, it looks like the SAT Scores are unfair. But on further investigation, we can see that there are multiple other factors (like immigration, funding, etc.) which may influence academic performance.