One of the most controversial issues in the U.S. educational system is the efficacy of standardized tests, and whether they're unfair to certain groups. Given our prior knowledge of this topic, investigating the correlations between SAT scores and demographics might be an interesting angle to take. We could correlate SAT scores with factors like race, gender, income, and more.
The test consists of three sections, each of which has 800 possible points. The combined score is out of 2,400 possible points (while this number has changed a few times, the data set for our project is based on 2,400 total points). Organizations often rank high schools by their average SAT scores. The scores are also considered a measure of overall school district quality.
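As a quick sanity check on the scoring described above, the sketch below adds up the three section averages for one school and compares the result with its sat_score. The figures are copied from row 0 of the combined table in this notebook; the variable names are only illustrative.

```python
# Section averages for HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES (row 0 of combined).
section_scores = {
    "SAT Critical Reading Avg. Score": 355.0,
    "SAT Math Avg. Score": 404.0,
    "SAT Writing Avg. Score": 363.0,
}

MAX_PER_SECTION = 800  # each section is scored out of 800 points

combined_score = sum(section_scores.values())
max_combined = MAX_PER_SECTION * len(section_scores)

print(combined_score)  # 1122.0, the value in the sat_score column for this school
print(max_combined)    # 2400
```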
The data used here was cleaned and exported from a separate project.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
combined = pd.read_csv("combined.csv")
combined.head()
| | DBN | SCHOOL NAME | Num of SAT Test Takers | SAT Critical Reading Avg. Score | SAT Math Avg. Score | SAT Writing Avg. Score | sat_score | SchoolName | AP Test Takers | Total Exams Taken | ... | priority05 | priority06 | priority07 | priority08 | priority09 | priority10 | Location 1 | lat | lon | school_dist |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 01M292 | HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES | 29 | 355.0 | 404.0 | 363.0 | 1122.0 | 0 | 129.028846 | 197.038462 | ... | Then to New York City residents | 0 | 0 | 0 | 0 | 0 | 220 Henry Street\nNew York, NY 10002\n(40.7137... | 40.713764 | -73.985260 | 1 |
1 | 01M448 | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | 91 | 383.0 | 423.0 | 366.0 | 1172.0 | UNIVERSITY NEIGHBORHOOD H.S. | 39.000000 | 49.000000 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 200 Monroe Street\nNew York, NY 10002\n(40.712... | 40.712332 | -73.984797 | 1 |
2 | 01M450 | EAST SIDE COMMUNITY SCHOOL | 70 | 377.0 | 402.0 | 370.0 | 1149.0 | EAST SIDE COMMUNITY HS | 19.000000 | 21.000000 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 420 East 12 Street\nNew York, NY 10009\n(40.72... | 40.729783 | -73.983041 | 1 |
3 | 01M509 | MARTA VALLE HIGH SCHOOL | 44 | 390.0 | 433.0 | 384.0 | 1207.0 | 0 | 129.028846 | 197.038462 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 145 Stanton Street\nNew York, NY 10002\n(40.72... | 40.720569 | -73.985673 | 1 |
4 | 01M539 | NEW EXPLORATIONS INTO SCIENCE, TECHNOLOGY AND ... | 159 | 522.0 | 574.0 | 525.0 | 1621.0 | NEW EXPLORATIONS SCI,TECH,MATH | 255.000000 | 377.000000 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 111 Columbia Street\nNew York, NY 10002\n(40.7... | 40.718725 | -73.979426 | 1 |
5 rows × 160 columns
survey_fields = [
"DBN",
"rr_s",
"rr_t",
"rr_p",
"N_s",
"N_t",
"N_p",
"saf_p_11",
"com_p_11",
"eng_p_11",
"aca_p_11",
"saf_t_11",
"com_t_11",
"eng_t_11",
"aca_t_11",
"saf_s_11",
"com_s_11",
"eng_s_11",
"aca_s_11",
"saf_tot_11",
"com_tot_11",
"eng_tot_11",
"aca_tot_11",
"sat_score"
]
# Remove DBN since it's a unique identifier, not a useful numerical value for correlation.
survey_fields.remove("DBN")
There are several fields in combined that originally came from a survey of parents, teachers, and students. Make a bar plot of the correlations between these fields and sat_score.
survey_corr = combined[survey_fields].corr()
survey_corr = survey_corr.reset_index().sort_values(by='sat_score')
survey_corr.plot.barh(x='index',y='sat_score');
From the chart above we can see that saf_s_11 and saf_t_11 correlate well with sat_score. N_s and N_p, which indicate the number of respondents, also have a high correlation. This is expected, as they are proxies for total_enrollment. This is worth investigating. Also, aca_s_11, the academic rating of a school by its students, has a positive relationship with sat_score, so students' opinions of a school may be more informative than those of teachers or parents.
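When ranking fields by their correlation with sat_score, the column's correlation with itself (always 1.0) shows up as the largest bar. A minimal sketch of dropping it before sorting is shown here. The toy DataFrame and its values are hypothetical and stand in for combined[survey_fields].

```python
import pandas as pd

# Hypothetical toy data: five schools, two survey fields, and an SAT score.
toy = pd.DataFrame({
    "saf_s_11": [5.0, 6.0, 6.5, 7.5, 8.0],
    "com_t_11": [7.0, 6.0, 7.5, 6.5, 7.0],
    "sat_score": [1000.0, 1100.0, 1150.0, 1400.0, 1600.0],
})

# Correlate every field with sat_score, then drop sat_score's trivial
# self-correlation so it does not dominate the ranking.
corr_with_sat = toy.corr()["sat_score"].drop("sat_score").sort_values()
print(corr_with_sat)
```

The resulting Series can be passed to plot.barh() in the same way as the chart above.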
Earlier, we noticed that saf_t_11 and saf_s_11, which measure how teachers and students perceive safety at school, correlated highly with sat_score. Here we'll dig into this relationship a bit more and try to figure out which schools have low safety scores.
combined.plot.scatter("saf_s_11","sat_score");
We can see a cluster of schools with SAT scores above 1800 among those with safety scores greater than 7.0. These schools are worth investigating for a parent looking for a good school for their child. The relationship is roughly linear for most of the observations, and the sweet spot is schools with a safety score between 7.0 and 8.5. Overall, though, the correlation is not that strong.
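To list the schools in that cluster rather than read them off the scatter plot, a boolean filter on both columns is enough. The sketch below uses a small hypothetical DataFrame in place of combined; the school names and values are made up.

```python
import pandas as pd

# Hypothetical rows standing in for combined; only the columns used are included.
schools = pd.DataFrame({
    "SCHOOL NAME": ["A", "B", "C", "D"],
    "saf_s_11": [6.2, 7.4, 8.1, 7.9],
    "sat_score": [1150.0, 1850.0, 1300.0, 1950.0],
})

# Keep schools that students rate as safe (above 7.0) and that score above 1800.
safe_high_scoring = schools[(schools["saf_s_11"] > 7.0) & (schools["sat_score"] > 1800)]
print(safe_high_scoring["SCHOOL NAME"].tolist())  # ['B', 'D']
```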
# Compute the average of every numeric column (including the safety scores) for each district.
districts = combined.groupby('school_dist').mean(numeric_only=True)
import os
os.environ['PROJ_LIB'] = 'C:\\a\\Lib\\site-packages\\mpl_toolkits\\basemap'
from mpl_toolkits.basemap import Basemap
def nyc_plot_district(fieldname):
    # Plot one point per district, colored by the given field of the districts DataFrame.
    fig, ax = plt.subplots(figsize = (6,6))
    m = Basemap(projection = 'merc',
                llcrnrlat = 40.496044,
                urcrnrlat = 40.915256,
                llcrnrlon = -74.255735,
                urcrnrlon = -73.700272,
                resolution = 'h')
    m.drawcoastlines(color = 'black', linewidth = 1)
    m.drawmapboundary(fill_color = '#85A6D9')
    # Creating scatterplot
    m.scatter(districts['lon'].tolist(),
              districts['lat'].tolist(),
              zorder = 2, s=20,
              latlon = True,
              c=districts[fieldname],
              cmap = 'summer')
    if fieldname == 'saf_s_11':
        ax.set_title('Heat-Map: District Wise Safety Scores for NYC Schools')
    plt.show()
def nyc_plot_school(df):
    fig, ax = plt.subplots(figsize = (6,6))
    m = Basemap(projection = 'merc',
                llcrnrlat = 40.496044,
                urcrnrlat = 40.915256,
                llcrnrlon = -74.255735,
                urcrnrlon = -73.700272,
                resolution = 'i')
    m.drawcoastlines(color = 'black', linewidth = 1)
    m.drawmapboundary(fill_color = '#85A6D9')
    # Creating scatterplot
    m.scatter(df['lon'].tolist(),
              df['lat'].tolist(),
              zorder = 2, s=20,
              latlon = True,
              c='black')
    ax.set_title('Scatter Plot: NYC Schools')
    plt.show()
# Make a map that shows safety scores by district.
m = Basemap(
    projection='merc',
    llcrnrlat=40.496044,
    urcrnrlat=40.915256,
    llcrnrlon=-74.255735,
    urcrnrlon=-73.700272,
    resolution='i'
)
m.drawmapboundary(fill_color='#85A6D9')
m.drawcoastlines(color='black', linewidth=.8)
m.drawrivers(color='#6D5F47', linewidth=.8)
latitudes = districts['lat'].tolist()
longitudes = districts['lon'].tolist()
m.scatter(longitudes,latitudes,s=20,latlon=True,zorder=2,c = districts["saf_s_11"], cmap="summer")
plt.colorbar()
Schools in Manhattan, the Bronx, and Queens have higher safety scores than schools in Brooklyn.
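The map gives a visual impression; a ranked table of district averages makes the same comparison easier to check. The following is a minimal sketch using hypothetical per-school rows (the district numbers and scores are made up) rather than the real combined data.

```python
import pandas as pd

# Hypothetical per-school rows; district numbers and safety scores are invented.
schools = pd.DataFrame({
    "school_dist": [1, 1, 2, 2, 3],
    "saf_s_11": [6.0, 7.0, 6.5, 6.9, 7.5],
})

# Average the student safety score within each district, then rank the districts.
district_safety = (
    schools.groupby("school_dist")["saf_s_11"]
    .mean()
    .sort_values(ascending=False)
)
print(district_safety)
```

Applied to combined, the top and bottom of this ranking could then be matched to boroughs through the district numbers.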
There are a few columns that indicate the percentage of each race at a given school: white_per, asian_per, black_per, and hispanic_per.
By plotting the correlations between these columns and sat_score, we can determine whether there are any racial differences in SAT performance.
races = ['white_per','asian_per','black_per','hispanic_per']
combined.corr()['sat_score'][races].sort_values().plot.barh();
The chart above indicates that schools with a higher percentage of white or Asian students tend to have higher average SAT scores, whereas schools with a higher percentage of Black or Hispanic students tend to have lower ones.
Let's dig deeper into schools with a high value for hispanic_per and low SAT scores.
combined.plot.scatter(y="sat_score", x="hispanic_per", title="Hispanic Percent vs SAT score");
There is a cluster of schools where hispanic_per is close to 100% and sat_score is below 1200. Schools with hispanic_per greater than 30 rarely exceed a sat_score of 1500. Since the aim is to investigate schools with high hispanic_per and low SAT scores, we dig into schools where more than 95% of students are Hispanic.
high_hispanic = combined[combined['hispanic_per'] > 95]
high_hispanic.plot.scatter(y="sat_score", x="hispanic_per", title="High Hispanic Percent vs SAT score");
Eight schools fall into this category. It would be interesting to see if they have anything in common. These schools are:
high_hispanic.T
| | 44 | 82 | 89 | 125 | 141 | 176 | 253 | 286 |
---|---|---|---|---|---|---|---|---|
DBN | 02M542 | 06M348 | 06M552 | 09X365 | 10X342 | 12X388 | 19K583 | 24Q296 |
SCHOOL NAME | MANHATTAN BRIDGES HIGH SCHOOL | WASHINGTON HEIGHTS EXPEDITIONARY LEARNING SCHOOL | GREGORIO LUPERON HIGH SCHOOL FOR SCIENCE AND M... | ACADEMY FOR LANGUAGE AND TECHNOLOGY | INTERNATIONAL SCHOOL FOR LIBERAL ARTS | PAN AMERICAN INTERNATIONAL HIGH SCHOOL AT MONROE | MULTICULTURAL HIGH SCHOOL | PAN AMERICAN INTERNATIONAL HIGH SCHOOL |
Num of SAT Test Takers | 66 | 70 | 56 | 54 | 49 | 30 | 29 | 55 |
SAT Critical Reading Avg. Score | 336 | 380 | 339 | 315 | 300 | 321 | 279 | 317 |
SAT Math Avg. Score | 378 | 395 | 349 | 339 | 333 | 351 | 322 | 323 |
SAT Writing Avg. Score | 344 | 399 | 326 | 297 | 301 | 298 | 286 | 311 |
sat_score | 1058 | 1174 | 1014 | 951 | 934 | 970 | 887 | 951 |
SchoolName | Manhattan Bridges High School | 0 | GREGORIO LUPERON HS SCI & MATH | Academy for Language and Technology | International School for Liberal Arts | 0 | Multicultural High School | 0 |
AP Test Takers | 67 | 129.029 | 88 | 20 | 55 | 129.029 | 44 | 129.029 |
Total Exams Taken | 102 | 197.038 | 138 | 20 | 73 | 197.038 | 44 | 197.038 |
Number of Exams with scores 3 4 or 5 | 59 | 153.45 | 73 | 20 | 45 | 153.45 | 39 | 153.45 |
Demographic | Total Cohort | 0 | Total Cohort | 0 | Total Cohort | 0 | Total Cohort | Total Cohort |
School Name | MANHATTAN BRIDGES HIGH SCHOOL | 0 | GREGORIO LUPERON HIGH SCHOOL FOR SCIE | 0 | INTERNATIONAL SCHOOL FOR LIBERAL ARTS | 0 | MULTICULTURAL HIGH SCHOOL | PAN AMERICAN INTERNATIONAL HIGH SCHOO |
Cohort | 2006 | 0 | 2006 | 0 | 2006 | 0 | 2006 | 2006 |
Total Cohort | 111 | 193.871 | 91 | 193.871 | 83 | 193.871 | 3 | 1 |
Total Grads - n | 77 | 0 | 74 | 0 | 49 | 0 | s | s |
Total Grads - % of cohort | 69.400000000000006% | 0 | 81.3% | 0 | 59% | 0 | s | s |
Total Regents - n | 63 | 0 | 49 | 0 | 38 | 0 | s | s |
Total Regents - % of cohort | 56.8% | 0 | 53.8% | 0 | 45.8% | 0 | s | s |
Total Regents - % of grads | 81.8% | 0 | 66.2% | 0 | 77.599999999999994% | 0 | s | s |
Advanced Regents - n | 16 | 0 | 13 | 0 | 0 | 0 | s | s |
Advanced Regents - % of cohort | 14.4% | 0 | 14.3% | 0 | 0% | 0 | s | s |
Advanced Regents - % of grads | 20.8% | 0 | 17.600000000000001% | 0 | 0% | 0 | s | s |
Regents w/o Advanced - n | 47 | 0 | 36 | 0 | 38 | 0 | s | s |
Regents w/o Advanced - % of cohort | 42.3% | 0 | 39.6% | 0 | 45.8% | 0 | s | s |
Regents w/o Advanced - % of grads | 61% | 0 | 48.6% | 0 | 77.599999999999994% | 0 | s | s |
Local - n | 14 | 0 | 25 | 0 | 11 | 0 | s | s |
Local - % of cohort | 12.6% | 0 | 27.5% | 0 | 13.3% | 0 | s | s |
Local - % of grads | 18.2% | 0 | 33.799999999999997% | 0 | 22.4% | 0 | s | s |
Still Enrolled - n | 17 | 0 | 8 | 0 | 23 | 0 | s | s |
... | ... | ... | ... | ... | ... | ... | ... | ... |
partner_cbo | The Young Men and Women Hebrew Association (YM... | Alianza Dominicana, Culturarte, Point of View | Children’s Aid Society, Leadership Program, Wo... | Morris Heights Clinic | 0 | Internationals Network for Public Schools, Urb... | Cypress Hills Local Development Corporation | Make the Road New York, Latino Youth for Highe... |
partner_hospital | St. Vincent's Hospital, New York-Presbyterian ... | New York-Presbyterian Hospital | 0 | St. Jude Children's Research Hospital | 0 | 0 | North Shore/ Long Island Jewish Hospital | 0 |
partner_highered | Cornell University, Columbia University, Roche... | 0 | The City College, Lehman College, Bronx Commun... | Monroe College, Skidmore College, New York Uni... | Fordham University | College Now at Hostos Community College | LaGuardia Community College | Columbia University, New York University |
partner_cultural | El Museo del Barrio, Carnegie Hall, Metropolit... | Bright Lights/Young Audiences New York, City a... | Lincoln Center for the Performing Arts and Las... | 0 | 0 | Learning through Ecology and Environmental Fie... | 0 | Queens Museum of Art, Metropolitan Museum of M... |
partner_nonprofit | New Visions for Public Schools, National Acade... | Isabella Care Center, Medical Center Nursery S... | 0 | Pencil Foundation | 0 | College Action: Research & Action (CARA), New ... | PENCIL | Internationals Network for Public Schools |
partner_corporate | Latin Vision Media, Urban Latino Magazine Inc.... | Harper Collins, NBC, The New York Times, Schoo... | 0 | 0 | 0 | 0 | 0 | 0 |
partner_financial | Chase Manhattan Bank, Bank of America | Neighborhood Trust Credit Union, Capital One | 0 | 0 | 0 | 0 | 0 | 0 |
partner_other | Manhattan District Attorney's Office, New York... | 0 | 0 | We are a National Academy Foundation (NAF) sch... | 0 | 0 | 0 | 0 |
addtl_info1 | Dress Code Required: white shirt/blouse, black... | Uniform Required: school shirt (available for ... | Uniform Required: white shirt/blouse, navy blu... | Students are expected to complete a project by... | Uniform Required: Boys - white collared shirt,... | Student and Parent Summer Orientation, Academi... | Uniform Required: light blue shirt with school... | 0 |
addtl_info2 | Community Service Requirement, Extended Day Pr... | 0 | 0 | Community Service Requirement, Extended Day Pr... | Community Service Requirement | Community Service Requirement, Extended Day Pr... | 0 | Community Service Requirement, Internship Requ... |
start_time | 8:00 AM | 8:00 AM | 8:00 AM | 8:00 AM | 8:00 AM | 8:30 AM | 8:15 AM | 8:30 AM |
end_time | 3:45 PM | 3:00 PM | 3:30 PM | 4:00 PM | 3:45 PM | 5:30 PM | 3:15 PM | 3:15 PM |
se_services | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... |
ell_programs | ESL; Dual Language: Spanish; Transitional Bili... | ESL | ESL; Transitional Bilingual Program: Spanish | ESL; Transitional Bilingual Program: Spanish | ESL; Transitional Bilingual Program: Spanish | ESL | ESL; Transitional Bilingual Program: Spanish | ESL |
school_accessibility_description | Functionally Accessible | Functionally Accessible | Functionally Accessible | Not Functionally Accessible | Functionally Accessible | Functionally Accessible | Functionally Accessible | Functionally Accessible |
number_programs | 4 | 1 | 1 | 3 | 1 | 1 | 1 | 1 |
priority01 | Open only to New York City residents whose hom... | Priority to continuing 8th graders | Open only to New York City residents who have ... | Open only to New York City residents who have ... | Open only to New York City residents whose hom... | Open only to New York City residents living in... | Open only to New York City residents who have ... | Open only to New York City residents living in... |
priority02 | 0 | Then to District 6 students or residents who a... | Priority to Manhattan students or residents wh... | 0 | Priority to continuing 8th graders | 0 | 0 | 0 |
priority03 | 0 | Then to New York City residents who attend an ... | Then to Bronx students or residents who have l... | 0 | Then to New York City residents | 0 | 0 | 0 |
priority04 | 0 | Then to District 6 students or residents | Then to New York City residents who have lived... | 0 | 0 | 0 | 0 | 0 |
priority05 | 0 | Then to New York City residents | 0 | 0 | 0 | 0 | 0 | 0 |
priority06 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
priority07 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
priority08 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
priority09 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
priority10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Location 1 | 525 West 50Th Street\nNew York, NY 10019\n(40.... | 511 West 182Nd Street\nNew York, NY 10033\n(40... | 501 West 165Th\nNew York, NY 10032\n(40.838032... | 1700 Macombs Road\nBronx, NY 10453\n(40.849102... | 2780 Reservoir Avenue\nBronx, NY 10468\n(40.87... | 1300 Boynton Avenue\nBronx, NY 10472\n(40.8313... | 999 Jamaica Avenue\nBrooklyn, NY 11208\n(40.69... | 45-10 94Th Street\nElmhurst, NY 11373\n(40.743... |
lat | 40.765 | 40.8489 | 40.838 | 40.8491 | 40.8704 | 40.8314 | 40.6911 | 40.7433 |
lon | -73.9925 | -73.9308 | -73.9384 | -73.9161 | -73.8982 | -73.8788 | -73.8684 | -73.8706 |
school_dist | 2 | 6 | 6 | 9 | 10 | 12 | 19 | 24 |
160 rows × 8 columns
We try to gain insight into these schools by investigating whether any of their features correlate well with SAT scores. We focus on features that correlate negatively with sat_score, to narrow our scope to why these schools have low SAT scores.
hi_hispanic_corr = high_hispanic.corr()
mask = hi_hispanic_corr['sat_score'] < -0.25
hi_hispanic_corr[mask]['sat_score']
Cohort                              -0.363129
CSD                                 -0.629173
NUMBER OF STUDENTS / SEATS FILLED   -0.513313
NUMBER OF SECTIONS                  -0.569436
ell_num                             -0.676540
ell_percent                         -0.885803
hispanic_per                        -0.805102
female_per                          -0.820237
rr_t                                -0.442326
zip                                 -0.705854
lon                                 -0.640916
school_dist                         -0.629173
Name: sat_score, dtype: float64
The columns zip, lon, school_dist, and CSD are location codes and identifiers, so their correlations are not meaningful. As expected, hispanic_per has a strong negative correlation. One key insight stands out: ell_percent, the percentage of English learners, has the strongest negative correlation with SAT scores. Since these schools are mostly Hispanic, it makes sense that they have many English learners, which would affect comprehension of subjects taught in English and hence lower SAT scores. The data also suggests that a higher percentage of female students in these schools is associated with lower SAT scores, although with only eight schools these correlations should be read with caution.
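To keep identifier-like columns out of the ranking, they can be dropped from the correlation Series before it is read. The sketch below rebuilds the Series from the values printed above; the list of columns to exclude is a judgment call taken from this discussion, not something pandas can detect on its own.

```python
import pandas as pd

# Correlations with sat_score, copied from the output above.
neg_corr = pd.Series({
    "Cohort": -0.363129,
    "CSD": -0.629173,
    "NUMBER OF STUDENTS / SEATS FILLED": -0.513313,
    "NUMBER OF SECTIONS": -0.569436,
    "ell_num": -0.676540,
    "ell_percent": -0.885803,
    "hispanic_per": -0.805102,
    "female_per": -0.820237,
    "rr_t": -0.442326,
    "zip": -0.705854,
    "lon": -0.640916,
    "school_dist": -0.629173,
}, name="sat_score")

# Columns treated here as identifiers or location codes (an assumption).
id_like = ["zip", "lon", "school_dist", "CSD"]

# Drop them and sort so the most negative correlation comes first.
interpretable = neg_corr.drop(labels=id_like).sort_values()
print(interpretable)
```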
Let us now look at the opposite case: schools with a hispanic_per of less than 10% and an average SAT score greater than 1800.
low_hispanic = combined[(combined['hispanic_per'] < 10) & (combined['sat_score'] > 1800)]
low_hispanic.T
| | 37 | 151 | 187 | 327 | 356 |
---|---|---|---|---|---|
DBN | 02M475 | 10X445 | 13K430 | 28Q687 | 31R605 |
SCHOOL NAME | STUYVESANT HIGH SCHOOL | BRONX HIGH SCHOOL OF SCIENCE | BROOKLYN TECHNICAL HIGH SCHOOL | QUEENS HIGH SCHOOL FOR THE SCIENCES AT YORK CO... | STATEN ISLAND TECHNICAL HIGH SCHOOL |
Num of SAT Test Takers | 832 | 731 | 1277 | 121 | 227 |
SAT Critical Reading Avg. Score | 679 | 632 | 587 | 612 | 635 |
SAT Math Avg. Score | 735 | 688 | 659 | 660 | 682 |
SAT Writing Avg. Score | 682 | 649 | 587 | 596 | 636 |
sat_score | 2096 | 1969 | 1833 | 1868 | 1953 |
SchoolName | STUYVESANT HS | BRONX HS OF SCIENCE | BROOKLYN TECHNICAL HS | Queens HS for Science York Colllege | STATEN ISLAND TECHNICAL HS |
AP Test Takers | 1510 | 1190 | 2117 | 215 | 528 |
Total Exams Taken | 2819 | 2435 | 3692 | 338 | 905 |
Number of Exams with scores 3 4 or 5 | 2648 | 2189 | 2687 | 275 | 809 |
Demographic | Total Cohort | Total Cohort | Total Cohort | Total Cohort | Total Cohort |
School Name | STUYVESANT HIGH SCHOOL | BRONX HIGH SCHOOL OF SCIENCE | BROOKLYN TECHNICAL HIGH SCHOOL | QUEENS HIGH SCHOOL FOR THE SCIENCES A | STATEN ISLAND TECHNICAL HIGH SCHOOL |
Cohort | 2006 | 2006 | 2006 | 2006 | 2006 |
Total Cohort | 787 | 668 | 1097 | 107 | 288 |
Total Grads - n | 774 | 657 | 1003 | 102 | 287 |
Total Grads - % of cohort | 98.3% | 98.4% | 91.4% | 95.3% | 99.7% |
Total Regents - n | 774 | 657 | 1003 | 102 | 287 |
Total Regents - % of cohort | 98.3% | 98.4% | 91.4% | 95.3% | 99.7% |
Total Regents - % of grads | 100% | 100% | 100% | 100% | 100% |
Advanced Regents - n | 770 | 653 | 898 | 101 | 276 |
Advanced Regents - % of cohort | 97.8% | 97.8% | 81.900000000000006% | 94.4% | 95.8% |
Advanced Regents - % of grads | 99.5% | 99.4% | 89.5% | 99% | 96.2% |
Regents w/o Advanced - n | 4 | 4 | 105 | 1 | 11 |
Regents w/o Advanced - % of cohort | 0.5% | 0.6% | 9.6% | 0.9% | 3.8% |
Regents w/o Advanced - % of grads | 0.5% | 0.6% | 10.5% | 1% | 3.8% |
Local - n | 0 | 0 | 0 | 0 | 0 |
Local - % of cohort | 0% | 0% | 0% | 0% | 0% |
Local - % of grads | 0% | 0% | 0% | 0% | 0% |
Still Enrolled - n | 10 | 11 | 87 | 5 | 1 |
... | ... | ... | ... | ... | ... |
partner_cbo | 0 | Riverdale Young Men/Women Hebrew Association (... | Brooklyn Tech Alumni Foundation, Ft. Greene As... | 0 | United Activities Unlimited (UAU), Seamen's So... |
partner_hospital | Bellevue Hospital Center, New York-Presbyteria... | The Hebrew Home for the Aged | Brooklyn Hospital Center, Mount Sinai Hospital | Mount Sinai School of Medicine | Staten Island University Hospital, Richmond Un... |
partner_highered | New York Law School, New York University (NYU)... | State University of New York (SUNY) Albany, Co... | Polytechnic University, Long Island University... | York College | MIT, Columbia University, St. John’s Universit... |
partner_cultural | 0 | ArtsConnection, Scholastic Arts and Writing, M... | Brooklyn Academy of Music, Manhattan Theater C... | 92nd Street Y, Brooklyn Academy of Music (BAM)... | Snug Harbor, Staten Island Children's Museum, ... |
partner_nonprofit | American Red Cross, UNICEF, American Cancer So... | Bronx Zoo, Hennessy Family Foundation | Brooklyn Tech Alumni Foundation, Junior Achiev... | 0 | 0 |
partner_corporate | 0 | Akin Gump Strauss Hauer & Feld LLP, Con Edison | Con Edison, National Grid, Mancini-Duffy Archi... | 0 | Port Authority of New York and New Jersey Con ... |
partner_financial | Citicorp, Goldman Sachs | Credit Suisse | 0 | 0 | 0 |
partner_other | Explorer Program | U.S. Department of State, New York State Supre... | 0 | 0 | 0 |
addtl_info1 | This is one of New York City's eight (8) Speci... | ), and a list of sports, coaches, and achievem... | from your guidance counselor. | This is one of New York City's eight (8) Speci... | This is one of New York City’s eight (8) Speci... |
addtl_info2 | 0 | 0 | 0 | 0 | 0 |
start_time | 8:00 AM | 8:00 AM | 8:45 AM | 8:00 AM | 7:45 AM |
end_time | 3:30 PM | 3:45 PM | 3:15 PM | 3:18 PM | 2:30 PM |
se_services | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... |
ell_programs | ESL | ESL | ESL | ESL | ESL |
school_accessibility_description | Functionally Accessible | Functionally Accessible | Functionally Accessible | Functionally Accessible | Functionally Accessible |
number_programs | 1 | 1 | 1 | 1 | 1 |
priority01 | Open to New York City residents | Open to New York City residents | Open to New York City residents | Open to New York City residents who take the S... | Open to New York City residents |
priority02 | 0 | 0 | 0 | 0 | 0 |
priority03 | 0 | 0 | 0 | 0 | 0 |
priority04 | 0 | 0 | 0 | 0 | 0 |
priority05 | 0 | 0 | 0 | 0 | 0 |
priority06 | 0 | 0 | 0 | 0 | 0 |
priority07 | 0 | 0 | 0 | 0 | 0 |
priority08 | 0 | 0 | 0 | 0 | 0 |
priority09 | 0 | 0 | 0 | 0 | 0 |
priority10 | 0 | 0 | 0 | 0 | 0 |
Location 1 | 345 Chambers Street\nNew York, NY 10282\n(40.7... | 75 West 205 Street\nBronx, NY 10468\n(40.87705... | 29 Ft Greene Place\nBrooklyn, NY 11217\n(40.68... | 94-50 159 Street\nJamaica, NY 11433\n(40.70099... | 485 Clawson Street\nStaten Island, NY 10306\n(... |
lat | 40.7177 | 40.8771 | 40.6881 | 40.701 | 40.5679 |
lon | -74.014 | -73.8898 | -73.9767 | -73.7982 | -74.1154 |
school_dist | 2 | 10 | 13 | 28 | 31 |
160 rows × 5 columns
low_hispanic_corr = low_hispanic.corr()
mask = low_hispanic_corr['sat_score'] > 0.25
low_hispanic_corr[mask]['sat_score']
SAT Critical Reading Avg. Score         0.985722
SAT Math Avg. Score                     0.982875
SAT Writing Avg. Score                  0.987101
sat_score                               1.000000
Number of Exams with scores 3 4 or 5    0.323419
SIZE OF LARGEST CLASS                   0.382332
male_per                                0.587097
com_p_11                                0.368534
eng_p_11                                0.563521
saf_s_11                                0.299506
eng_s_11                                0.253896
Name: sat_score, dtype: float64
Nothing quite stands out in these numbers, apart from the three section averages, which correlate with sat_score by construction. Students think the schools are relatively safe. We further our analysis by looking for information elsewhere, for example on Google.
On further investigation, I found that most of these schools are technical schools that require their students to pass a standardized test before admission. We can assume they have "better students".
Moving forward, we look into the data to find if there is any gender bias in SAT scores.
gender = ['male_per', "female_per"]
combined.corr()['sat_score'][gender].plot.barh();
# fields_corr = combined[fields].corr()
# fields_corr.plot.barh("male_per","sat_score");
There is a positive correlation for females and negative correlation for males. Nevertheless, the correlation is not strong.
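One likely reason the two signs are opposite: if male_per and female_per add up to 100 at every school, their correlations with any third variable are equal in size and opposite in sign. The sketch below illustrates this with made-up numbers; that the two columns sum to 100 in combined is an assumption and has not been checked here.

```python
import numpy as np

# Hypothetical data; male_per is defined as the complement of female_per,
# which is an assumption about how the two columns relate.
female_per = np.array([35.0, 48.0, 52.0, 61.0, 100.0])
male_per = 100.0 - female_per
sat_score = np.array([1150.0, 1230.0, 1190.0, 1400.0, 1320.0])

r_female = np.corrcoef(female_per, sat_score)[0, 1]
r_male = np.corrcoef(male_per, sat_score)[0, 1]

# The two coefficients have the same magnitude and opposite signs.
print(round(r_female, 3), round(r_male, 3))
```

If this holds for the real data, the two bars carry the same information, and one of the columns is enough for the analysis.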
combined.plot.scatter('female_per','sat_score', title="Female Percent Vs. SAT score");
The plot shows no strong relationship. Most schools have a female percentage between 40 and 60. Schools that are more than 60% female tend to have SAT scores above 1000, and some all-girls schools appear to do well on the SAT. We focus next on schools with a high female percentage and high SAT scores.
combined[(combined['female_per'] > 60) & (combined['sat_score'] > 1700)][['SCHOOL NAME','sat_score']]
| | SCHOOL NAME | sat_score |
---|---|---|
5 | BARD HIGH SCHOOL EARLY COLLEGE | 1856.0 |
26 | ELEANOR ROOSEVELT HIGH SCHOOL | 1758.0 |
60 | BEACON HIGH SCHOOL | 1744.0 |
61 | FIORELLO H. LAGUARDIA HIGH SCHOOL OF MUSIC & A... | 1707.0 |
302 | TOWNSEND HARRIS HIGH SCHOOL | 1910.0 |
On Googling these schools, I found that they appear to be selective schools with high academic standards, several of them with an arts or liberal arts focus.
High school students in the U.S. take Advanced Placement (AP) exams to earn college credit.
It makes sense that the number of students at a school who took AP exams would be highly correlated with the school's SAT scores. Let's explore this relationship.
# let us recall our dataset
combined.head().T
| | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
DBN | 01M292 | 01M448 | 01M450 | 01M509 | 01M539 |
SCHOOL NAME | HENRY STREET SCHOOL FOR INTERNATIONAL STUDIES | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | EAST SIDE COMMUNITY SCHOOL | MARTA VALLE HIGH SCHOOL | NEW EXPLORATIONS INTO SCIENCE, TECHNOLOGY AND ... |
Num of SAT Test Takers | 29 | 91 | 70 | 44 | 159 |
SAT Critical Reading Avg. Score | 355 | 383 | 377 | 390 | 522 |
SAT Math Avg. Score | 404 | 423 | 402 | 433 | 574 |
SAT Writing Avg. Score | 363 | 366 | 370 | 384 | 525 |
sat_score | 1122 | 1172 | 1149 | 1207 | 1621 |
SchoolName | 0 | UNIVERSITY NEIGHBORHOOD H.S. | EAST SIDE COMMUNITY HS | 0 | NEW EXPLORATIONS SCI,TECH,MATH |
AP Test Takers | 129.029 | 39 | 19 | 129.029 | 255 |
Total Exams Taken | 197.038 | 49 | 21 | 197.038 | 377 |
Number of Exams with scores 3 4 or 5 | 153.45 | 10 | 153.45 | 153.45 | 191 |
Demographic | Total Cohort | Total Cohort | Total Cohort | Total Cohort | Total Cohort |
School Name | HENRY STREET SCHOOL FOR INTERNATIONAL | UNIVERSITY NEIGHBORHOOD HIGH SCHOOL | EAST SIDE COMMUNITY SCHOOL | MARTA VALLE HIGH SCHOOL | NEW EXPLORATIONS INTO SCIENCE TECHNO |
Cohort | 2006 | 2006 | 2006 | 2006 | 2006 |
Total Cohort | 78 | 124 | 90 | 84 | 46 |
Total Grads - n | 43 | 53 | 70 | 47 | 46 |
Total Grads - % of cohort | 55.1% | 42.7% | 77.8% | 56% | 100% |
Total Regents - n | 36 | 42 | 67 | 40 | 46 |
Total Regents - % of cohort | 46.2% | 33.9% | 74.400000000000006% | 47.6% | 100% |
Total Regents - % of grads | 83.7% | 79.2% | 95.7% | 85.1% | 100% |
Advanced Regents - n | 0 | 8 | 0 | 17 | 31 |
Advanced Regents - % of cohort | 0% | 6.5% | 0% | 20.2% | 67.400000000000006% |
Advanced Regents - % of grads | 0% | 15.1% | 0% | 36.200000000000003% | 67.400000000000006% |
Regents w/o Advanced - n | 36 | 34 | 67 | 23 | 15 |
Regents w/o Advanced - % of cohort | 46.2% | 27.4% | 74.400000000000006% | 27.4% | 32.6% |
Regents w/o Advanced - % of grads | 83.7% | 64.2% | 95.7% | 48.9% | 32.6% |
Local - n | 7 | 11 | 3 | 7 | 0 |
Local - % of cohort | 9% | 8.9% | 3.3% | 8.300000000000001% | 0% |
Local - % of grads | 16.3% | 20.8% | 4.3% | 14.9% | 0% |
Still Enrolled - n | 16 | 46 | 15 | 25 | 0 |
... | ... | ... | ... | ... | ... |
partner_cbo | The Henry Street Settlement; Asia Society; Ame... | Grand Street Settlement, Henry Street Settleme... | University Settlement, Big Brothers Big Sister... | NYCDOE Innovation Zone Lab Site, Grand Street ... | 7th Precinct Community Affairs, NYCWastele$$, ... |
partner_hospital | Gouverneur Hospital (Turning Points) | Gouverneur Hospital, The Door, The Mount Sinai... | 0 | Gouvenuer's Hospital | 0 |
partner_highered | New York University | New York University, CUNY Baruch College, Pars... | Columbia Teachers College, New York University... | New York University (NYU), Sarah Lawrence Coll... | Hunter College, New York University, Cornell U... |
partner_cultural | Asia Society | Dance Film Association, Dance Makers Film Work... | , Internship Program, Loisaida Art Gallery loc... | Young Audiences, The National Arts Club, Educa... | VH1, Dancing Classrooms, Center for Arts Educa... |
partner_nonprofit | Heart of America Foundation | W!SE, Big Brothers Big Sisters, Peer Health Ex... | College Bound Initiative, Center for Collabora... | College for Every Student (CFES), Morningside ... | After 3 |
partner_corporate | 0 | Deloitte LLP Consulting and Financial Services... | Prudential Securities, Moore Capital, Morgan S... | Estée Lauder | Time Warner Cable, Google, IBM, MET Project, S... |
partner_financial | 0 | 0 | 0 | Bank of America | 0 |
partner_other | United Nations | Movement Research | Brooklyn Boulders (Rock Climbing) | CASALEAP, Beacon | 0 |
addtl_info1 | 0 | Incoming students are expected to attend schoo... | Students present and defend their work to comm... | Students Dress for Success, Summer Bridge to S... | Dress Code Required: Business Casual - shirt/b... |
addtl_info2 | 0 | Community Service Requirement, Dress Code Requ... | Our school requires an Academic Portfolio for ... | Community Service Requirement, Extended Day Pr... | 0 |
start_time | 8:30 AM | 8:15 AM | 8:30 AM | 8:00 AM | 8:15 AM |
end_time | 3:30 PM | 3:15 PM | 3:30 PM | 3:30 PM | 4:00 PM |
se_services | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... | This school will provide students with disabil... |
ell_programs | ESL | ESL | ESL | ESL | ESL |
school_accessibility_description | Functionally Accessible | Not Functionally Accessible | Not Functionally Accessible | Functionally Accessible | Not Functionally Accessible |
number_programs | 1 | 3 | 1 | 1 | 1 |
priority01 | Priority to continuing 8th graders | Open to New York City residents | Priority to continuing 8th graders | Priority to District 1 students or residents | Priority to continuing 8th graders |
priority02 | Then to Manhattan students or residents who at... | For M35B only: Open only to students whose hom... | Then to New York City residents | Then to Manhattan students or residents | Then to New York City residents |
priority03 | Then to New York City residents who attend an ... | 0 | 0 | Then to New York City residents | 0 |
priority04 | Then to Manhattan students or residents | 0 | 0 | 0 | 0 |
priority05 | Then to New York City residents | 0 | 0 | 0 | 0 |
priority06 | 0 | 0 | 0 | 0 | 0 |
priority07 | 0 | 0 | 0 | 0 | 0 |
priority08 | 0 | 0 | 0 | 0 | 0 |
priority09 | 0 | 0 | 0 | 0 | 0 |
priority10 | 0 | 0 | 0 | 0 | 0 |
Location 1 | 220 Henry Street\nNew York, NY 10002\n(40.7137... | 200 Monroe Street\nNew York, NY 10002\n(40.712... | 420 East 12 Street\nNew York, NY 10009\n(40.72... | 145 Stanton Street\nNew York, NY 10002\n(40.72... | 111 Columbia Street\nNew York, NY 10002\n(40.7... |
lat | 40.7138 | 40.7123 | 40.7298 | 40.7206 | 40.7187 |
lon | -73.9853 | -73.9848 | -73.983 | -73.9857 | -73.9794 |
school_dist | 1 | 1 | 1 | 1 | 1 |
160 rows × 5 columns
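# Compute the percentage of students at each school who took an AP exam.
# The 'AP Test Takers ' column (note the trailing space in its name) holds the number
# of students who took AP exams; missing values appear to have been filled with an average.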
combined['ap_per'] = (combined['AP Test Takers '] / combined['total_enrollment']) * 100
# ap_per.head()
combined.plot.scatter('ap_per','sat_score');
combined.corr()['sat_score']['ap_per']
0.0571708139076698
Schools where more than 50% of students took AP exams appear more likely to have higher SAT scores, but the overall correlation (about 0.06) is very weak.
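One thing that may be diluting this correlation is that the value 129.028846 repeats across schools in the AP Test Takers column, which suggests a fill value rather than a reported count. The sketch below flags such rows before computing ap_per. The AP figures come from the first five rows of combined; the total_enrollment figures are hypothetical, and treating the repeated value as a fill is an assumption.

```python
import pandas as pd

# AP Test Takers values from the first five rows of combined;
# the total_enrollment figures are hypothetical.
df = pd.DataFrame({
    "AP Test Takers ": [129.028846, 39.0, 19.0, 129.028846, 255.0],
    "total_enrollment": [400, 390, 600, 370, 1700],
})

# Assumption: the value repeated across schools is a fill value, not a real count.
FILL_VALUE = 129.028846
df["ap_imputed"] = (df["AP Test Takers "] - FILL_VALUE).abs() < 1e-6

# Percentage of enrolled students who took an AP exam.
df["ap_per"] = df["AP Test Takers "] / df["total_enrollment"] * 100

# Keep only the schools whose AP count looks like a reported figure.
reported_only = df[~df["ap_imputed"]]
print(reported_only["ap_per"].round(1).tolist())
```

Recomputing the correlation on the reported rows only would show whether the fill values are masking a relationship.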
combined.corr()['sat_score'][['AVERAGE CLASS SIZE','NUMBER OF STUDENTS / SEATS FILLED']]
AVERAGE CLASS SIZE                   0.381014
NUMBER OF STUDENTS / SEATS FILLED    0.394626
Name: sat_score, dtype: float64
combined.plot.scatter('AVERAGE CLASS SIZE','sat_score');
combined.plot.scatter('NUMBER OF STUDENTS / SEATS FILLED','sat_score');
There is definitely a case to be made for larger classes: AVERAGE CLASS SIZE has a moderate positive correlation with sat_score (about 0.38). The NUMBER OF STUDENTS / SEATS FILLED column points the same way (about 0.39): schools with more students tend to have higher SAT scores.
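A complementary way to look at this is to group schools into class-size bands and compare the mean SAT score in each band. The sketch below uses pd.cut for the banding; the data and the band edges are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical class sizes and scores, for illustration only.
df = pd.DataFrame({
    "AVERAGE CLASS SIZE": [18.0, 21.0, 24.0, 27.0, 29.0, 32.0],
    "sat_score": [1100.0, 1150.0, 1200.0, 1250.0, 1400.0, 1500.0],
})

# Assign each school to a class-size band (right-inclusive intervals),
# then compare the mean SAT score per band.
bands = pd.cut(df["AVERAGE CLASS SIZE"], bins=[15, 20, 25, 30, 35])
mean_by_band = df.groupby(bands, observed=True)["sat_score"].mean()
print(mean_by_band)
```

A pattern like this describes an association only; larger schools may differ from smaller ones in many other ways that also affect scores.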
From the subset of data analysed, these conclusions were reached.