ZBP to ZCTA¶

Retrieves data from the Census Bureau's ZIP Code Business Patterns API for a specific area and summarizes it by ZCTA. Three tables are generated: one for employees that contains employment, establishments, and wages, one for industries that contains counts of establishments by 2-digit sector NAICS codes, and one reference table that correlates sector numbers and names. Initial data retrieved from API is written to json, final output is written to a SQLite database.

Confirmed to work with the 2018 ZBP series

https://www.census.gov/data/developers/data-sets/cbp-nonemp-zbp/zbp-api.html

Variables¶

In [1]:

import pandas as pd, requests, sqlite3, os, json
from IPython.display import clear_output

In [2]:

#Crosswalk files - update only if necessary
uszips_file='zip_to_zcta_2019_uds.csv'
zcta_file='geocorr14_modified.csv'

uszips_path=os.path.join('inputs',uszips_file)
zcta_path=zcta_file=os.path.join('inputs',zcta_file)

#Dump files for api data storage
ejsonpath=os.path.join('outputs', 'emp_data.json')
ijsonpath=os.path.join('outputs', 'ind_data.json')
cjsonpath=os.path.join('outputs', 'codes_data.json')

#API variables - UPDATE THE YEAR
keyfile='census_key.txt'

year='2018'
dsource='zbp'
state='36'
ecols='ESTAB,EMP,PAYQTR1,PAYANN'
icols='ESTAB'
ncodes=['00','11','21','22','23','31-33','42','44-45','48-49',
        '51','52','53','54','55','56','61','62','71','72','81',
        '99']

#SQL output - UPDATE EACH TABLE NAME
dbname=os.path.join('outputs','testdb.sqlite')
emptable='zbp2018emp'
indtable='zbp2018ind'
codetable='zbp2018indcodes'

Crosswalking¶

Read in the files that relate US ZIP codes to ZCTAs (from JSI) and ZCTAs to counties for local area (from MCDC Geocorr), then join them by ZIP Code to create a ZCTA to ZIP table for the local area

In [3]:

uszips=pd.read_csv(uszips_path, sep=',', dtype={'ZIP_CODE':str, 'ZCTA':str})
uszips.head()

Out[3]:

	ZIP_CODE	PO_NAME	STATE	ZIP_TYPE	ZCTA	zip_join_type
0	00501	Holtsville	NY	Post Office or large volume customer	11742	Spatial join to ZCTA
1	00544	Holtsville	NY	Post Office or large volume customer	11742	Spatial join to ZCTA
2	00601	Adjuntas	PR	Zip Code Area	00601	Zip Matches ZCTA
3	00602	Aguada	PR	Zip Code Area	00602	Zip Matches ZCTA
4	00603	Aguadilla	PR	Zip Code Area	00603	Zip Matches ZCTA

In [4]:

#All ZIP Codes in US
uszips.shape

Out[4]:

(41107, 6)

In [5]:

zcta=pd.read_csv(zcta_path, sep=',', dtype={'zcta5':str, 'county14':str})
zcta.head()

Out[5]:

	zcta5	county14	cntyname2	zipname	pop10	afact
0	10001	36061	New York NY	New York, NY	21102	1.0
1	10002	36061	New York NY	New York, NY	81410	1.0
2	10003	36061	New York NY	New York, NY	56024	1.0
3	10004	36061	New York NY	New York, NY	3089	1.0
4	10005	36061	New York NY	New York, NY	7135	1.0

In [6]:

#ZCTAs in local area
zcta.shape

Out[6]:

(214, 6)

In [7]:

#Merge ZIP Codes with ZCTAs for local area
zip2zcta = pd.merge(uszips[['ZIP_CODE','ZIP_TYPE','PO_NAME','ZCTA']],zcta[['zcta5','county14']],how='right', 
                    left_on='ZCTA', right_on='zcta5').set_index('ZIP_CODE')
zip2zcta.drop(columns=['ZCTA'],inplace=True)
zip2zcta.head()

Out[7]:

	ZIP_TYPE	PO_NAME	zcta5	county14
ZIP_CODE
10001	Zip Code Area	New York	10001	36061
10118	Post Office or large volume customer	New York	10001	36061
10120	Post Office or large volume customer	New York	10001	36061
10122	Post Office or large volume customer	New York	10001	36061
10123	Post Office or large volume customer	New York	10001	36061

In [8]:

#ZIP Codes in local area
zip2zcta.shape

Out[8]:

(313, 4)

API Call and Processing¶

Request the ZBP data from the Census Bureau for the state, join the ZBP data to the local ZCTA to ZIP table based on ZIP Code, and group the data by ZCTA

In [9]:

with open(keyfile) as key:
    api_key=key.read().strip()

In [10]:

base_url = f'https://api.census.gov/data/{year}/{dsource}'
base_url

Out[10]:

'https://api.census.gov/data/2018/zbp'

ZBP Employment Data¶

This data is requested in a series of chunks which contain multiple ZIP Codes - do not rerun the requests block if retrieval is successful but subsequent notebook changes are needed. Proceed to the next block and pull data from json dump file.

In [11]:

def chunks(l, n):
    # For item i in a range that is a length of l,
    for i in range(0, len(l), n):
        # Create an index range for l of n items:
        yield l[i:i+n]

In [12]:

reqzips=list(chunks(zip2zcta.index.tolist(),48))
print('Number of chunks:',len(reqzips))

Number of chunks: 7

THIS BLOCK IS A REQUESTS BLOCK!¶

In [13]:

#Code 200 = success, do not rerun this block unless it's necessary
emp_data=[]
for i, v in enumerate (reqzips):
    batchzips=','.join(v)
    edata_url = f'{base_url}?get={ecols}&EMPSZES=001&for=zipcode:{batchzips}&key={api_key}'
    response=requests.get(edata_url)
    if response.status_code==200:
        clear_output(wait=True)
        data=response.json()
        if i == 0:    
            for record in data:
                emp_data.append(record)
        else:
            for record in data[1:]:
                emp_data.append(record) 
        print('Retrieved data for chunk',i)
    else:
        print('***Problem with retrieval***, response code',response.status_code)
        break
with open(ejsonpath, 'w') as f:
    json.dump(emp_data, f)
print('Done - Data dumped to json file')

Retrieved data for chunk 6
Done - Data dumped to json file

In [14]:

with open(ejsonpath, 'r') as f:
    ejsondata=json.load(f)
zbpemp=pd.DataFrame(ejsondata[1:], columns=ejsondata[0]).rename(columns={'PAYQTR1':'PAYQ1','PAYANN':'PAYAN','zip code':'zipcode'}).set_index('zipcode')
for field in zbpemp.columns:
    zbpemp=zbpemp.astype(dtype={field:'int64'})
zbpemp.drop(columns=['EMPSZES'],inplace=True)
zbpemp.head()

Out[14]:

	ESTAB	EMP	PAYQ1	PAYAN
zipcode
10118	326	6358	202796	815737
10120	70	1378	41023	132076
10121	69	6104	199451	887899
10122	177	1917	40477	156824
10123	180	2036	51353	195513

In [15]:

#ZIP Codes retrieved - may differ from zip2zcta as some zips have no businesses
zbpemp.shape

Out[15]:

(266, 4)

In [16]:

#Flag columns count the number of establishments for which data is not disclosed
flags=['FLAG_EMP','FLAG_PAYQ1','FLAG_PAYAN']
for flagcol in flags:
    datacol=flagcol.split('_')[1]
    zbpemp[flagcol]=0
    zbpemp.loc[zbpemp[datacol] == 0, flagcol] = zbpemp['ESTAB']
zbpemp.head()

Out[16]:

	ESTAB	EMP	PAYQ1	PAYAN	FLAG_EMP	FLAG_PAYQ1	FLAG_PAYAN
zipcode
10118	326	6358	202796	815737	0	0	0
10120	70	1378	41023	132076	0	0	0
10121	69	6104	199451	887899	0	0	0
10122	177	1917	40477	156824	0	0	0
10123	180	2036	51353	195513	0	0	0

In [17]:

#Join to ZIP ZCTA crosswalk
zbpemp2zcta = pd.merge(zip2zcta,zbpemp,how='inner',left_index=True,right_index=True)
zbpemp2zcta.index.name = 'ZIP_CODE'
zbpemp2zcta.head()

Out[17]:

	ZIP_TYPE	PO_NAME	zcta5	county14	ESTAB	EMP	PAYQ1	PAYAN	FLAG_EMP	FLAG_PAYQ1	FLAG_PAYAN
ZIP_CODE
10001	Zip Code Area	New York	10001	36061	7248	151769	2717186	10646611	0	0	0
10118	Post Office or large volume customer	New York	10001	36061	326	6358	202796	815737	0	0	0
10120	Post Office or large volume customer	New York	10001	36061	70	1378	41023	132076	0	0	0
10122	Post Office or large volume customer	New York	10001	36061	177	1917	40477	156824	0	0	0
10123	Post Office or large volume customer	New York	10001	36061	180	2036	51353	195513	0	0	0

In [18]:

#ZIP codes in the local area that appear in the ZBP data
zbpemp2zcta.shape

Out[18]:

(266, 11)

In [19]:

#Aggregate to ZCTAs
zctaemp=zbpemp2zcta[['zcta5','ESTAB','EMP','PAYQ1','PAYAN','FLAG_EMP','FLAG_PAYQ1','FLAG_PAYAN']].groupby(['zcta5'])[['ESTAB','EMP','PAYQ1','PAYAN','FLAG_EMP','FLAG_PAYQ1','FLAG_PAYAN']].sum()
zctaemp.head()

Out[19]:

	ESTAB	EMP	PAYQ1	PAYAN	FLAG_EMP	FLAG_PAYQ1	FLAG_PAYAN
zcta5
10001	8006	163513	3053219	11948301	0	0	0
10002	2962	22737	176288	764720	0	0	0
10003	4273	98742	1830369	6916902	0	0	0
10004	1660	69025	2337956	7349828	0	0	0
10005	1397	48129	3043620	7586070	0	0	0

ZBP Industry Data¶

This data must be requested one record at a time. NOTE that this request can take a LONG TIME, up to one hour to complete for approx 300 ZIP Codes. Once the request is finished the data gets dumped into a json file. If the request is successful but subsequent blocks need to be modified, don't rerun the requests block - pull the data from the json file.

In [20]:

zipcodes=zbpemp2zcta.index.tolist()
len(zipcodes)

Out[20]:

THIS BLOCK IS A REQUESTS BLOCK!¶

Retrieving approx 300 ZIP Codes takes 1 hour

NOTE - revise in the future to retrieve chunks of zip codes

In [21]:

#If this block is successful but there are subsequent problems, do not rerun it - start from the following block.
#For industry data, if there are no records for an industry create a blank record with zeros
n=0
z=0
ind_data=[['estab','naics','zipcode']]
for zcode in zipcodes:
#for zcode in zipcodes[0:5]:
    clear_output(wait=True)
    for naics in ncodes:
        idata_url = f'{base_url}?get={icols}&NAICS2017={naics}&for=zipcode:{zcode}&key={api_key}'
        try:
            response=requests.get(idata_url)
        except requests.exceptions.RequestException as e:
            print (e)
            break
        if response.status_code==200:
            jsondata=response.json()
            ind_data.append(jsondata[1]) 
            n=n+1
        elif response.status_code==204:
            record=['0',naics,zcode]
            ind_data.append(record)
            n=n+1
        else:
            print('Problem retrieving data, status code:',response.status_code)
            break
    z=z+1
    print(n,'records have been retrieved for',z,'ZIP codes...')
print('Done')

with open(ijsonpath, 'w') as f:
    json.dump(ind_data, f)
print('Data dumped to json file')

5586 records have been retrieved for 266 ZIP codes...
Done
Data dumped to json file

In [22]:

with open(ijsonpath, 'r') as f:
    ijsondata=json.load(f)
zbpind = pd.DataFrame(ijsondata[1:],columns=ijsondata[0])
zbpind['estab']=zbpind['estab'].astype('int64')
zbpind.head()

Out[22]:

	estab	naics	zipcode
0	7248	00	10001
1	0	11	10001
2	0	21	10001
3	5	22	10001
4	262	23	10001

In [23]:

#Pivot data to move NAICS to columns
zbpind_tab=zbpind.pivot(index='zipcode', columns='naics', values='estab')
zbpind_tab=zbpind_tab.add_prefix('N')
zbpind_tab.rename(columns=lambda x: x.replace('-', '_'),inplace=True)
zbpind_tab.head()

Out[23]:

naics	N00	N11	N21	N22	N23	N31_33	N42	N44_45	N48_49	N51	...	N53	N54	N55	N56	N61	N62	N71	N72	N81	N99
zipcode
10001	7248	0	0	5	262	188	1079	654	62	418	...	500	1453	153	363	161	405	313	528	506	9
10002	2962	0	0	0	100	60	167	520	46	66	...	274	323	3	88	27	206	100	607	314	3
10003	4268	0	0	0	51	35	94	433	5	195	...	372	672	25	120	75	342	561	658	504	8
10004	1600	0	0	0	34	13	68	65	20	109	...	87	483	12	84	34	85	31	131	134	4
10005	1346	0	0	0	26	6	39	43	11	72	...	78	421	25	74	21	41	30	81	102	3

5 rows × 21 columns

In [24]:

#Create column to summarize businesses that were not disclosed and categorized, but that are included in the total
zbpind_tab['NXX']=zbpind_tab.loc[:,'N00'].subtract(zbpind_tab.loc[:,'N11':'N99'].sum(axis=1))
zbpind_tab.head()

Out[24]:

naics	N00	N11	N21	N22	N23	N31_33	N42	N44_45	N48_49	N51	...	N54	N55	N56	N61	N62	N71	N72	N81	N99	NXX
zipcode
10001	7248	0	0	5	262	188	1079	654	62	418	...	1453	153	363	161	405	313	528	506	9	0
10002	2962	0	0	0	100	60	167	520	46	66	...	323	3	88	27	206	100	607	314	3	2
10003	4268	0	0	0	51	35	94	433	5	195	...	672	25	120	75	342	561	658	504	8	2
10004	1600	0	0	0	34	13	68	65	20	109	...	483	12	84	34	85	31	131	134	4	1
10005	1346	0	0	0	26	6	39	43	11	72	...	421	25	74	21	41	30	81	102	3	0

5 rows × 22 columns

In [25]:

#Join to ZIP ZCTA crosswalk
zbpind2zcta = pd.merge(zip2zcta[['zcta5']],zbpind_tab,how='inner',left_index=True,right_index=True)
zbpind2zcta.index.name = 'ZIP_CODE'
zbpind2zcta.head()

Out[25]:

	zcta5	N00	N11	N21	N22	N23	N31_33	N42	N44_45	N48_49	...	N54	N55	N56	N61	N62	N71	N72	N81	N99	NXX
ZIP_CODE
10001	10001	7248	0	0	5	262	188	1079	654	62	...	1453	153	363	161	405	313	528	506	9	0
10118	10001	326	0	0	0	8	0	45	10	5	...	103	12	17	0	8	29	11	23	0	5
10120	10001	70	0	0	0	0	0	8	8	0	...	18	10	0	0	3	0	0	0	0	11
10122	10001	177	0	0	0	0	0	10	3	0	...	45	0	7	0	5	53	0	10	0	3
10123	10001	180	0	0	0	7	0	14	4	0	...	77	0	13	7	11	3	0	8	0	6

5 rows × 23 columns

In [26]:

zbpind2zcta.shape

Out[26]:

(266, 23)

In [27]:

#Aggregate to ZCTAs
zctaind=zbpind2zcta.groupby(['zcta5']).sum()
zctaind.head()

Out[27]:

	N00	N11	N21	N22	N23	N31_33	N42	N44_45	N48_49	N51	...	N54	N55	N56	N61	N62	N71	N72	N81	N99	NXX
zcta5
10001	8006	0	0	5	277	188	1156	679	67	450	...	1696	175	400	168	432	398	539	547	9	30
10002	2962	0	0	0	100	60	167	520	46	66	...	323	3	88	27	206	100	607	314	3	2
10003	4273	0	0	0	51	35	94	433	5	195	...	672	25	120	75	342	561	658	504	8	7
10004	1660	0	0	0	34	13	68	65	20	126	...	491	12	88	34	85	31	131	137	4	12
10005	1397	0	0	0	26	6	39	43	11	72	...	421	25	74	21	41	30	81	102	3	13

5 rows × 22 columns

In [28]:

#Generate and calculate percent total columns
ncols=list(zctaind)
for c in ncols[1:]:
    pct=c+'_PCT'
    zctaind[pct]=((zctaind[c]/zctaind['N00'])*100).round(2)
zctaind.head()

Out[28]:

	N00	N11	N21	N22	N23	N31_33	N42	N44_45	N48_49	N51	...	N54_PCT	N55_PCT	N56_PCT	N61_PCT	N62_PCT	N71_PCT	N72_PCT	N81_PCT	N99_PCT	NXX_PCT
zcta5
10001	8006	0	0	5	277	188	1156	679	67	450	...	21.18	2.19	5.00	2.10	5.40	4.97	6.73	6.83	0.11	0.37
10002	2962	0	0	0	100	60	167	520	46	66	...	10.90	0.10	2.97	0.91	6.95	3.38	20.49	10.60	0.10	0.07
10003	4273	0	0	0	51	35	94	433	5	195	...	15.73	0.59	2.81	1.76	8.00	13.13	15.40	11.79	0.19	0.16
10004	1660	0	0	0	34	13	68	65	20	126	...	29.58	0.72	5.30	2.05	5.12	1.87	7.89	8.25	0.24	0.72
10005	1397	0	0	0	26	6	39	43	11	72	...	30.14	1.79	5.30	1.50	2.93	2.15	5.80	7.30	0.21	0.93

5 rows × 43 columns

NAICS Codes¶

THIS BLOCK IS A REQUESTS BLOCK!¶

In [29]:

codedict={}
codes_url=f'https://api.census.gov/data/2018/zbp/variables/NAICS2017.json'
response=requests.get(codes_url)
codes_data=response.json()
codedict.update(codes_data['values']['item'])
sectordict=dict((k, codedict[k]) for k in ncodes)
sectordict['XX']='Establishments omitted from classification due to privacy regulations'

In [30]:

codes=pd.DataFrame(list(sectordict.items()), columns=['naics', 'name']).set_index('naics')
codes

Out[30]:

	name
naics
00	Total for all sectors
11	Agriculture, forestry, fishing and hunting
21	Mining, quarrying, and oil and gas extraction
22	Utilities
23	Construction
31-33	Manufacturing
42	Wholesale trade
44-45	Retail trade
48-49	Transportation and warehousing
51	Information
52	Finance and insurance
53	Real estate and rental and leasing
54	Professional, scientific, and technical services
55	Management of companies and enterprises
56	Administrative and support and waste managemen...
61	Educational services
62	Health care and social assistance
71	Arts, entertainment, and recreation
72	Accommodation and food services
81	Other services (except public administration)
99	Industries not classified
XX	Establishments omitted from classification due...

Quality Control Checks¶

In [31]:

#Does sum of industries equal industry total?
indsum=zctaind['N00'].subtract(zctaind.iloc[:,1:22].sum(axis=1))
if indsum.sum()==0:
    print (True)
else:
    print(indsum.loc[indsum != 0])

True

In [32]:

#Is sum of percent totals approximately 100?
ptotal=zctaind.iloc[:,22:].sum(axis=1)
if ptotal.loc[(ptotal <= 99.05) | (ptotal >= 100.05)].empty:
    print(True)
else:
    print(ptotal.loc[(ptotal <= 99.05) | (ptotal >= 100.05)])

True

In [33]:

#Do number of ZCTAs in employment table match the industries table?
ecount=zctaemp.shape[0]
icount=zctaind.shape[0]
if ecount == icount:
    print (True)
else:
    print('Mistmatched count between employment',ecount, 'rows and industry',icount, 'rows')

True

In [34]:

#Does sum of estabslishments from employment table equal establishments in industries table?
estsum=zctaemp['ESTAB'].subtract(zctaind['N00'])
if estsum.sum()==0:
    print (True)
else:
    print(estsum.loc[estsum != 0])

True

Write to Database¶

In [35]:

con = sqlite3.connect(dbname) 
cur = con.cursor()

In [36]:

#Employment table
cur.execute('DROP TABLE IF EXISTS {};'.format(emptable))
qcreate_emptab="""
CREATE TABLE {}(
zcta5 TEXT NOT NULL PRIMARY KEY,
estab INTEGER,
emp INTEGER,
payq1 INTEGER,
payan INTEGER,
flag_emp INTEGER,
flag_payq1 INTEGER,
flag_payan INTEGER);
""".format(emptable)

cur.execute(qcreate_emptab)

Out[36]:

<sqlite3.Cursor at 0x7fac37579570>

In [37]:

#Don't run this block unless you've run the previous one
zctaemp.to_sql(name='{}'.format(emptable), if_exists='append', index=True, con=con)

In [38]:

cur.execute('SELECT COUNT(*) FROM {};'.format(emptable))
rows = cur.fetchone()
print(rows[0], 'records written to', emptable)

212 records written to zbp2018emp

In [39]:

#Replace zeros with nulls, as these values really represent no data
for col in zctaemp.columns[1:]:
    qupdate='UPDATE {} SET {} = NULL WHERE {} = 0;'.format(emptable,col,col)
    cur.execute(qupdate)
    print(cur.rowcount,'records updated for',col)
    con.commit()

0 records updated for EMP
0 records updated for PAYQ1
0 records updated for PAYAN
212 records updated for FLAG_EMP
212 records updated for FLAG_PAYQ1
212 records updated for FLAG_PAYAN

In [40]:

#Industry table
cur.execute('DROP TABLE IF EXISTS {}'.format(indtable))
qcreate_indtab="""
CREATE TABLE {} (
zcta5 TEXT NOT NULL PRIMARY KEY, 
N00 INTEGER, 
N11 INTEGER, 
N21 INTEGER, 
N22 INTEGER, 
N23 INTEGER, 
N31_33 INTEGER, 
N42 INTEGER, 
N44_45 INTEGER, 
N48_49 INTEGER, 
N51 INTEGER, 
N52 INTEGER, 
N53 INTEGER, 
N54 INTEGER, 
N55 INTEGER, 
N56 INTEGER, 
N61 INTEGER, 
N62 INTEGER, 
N71 INTEGER, 
N72 INTEGER, 
N81 INTEGER, 
N99 INTEGER,
NXX INTEGER,
N11_PCT REAL, 
N21_PCT REAL, 
N22_PCT REAL, 
N23_PCT REAL, 
N31_33_PCT REAL, 
N42_PCT REAL, 
N44_45_PCT REAL, 
N48_49_PCT REAL, 
N51_PCT REAL, 
N52_PCT REAL, 
N53_PCT REAL, 
N54_PCT REAL, 
N55_PCT REAL, 
N56_PCT REAL, 
N61_PCT REAL, 
N62_PCT REAL, 
N71_PCT REAL, 
N72_PCT REAL, 
N81_PCT REAL, 
N99_PCT REAL,
NXX_PCT REAL);
""".format(indtable)

cur.execute(qcreate_indtab)

Out[40]:

<sqlite3.Cursor at 0x7fac37579570>

In [41]:

#Don't run this block unless you've run the previous one
zctaind.to_sql(name='{}'.format(indtable), if_exists='append', index=True, con=con)

In [42]:

cur.execute('SELECT COUNT(*) FROM {};'.format(indtable))
rows = cur.fetchone()
print(rows[0], 'records written to', indtable)

212 records written to zbp2018ind

In [43]:

#For percentages, replace zeros with nulls, as these values really represent no data
for col in zctaind.columns[22:]:
    qupdate='UPDATE {} SET {} = NULL WHERE {} = 0.0;'.format(indtable,col,col)
    cur.execute(qupdate)
    print(cur.rowcount,'records updated for',col)
    con.commit()

212 records updated for N11_PCT
211 records updated for N21_PCT
201 records updated for N22_PCT
27 records updated for N23_PCT
58 records updated for N31_33_PCT
26 records updated for N42_PCT
18 records updated for N44_45_PCT
35 records updated for N48_49_PCT
39 records updated for N51_PCT
12 records updated for N52_PCT
10 records updated for N53_PCT
7 records updated for N54_PCT
140 records updated for N55_PCT
12 records updated for N56_PCT
39 records updated for N61_PCT
16 records updated for N62_PCT
46 records updated for N71_PCT
17 records updated for N72_PCT
8 records updated for N81_PCT
158 records updated for N99_PCT
19 records updated for NXX_PCT

In [44]:

#For establishments, replace zeros with nulls unless establishments were omitted from classification
for col in zctaind.columns[1:22]:
    qupdate='UPDATE {} SET {} = NULL WHERE {} = 0 AND NXX !=0;'.format(indtable,col,col)
    cur.execute(qupdate)
    print(cur.rowcount,'records updated for',col)
    con.commit()

193 records updated for N11
192 records updated for N21
184 records updated for N22
27 records updated for N23
58 records updated for N31_33
26 records updated for N42
18 records updated for N44_45
35 records updated for N48_49
39 records updated for N51
12 records updated for N52
10 records updated for N53
7 records updated for N54
132 records updated for N55
12 records updated for N56
39 records updated for N61
16 records updated for N62
46 records updated for N71
17 records updated for N72
8 records updated for N81
147 records updated for N99
0 records updated for NXX

In [45]:

#NAICS code table
cur.execute('DROP TABLE IF EXISTS {};'.format(codetable))
qcreate_codetab="""
CREATE TABLE {}(
naics TEXT NOT NULL PRIMARY KEY,
name TEXT);
""".format(codetable)

cur.execute(qcreate_codetab)

Out[45]:

<sqlite3.Cursor at 0x7fac37579570>

In [46]:

#Don't run this block unless you've run the previous one
codes.to_sql(name='{}'.format(codetable), if_exists='append', index=True, con=con)

In [47]:

cur.execute('SELECT COUNT(*) FROM {};'.format(codetable))
rows = cur.fetchone()
print(rows[0], 'records written to', codetable)

22 records written to zbp2018indcodes

In [48]:

con.close()

In [ ]:

ZBP to ZCTA¶

Variables¶

Crosswalking¶

API Call and Processing¶

ZBP Employment Data¶

*THIS BLOCK IS A REQUESTS BLOCK!*¶

ZBP Industry Data¶

*THIS BLOCK IS A REQUESTS BLOCK!*¶

NAICS Codes¶

*THIS BLOCK IS A REQUESTS BLOCK!*¶

Quality Control Checks¶

Write to Database¶

THIS BLOCK IS A REQUESTS BLOCK!¶

THIS BLOCK IS A REQUESTS BLOCK!¶

THIS BLOCK IS A REQUESTS BLOCK!¶