ZBP to ZCTA

Retrieves data from the Census Bureau's ZIP Code Business Patterns API for a specific area and summarizes it by ZCTA. Three tables are generated: one for employees that contains employment, establishments, and wages, one for industries that contains counts of establishments by 2-digit sector NAICS codes, and one reference table that correlates sector numbers and names. Initial data retrieved from API is written to json, final output is written to a SQLite database.

Confirmed to work with the 2017 ZBP series

https://www.census.gov/data/developers/data-sets/cbp-nonemp-zbp/zbp-api.html

Variables

In [1]:
import pandas as pd, requests, sqlite3, os, json
from IPython.display import clear_output
In [2]:
#Crosswalk files - update only if necessary
uszips_file='Zip_to_ZCTA_crosswalk_2015_JSI.csv'
zcta_file='geocorr14_modified.csv'

uszips_path=os.path.join('inputs',uszips_file)
zcta_path=zcta_file=os.path.join('inputs',zcta_file)

#Dump files for api data storage
ejsonpath=os.path.join('outputs', 'emp_data.json')
ijsonpath=os.path.join('outputs', 'ind_data.json')
cjsonpath=os.path.join('outputs', 'codes_data.json')

#API variables - UPDATE THE YEAR
keyfile='census_key.txt'

year='2017'
dsource='zbp'
state='36'
ecols='ESTAB,EMP,PAYQTR1,PAYANN'
icols='ESTAB'
ncodes=['00','11','21','22','23','31-33','42','44-45','48-49',
        '51','52','53','54','55','56','61','62','71','72','81',
        '99']

#SQL output - UPDATE EACH TABLE NAME
dbname=os.path.join('outputs','testdb.sqlite')
emptable='zbp2017emp'
indtable='zbp2017ind'
codetable='zbp2017indcodes'

Crosswalking

Read in the files that relate US ZIP codes to ZCTAs (from JSI) and ZCTAs to counties for local area (from MCDC Geocorr), then join them by ZIP Code to create a ZCTA to ZIP table for the local area

In [3]:
uszips=pd.read_csv(uszips_path, sep=',', dtype={'ZIP':str, 'ZCTA':str})
uszips.head()
Out[3]:
ZIP PO_NAME STATE ZIP_TYPE ZCTA
0 96916 Merizo GU Post Office or large volume customer 96916
1 96917 Inarajan GU Post Office or large volume customer 96917
2 96928 Agat GU Post Office or large volume customer 96928
3 96915 Santa Rita GU ZIP Code area 96915
4 96923 Mangilao GU Post Office or large volume customer 96913
In [4]:
#All ZIP Codes in US
uszips.shape
Out[4]:
(41270, 5)
In [5]:
zcta=pd.read_csv(zcta_path, sep=',', dtype={'zcta5':str, 'county14':str})
zcta.head()
Out[5]:
zcta5 county14 cntyname2 zipname pop10 afact
0 10001 36061 New York NY New York, NY 21102 1.0
1 10002 36061 New York NY New York, NY 81410 1.0
2 10003 36061 New York NY New York, NY 56024 1.0
3 10004 36061 New York NY New York, NY 3089 1.0
4 10005 36061 New York NY New York, NY 7135 1.0
In [6]:
#ZCTAs in local area
zcta.shape
Out[6]:
(214, 6)
In [7]:
#Merge ZIP Codes with ZCTAs for local area
zip2zcta = pd.merge(uszips[['ZIP','ZIP_TYPE','PO_NAME','ZCTA']],zcta[['zcta5','county14']],how='right', 
                    left_on='ZCTA', right_on='zcta5').set_index('ZIP')
zip2zcta.drop(columns=['ZCTA'],inplace=True)
zip2zcta.head()
Out[7]:
ZIP_TYPE PO_NAME zcta5 county14
ZIP
10307 ZIP Code area Staten Island 10307 36085
10309 ZIP Code area Staten Island 10309 36085
10312 ZIP Code area Staten Island 10312 36085
10308 ZIP Code area Staten Island 10308 36085
11697 ZIP Code area Breezy Point 11697 36081
In [8]:
#ZIP Codes in local area
zip2zcta.shape
Out[8]:
(317, 4)

API Call and Processing

Request the ZBP data from the Census Bureau for the state, join the ZBP data to the local ZCTA to ZIP table based on ZIP Code, and group the data by ZCTA

In [9]:
with open(keyfile) as key:
    api_key=key.read().strip()
In [10]:
base_url = f'https://api.census.gov/data/{year}/{dsource}'
base_url
Out[10]:
'https://api.census.gov/data/2017/zbp'

ZBP Employment Data

This data is requested in a series of chunks which contain multiple ZIP Codes - do not rerun the requests block if retrieval is successful but subsequent notebook changes are needed. Proceed to the next block and pull data from json dump file.

In [11]:
def chunks(l, n):
    # For item i in a range that is a length of l,
    for i in range(0, len(l), n):
        # Create an index range for l of n items:
        yield l[i:i+n]
In [12]:
reqzips=list(chunks(zip2zcta.index.tolist(),48))
print('Number of chunks:',len(reqzips))
Number of chunks: 7

THIS BLOCK IS A REQUESTS BLOCK!

In [13]:
#Code 200 = success, do not rerun this block unless it's necessary
emp_data=[]
for i, v in enumerate (reqzips):
    batchzips=','.join(v)
    edata_url = f'{base_url}?get={ecols}&for=zipcode:{batchzips}&key={api_key}'
    response=requests.get(edata_url)
    if response.status_code==200:
        clear_output(wait=True)
        data=response.json()
        if i == 0:    
            for record in data:
                emp_data.append(record)
        else:
            for record in data[1:]:
                emp_data.append(record) 
        print('Retrieved data for chunk',i)
    else:
        print('***Problem with retrieval***, response code',response.status_code)
        break
with open(ejsonpath, 'w') as f:
    json.dump(emp_data, f)
print('Done - Data dumped to json file')
Retrieved data for chunk 6
Done - Data dumped to json file
In [14]:
with open(ejsonpath, 'r') as f:
    ejsondata=json.load(f)
zbpemp=pd.DataFrame(ejsondata[1:], columns=ejsondata[0]).rename(columns={'PAYQTR1':'PAYQ1','PAYANN':'PAYAN','zip code':'zipcode'}).set_index('zipcode')
for field in zbpemp.columns:
    zbpemp=zbpemp.astype(dtype={field:'int64'},inplace=True)
zbpemp.head()
Out[14]:
ESTAB EMP PAYQ1 PAYAN
zipcode
11203 898 17033 237582 981590
11204 2324 14238 100430 440325
11208 907 8919 81093 343735
11209 1990 15154 153475 635170
11210 1092 10223 81784 343931
In [15]:
#ZIP Codes retrieved - may differ from zip2zcta as some zips have no businesses
zbpemp.shape
Out[15]:
(266, 4)
In [16]:
#Flag columns count the number of establishments for which data is not disclosed
flags=['FLAG_EMP','FLAG_PAYQ1','FLAG_PAYAN']
for flagcol in flags:
    datacol=flagcol.split('_')[1]
    zbpemp[flagcol]=0
    zbpemp.loc[zbpemp[datacol] == 0, flagcol] = zbpemp['ESTAB']
zbpemp.head()
Out[16]:
ESTAB EMP PAYQ1 PAYAN FLAG_EMP FLAG_PAYQ1 FLAG_PAYAN
zipcode
11203 898 17033 237582 981590 0 0 0
11204 2324 14238 100430 440325 0 0 0
11208 907 8919 81093 343735 0 0 0
11209 1990 15154 153475 635170 0 0 0
11210 1092 10223 81784 343931 0 0 0
In [17]:
#Join to ZIP ZCTA crosswalk
zbpemp2zcta = pd.merge(zip2zcta,zbpemp,how='inner',left_index=True,right_index=True)
zbpemp2zcta.index.name = 'ZIP'
zbpemp2zcta.head()
Out[17]:
ZIP_TYPE PO_NAME zcta5 county14 ESTAB EMP PAYQ1 PAYAN FLAG_EMP FLAG_PAYQ1 FLAG_PAYAN
ZIP
10307 ZIP Code area Staten Island 10307 36085 290 1537 13556 58738 0 0 0
10309 ZIP Code area Staten Island 10309 36085 844 10497 100513 433544 0 0 0
10312 ZIP Code area Staten Island 10312 36085 931 7197 64050 263547 0 0 0
10308 ZIP Code area Staten Island 10308 36085 456 3021 21831 91289 0 0 0
11697 ZIP Code area Breezy Point 11697 36081 52 357 3910 19622 0 0 0
In [18]:
#ZIP codes in the local area that appear in the ZBP data
zbpemp2zcta.shape
Out[18]:
(266, 11)
In [19]:
#Aggregate to ZCTAs
zctaemp=zbpemp2zcta[['zcta5','ESTAB','EMP','PAYQ1','PAYAN','FLAG_EMP','FLAG_PAYQ1','FLAG_PAYAN']].groupby(['zcta5'])[['ESTAB','EMP','PAYQ1','PAYAN','FLAG_EMP','FLAG_PAYQ1','FLAG_PAYAN']].sum()
zctaemp.head()
Out[19]:
ESTAB EMP PAYQ1 PAYAN FLAG_EMP FLAG_PAYQ1 FLAG_PAYAN
zcta5
10001 8036 165280 2996376 11678740 0 0 0
10002 3005 22471 174234 752302 0 0 0
10003 4276 98483 1719510 6588260 0 0 0
10004 1663 66575 2101631 6854881 4 4 4
10005 1413 48968 2035298 6301744 43 43 43

ZBP Industry Data

This data must be requested one record at a time. NOTE that this request can take a LONG TIME, up to one hour to complete for approx 300 ZIP Codes. Once the request is finished the data gets dumped into a json file. If the request is successful but subsequent blocks need to be modified, don't rerun the requests block - pull the data from the json file.

In [20]:
zipcodes=zbpemp2zcta.index.tolist()
len(zipcodes)
Out[20]:
266

THIS BLOCK IS A REQUESTS BLOCK!

Retrieving approx 300 ZIP Codes takes 1 hour

NOTE - revise in the future to retrieve chunks of zip codes

In [ ]:
#If this block is successful but there are subsequent problems, do not rerun it - start from the following block.
#For industry data, if there are no records for an industry create a blank record with zeros
n=0
z=0
ind_data=[['estab','naics','zipcode']]
for zcode in zipcodes:
#for zcode in zipcodes[0:5]:
    clear_output(wait=True)
    for naics in ncodes:
        idata_url = f'{base_url}?get={icols}&NAICS2017={naics}&for=zipcode:{zcode}&key={api_key}'
        try:
            response=requests.get(idata_url)
        except requests.exceptions.RequestException as e:
            print (e)
            break
        if response.status_code==200:
            jsondata=response.json()
            ind_data.append(jsondata[1]) 
            n=n+1
        elif response.status_code==204:
            record=['0',naics,zcode]
            ind_data.append(record)
            n=n+1
        else:
            print('Problem retrieving data, status code:',response.status_code)
            break
    z=z+1
    print(n,'records have been retrieved for',z,'ZIP codes...')
print('Done')

with open(ijsonpath, 'w') as f:
    json.dump(ind_data, f)
print('Data dumped to json file')
In [21]:
with open(ijsonpath, 'r') as f:
    ijsondata=json.load(f)
zbpind = pd.DataFrame(ijsondata[1:],columns=ijsondata[0])
zbpind['estab']=zbpind['estab'].astype('int64', inplace=True)
zbpind.head()
Out[21]:
estab naics zipcode
0 290 00 10307
1 0 11 10307
2 0 21 10307
3 0 22 10307
4 62 23 10307
In [22]:
#Pivot data to move NAICS to columns
zbpind_tab=zbpind.pivot(index='zipcode', columns='naics', values='estab')
zbpind_tab=zbpind_tab.add_prefix('N')
zbpind_tab.rename(columns=lambda x: x.replace('-', '_'),inplace=True)
zbpind_tab.head()
Out[22]:
naics N00 N11 N21 N22 N23 N31_33 N42 N44_45 N48_49 N51 ... N53 N54 N55 N56 N61 N62 N71 N72 N81 N99
zipcode
10001 7277 0 0 5 258 204 1146 682 68 400 ... 479 1462 150 375 155 384 282 521 514 0
10002 3005 0 0 0 103 68 177 546 50 68 ... 273 301 0 84 31 204 101 609 326 0
10003 4272 0 0 0 50 35 100 430 6 197 ... 369 694 25 122 78 350 527 660 507 5
10004 1606 0 0 0 28 13 71 65 22 117 ... 85 483 11 87 35 93 32 131 126 4
10005 1370 0 0 0 20 5 42 44 11 68 ... 77 434 22 79 23 42 29 81 100 0

5 rows × 21 columns

In [23]:
#Create column to summarize businesses that were not disclosed and categorized, but that are included in the total
zbpind_tab['NXX']=zbpind_tab.loc[:,'N00'].subtract(zbpind_tab.loc[:,'N11':'N99'].sum(axis=1))
zbpind_tab.head()
Out[23]:
naics N00 N11 N21 N22 N23 N31_33 N42 N44_45 N48_49 N51 ... N54 N55 N56 N61 N62 N71 N72 N81 N99 NXX
zipcode
10001 7277 0 0 5 258 204 1146 682 68 400 ... 1462 150 375 155 384 282 521 514 0 2
10002 3005 0 0 0 103 68 177 546 50 68 ... 301 0 84 31 204 101 609 326 0 5
10003 4272 0 0 0 50 35 100 430 6 197 ... 694 25 122 78 350 527 660 507 5 1
10004 1606 0 0 0 28 13 71 65 22 117 ... 483 11 87 35 93 32 131 126 4 1
10005 1370 0 0 0 20 5 42 44 11 68 ... 434 22 79 23 42 29 81 100 0 3

5 rows × 22 columns

In [24]:
#Join to ZIP ZCTA crosswalk
zbpind2zcta = pd.merge(zip2zcta[['zcta5']],zbpind_tab,how='inner',left_index=True,right_index=True)
zbpind2zcta.index.name = 'ZIP'
zbpind2zcta.head()
Out[24]:
zcta5 N00 N11 N21 N22 N23 N31_33 N42 N44_45 N48_49 ... N54 N55 N56 N61 N62 N71 N72 N81 N99 NXX
ZIP
10307 10307 290 0 0 0 62 5 13 33 14 ... 19 0 26 5 18 7 32 34 0 1
10309 10309 844 0 0 0 147 17 45 108 26 ... 77 0 46 18 90 24 75 83 0 1
10312 10312 931 0 0 0 154 9 39 116 29 ... 101 0 55 22 152 12 84 95 0 1
10308 10308 456 0 0 0 66 3 13 57 14 ... 51 3 27 7 60 8 52 55 0 2
11697 11697 52 0 0 0 11 0 0 5 0 ... 6 0 4 0 5 5 6 3 0 4

5 rows × 23 columns

In [25]:
zbpind2zcta.shape
Out[25]:
(266, 23)
In [26]:
#Aggregate to ZCTAs
zctaind=zbpind2zcta.groupby(['zcta5']).sum()
zctaind.head()
Out[26]:
N00 N11 N21 N22 N23 N31_33 N42 N44_45 N48_49 N51 ... N54 N55 N56 N61 N62 N71 N72 N81 N99 NXX
zcta5
10001 8036 0 0 5 272 208 1228 707 73 435 ... 1713 173 417 160 406 355 532 553 0 32
10002 3005 0 0 0 103 68 177 546 50 68 ... 301 0 84 31 204 101 609 326 0 5
10003 4276 0 0 0 50 35 100 430 6 197 ... 694 25 122 78 350 527 660 507 5 5
10004 1663 0 0 0 28 13 71 65 22 132 ... 491 11 90 35 93 32 131 126 4 13
10005 1413 0 0 0 20 5 42 44 11 68 ... 434 22 79 23 42 29 81 100 0 15

5 rows × 22 columns

In [27]:
#Generate and calculate percent total columns
ncols=list(zctaind)
for c in ncols[1:]:
    pct=c+'_PCT'
    zctaind[pct]=((zctaind[c]/zctaind['N00'])*100).round(2)
zctaind.head()
Out[27]:
N00 N11 N21 N22 N23 N31_33 N42 N44_45 N48_49 N51 ... N54_PCT N55_PCT N56_PCT N61_PCT N62_PCT N71_PCT N72_PCT N81_PCT N99_PCT NXX_PCT
zcta5
10001 8036 0 0 5 272 208 1228 707 73 435 ... 21.32 2.15 5.19 1.99 5.05 4.42 6.62 6.88 0.00 0.40
10002 3005 0 0 0 103 68 177 546 50 68 ... 10.02 0.00 2.80 1.03 6.79 3.36 20.27 10.85 0.00 0.17
10003 4276 0 0 0 50 35 100 430 6 197 ... 16.23 0.58 2.85 1.82 8.19 12.32 15.43 11.86 0.12 0.12
10004 1663 0 0 0 28 13 71 65 22 132 ... 29.52 0.66 5.41 2.10 5.59 1.92 7.88 7.58 0.24 0.78
10005 1413 0 0 0 20 5 42 44 11 68 ... 30.71 1.56 5.59 1.63 2.97 2.05 5.73 7.08 0.00 1.06

5 rows × 43 columns

NAICS Codes

THIS BLOCK IS A REQUESTS BLOCK!

In [28]:
codedict={}
codes_url=f'https://api.census.gov/data/2017/zbp/variables/NAICS2017.json'
response=requests.get(codes_url)
codes_data=response.json()
codedict.update(codes_data['values']['item'])
sectordict=dict((k, codedict[k]) for k in ncodes)
sectordict['XX']='Establishments omitted from classification due to privacy regulations'
In [29]:
codes=pd.DataFrame(list(sectordict.items()), columns=['naics', 'name']).set_index('naics')
codes
Out[29]:
name
naics
00 Total for all sectors
11 Agriculture, forestry, fishing and hunting
21 Mining, quarrying, and oil and gas extraction
22 Utilities
23 Construction
31-33 Manufacturing
42 Wholesale trade
44-45 Retail trade
48-49 Transportation and warehousing
51 Information
52 Finance and insurance
53 Real estate and rental and leasing
54 Professional, scientific, and technical services
55 Management of companies and enterprises
56 Administrative and support and waste managemen...
61 Educational services
62 Health care and social assistance
71 Arts, entertainment, and recreation
72 Accommodation and food services
81 Other services (except public administration)
99 Industries not classified
XX Establishments omitted from classification due...

Quality Control Checks

In [30]:
#Does sum of industries equal industry total?
indsum=zctaind['N00'].subtract(zctaind.iloc[:,1:22].sum(axis=1))
if indsum.sum()==0:
    print (True)
else:
    print(indsum.loc[indsum != 0])
True
In [31]:
#Is sum of percent totals approximately 100?
ptotal=zctaind.iloc[:,22:].sum(axis=1)
if ptotal.loc[(ptotal <= 99.05) | (ptotal >= 100.05)].empty:
    print(True)
else:
    print(ptotal.loc[(ptotal <= 99.05) | (ptotal >= 100.05)])
True
In [32]:
#Do number of ZCTAs in employment table match the industries table?
ecount=zctaemp.shape[0]
icount=zctaind.shape[0]
if ecount == icount:
    print (True)
else:
    print('Mistmatched count between employment',ecount, 'rows and industry',icount, 'rows')
True
In [33]:
#Does sum of estabslishments from employment table equal establishments in industries table?
estsum=zctaemp['ESTAB'].subtract(zctaind['N00'])
if estsum.sum()==0:
    print (True)
else:
    print(estsum.loc[estsum != 0])
True

Write to Database

In [34]:
con = sqlite3.connect(dbname) 
cur = con.cursor()
In [35]:
#Employment table
cur.execute('DROP TABLE IF EXISTS {};'.format(emptable))
qcreate_emptab="""
CREATE TABLE {}(
zcta5 TEXT NOT NULL PRIMARY KEY,
estab INTEGER,
emp INTEGER,
payq1 INTEGER,
payan INTEGER,
flag_emp INTEGER,
flag_payq1 INTEGER,
flag_payan INTEGER);
""".format(emptable)

cur.execute(qcreate_emptab)
Out[35]:
<sqlite3.Cursor at 0x199d8942810>
In [36]:
#Don't run this block unless you've run the previous one
zctaemp.to_sql(name='{}'.format(emptable), if_exists='append', index=True, con=con)
In [37]:
cur.execute('SELECT COUNT(*) FROM {};'.format(emptable))
rows = cur.fetchone()
print(rows[0], 'records written to', emptable)
213 records written to zbp2017emp
In [38]:
#Replace zeros with nulls, as these values really represent no data
for col in zctaemp.columns[1:]:
    qupdate='UPDATE {} SET {} = NULL WHERE {} = 0;'.format(emptable,col,col)
    cur.execute(qupdate)
    print(cur.rowcount,'records updated for',col)
    con.commit()
2 records updated for EMP
2 records updated for PAYQ1
2 records updated for PAYAN
201 records updated for FLAG_EMP
201 records updated for FLAG_PAYQ1
201 records updated for FLAG_PAYAN
In [39]:
#Industry table
cur.execute('DROP TABLE IF EXISTS {}'.format(indtable))
qcreate_indtab="""
CREATE TABLE {} (
zcta5 TEXT NOT NULL PRIMARY KEY, 
N00 INTEGER, 
N11 INTEGER, 
N21 INTEGER, 
N22 INTEGER, 
N23 INTEGER, 
N31_33 INTEGER, 
N42 INTEGER, 
N44_45 INTEGER, 
N48_49 INTEGER, 
N51 INTEGER, 
N52 INTEGER, 
N53 INTEGER, 
N54 INTEGER, 
N55 INTEGER, 
N56 INTEGER, 
N61 INTEGER, 
N62 INTEGER, 
N71 INTEGER, 
N72 INTEGER, 
N81 INTEGER, 
N99 INTEGER,
NXX INTEGER,
N11_PCT REAL, 
N21_PCT REAL, 
N22_PCT REAL, 
N23_PCT REAL, 
N31_33_PCT REAL, 
N42_PCT REAL, 
N44_45_PCT REAL, 
N48_49_PCT REAL, 
N51_PCT REAL, 
N52_PCT REAL, 
N53_PCT REAL, 
N54_PCT REAL, 
N55_PCT REAL, 
N56_PCT REAL, 
N61_PCT REAL, 
N62_PCT REAL, 
N71_PCT REAL, 
N72_PCT REAL, 
N81_PCT REAL, 
N99_PCT REAL,
NXX_PCT REAL);
""".format(indtable)

cur.execute(qcreate_indtab)
Out[39]:
<sqlite3.Cursor at 0x199d8942810>
In [40]:
#Don't run this block unless you've run the previous one
zctaind.to_sql(name='{}'.format(indtable), if_exists='append', index=True, con=con)
In [41]:
cur.execute('SELECT COUNT(*) FROM {};'.format(indtable))
rows = cur.fetchone()
print(rows[0], 'records written to', indtable)
213 records written to zbp2017ind
In [42]:
#For percentages, replace zeros with nulls, as these values really represent no data
for col in zctaind.columns[22:]:
    qupdate='UPDATE {} SET {} = NULL WHERE {} = 0.0;'.format(indtable,col,col)
    cur.execute(qupdate)
    print(cur.rowcount,'records updated for',col)
    con.commit()
212 records updated for N11_PCT
212 records updated for N21_PCT
200 records updated for N22_PCT
28 records updated for N23_PCT
54 records updated for N31_33_PCT
25 records updated for N42_PCT
20 records updated for N44_45_PCT
39 records updated for N48_49_PCT
38 records updated for N51_PCT
11 records updated for N52_PCT
11 records updated for N53_PCT
8 records updated for N54_PCT
147 records updated for N55_PCT
14 records updated for N56_PCT
41 records updated for N61_PCT
16 records updated for N62_PCT
49 records updated for N71_PCT
17 records updated for N72_PCT
9 records updated for N81_PCT
173 records updated for N99_PCT
22 records updated for NXX_PCT
In [43]:
#For establishments, replace zeros with nulls unless establishments were omitted from classification
for col in zctaind.columns[1:22]:
    qupdate='UPDATE {} SET {} = NULL WHERE {} = 0 AND NXX !=0;'.format(indtable,col,col)
    cur.execute(qupdate)
    print(cur.rowcount,'records updated for',col)
    con.commit()
190 records updated for N11
190 records updated for N21
179 records updated for N22
28 records updated for N23
54 records updated for N31_33
25 records updated for N42
20 records updated for N44_45
39 records updated for N48_49
38 records updated for N51
11 records updated for N52
11 records updated for N53
8 records updated for N54
135 records updated for N55
14 records updated for N56
41 records updated for N61
16 records updated for N62
49 records updated for N71
17 records updated for N72
9 records updated for N81
158 records updated for N99
0 records updated for NXX
In [44]:
#NAICS code table
cur.execute('DROP TABLE IF EXISTS {};'.format(codetable))
qcreate_codetab="""
CREATE TABLE {}(
naics TEXT NOT NULL PRIMARY KEY,
name TEXT);
""".format(codetable)

cur.execute(qcreate_codetab)
Out[44]:
<sqlite3.Cursor at 0x199d8942810>
In [45]:
#Don't run this block unless you've run the previous one
codes.to_sql(name='{}'.format(codetable), if_exists='append', index=True, con=con)
In [46]:
cur.execute('SELECT COUNT(*) FROM {};'.format(codetable))
rows = cur.fetchone()
print(rows[0], 'records written to', codetable)
22 records written to zbp2017indcodes
In [47]:
con.close()
In [ ]: