Demo Python Wrapper for nomis API

An example Python wrapper for the nomis API.

Something I started exploring some time ago was the ability to generate textual reports around monthly JSA figures. At the time, I found it really tricky to navigate the nomis API efficiently, for example when trying to generate URLs like the following, which gets the total JSA claimant count for the Isle of Wight, broken down by sex:

In [457]:
import pandas as pd

baseURL='http://www.nomisweb.co.uk/api/v01/dataset/NM_1_1.data.csv?'
url=baseURL+'geography=2038431803&sex=5,6,7&item=1&measures=20100'
#Projection
url+='&select=sex_name,geography_name,measures_name,date_code,date_name,obs_value'
tmp=pd.read_csv(url)
tmp[:9]
Out[457]:
SEX_NAME GEOGRAPHY_NAME MEASURES_NAME DATE_CODE DATE_NAME OBS_VALUE
0 Male Isle of Wight Persons claiming JSA 1983-06 June 1983 3504
1 Female Isle of Wight Persons claiming JSA 1983-06 June 1983 1336
2 Total Isle of Wight Persons claiming JSA 1983-06 June 1983 4840
3 Male Isle of Wight Persons claiming JSA 1983-07 July 1983 3458
4 Female Isle of Wight Persons claiming JSA 1983-07 July 1983 1302
5 Total Isle of Wight Persons claiming JSA 1983-07 July 1983 4760
6 Male Isle of Wight Persons claiming JSA 1983-08 August 1983 3409
7 Female Isle of Wight Persons claiming JSA 1983-08 August 1983 1264
8 Total Isle of Wight Persons claiming JSA 1983-08 August 1983 4673
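
As an aside, the same query string can be assembled from a dictionary of parameters rather than by concatenating strings by hand. This is a minimal sketch only: it assumes Python 3, the helper name build_nomis_url is hypothetical (it is not part of the wrapper below), and no request is actually made.

```python
from urllib.parse import urlencode

BASE_URL = 'http://www.nomisweb.co.uk/api/v01/dataset/NM_1_1.data.csv'

def build_nomis_url(base, **params):
    """Join a base URL and query parameters (hypothetical helper)."""
    # safe=',' leaves comma separated lists such as sex=5,6,7 unescaped
    return '{}?{}'.format(base, urlencode(params, safe=','))

url = build_nomis_url(
    BASE_URL,
    geography=2038431803,
    sex='5,6,7',
    item=1,
    measures=20100,
    select='sex_name,geography_name,measures_name,date_code,date_name,obs_value',
)
print(url)
```

Keeping the parameters in a dictionary makes it easier to add or drop a dimension without worrying about where the & separators go.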

So I've started working on the following to try to make it easier to work with this API in code...

The idea is that we should be able to have a conversation with the API to generate URLs that will bring back desired datasets, or slices/filtered versions of particular datasets, and then retrieve those datasets into a pandas dataframe so we can start to work with them.

In [458]:
import pandas as pd
#urlencode lives in different places in Python 2 and Python 3
try:
    from urllib.parse import urlencode  #Python 3
except ImportError:
    from urllib import urlencode  #Python 2
import re

class NOMIS_CONFIG:
    #TO DO implement cache to cache list of datasets and dimensions associated with datasets (except time/date?)
    
    def __init__(self):
        NOMIS_STUB='https://www.nomisweb.co.uk/api/v01/dataset/'
        
        self.url=NOMIS_STUB
        self.codes=None
        self.metadata={}

    def _url_encode(self,params=None):
        #Return a '?key=value&...' query string, or an empty string if there are no parameters
        if params is not None and params!='' and params != {}:
            #params='?{}'.format( '&'.join( ['{}={}'.format(p,params[p]) for p in params] ) )
            params='?{}'.format(urlencode(params))
        else:
            params=''
        return params


    def _describe_dataset(self,df):
        for row in df.iterrows():
            dfr=row[1]
            print('{idx} - {name}: {description}\n'.format(idx=dfr['idx'],
                                                         name=dfr['name'],
                                                         description=dfr['description']) )
                                                                               
    def _describe_metadata(self,idx,df,keys,pretty=True):
        if not pretty:
            for key in keys:
                print( '---- {} ----'.format(key) )
                for row in df[key].iterrows():
                    dfr=row[1]
                    print('{dimension} - {description}: {value}'.format(dimension=dfr['dimension'],
                                                                 description=dfr['description'],
                                                                 value=dfr['value']) )
        else:
            print('The following dimensions are available for {idx} ({name}):\n'.format(
                    idx=idx, 
                    name=self.dataset_lookup_property(idx,'name')))
            for key in keys:
                items =['{} ({})'.format(row[1]['description'],row[1]['value']) for row in df[key].iterrows()]
                print( ' - {key}: {items}'.format(key=key,items=', '.join(items)) )
            
    def help_url(self,idx='NM_7_1'):
        metadata=self.nomis_code_metadata(idx)
        #Build a list of dimension names, leaving out the 'core' entry
        keys=[k for k in metadata if k!='core']
        print('Dataset {idx} ({name}) supports the following dimensions: {dims}.'.format(
                idx=idx,
                dims=', '.join(keys),
                name=self.dataset_lookup_property(idx,'name')))

    def dataset_lookup_property(self,idx=None,prop=None):
        if idx is None or prop is None: return ''
        df=self.dataset_lookup(idx)

        if prop in df.columns: return str(df[prop][0])
        else: return ''
        
    def dataset_lookup(self,idx=None,dimensions=False,describe=False):
        ##dimensions used in sense of do we display them or not
        if self.codes is None:
            self.codes=self.nomis_codes_datasets(dimensions=True)
        
        if idx is not None:
            #Test if idx is a list or single string
            if isinstance(idx, str): idx=[idx]
            df=self.codes[self.codes['idx'].isin(idx)]
        else:
            df=self.codes[:]
        
        cols=df.columns.tolist() 
        if not dimensions:
            for col in ['dimension','concept']:
                cols.remove(col)
        df=df[cols].drop_duplicates().reset_index(drop=True)
        if describe: self._describe_dataset(df)
        else: return df
        
    def _get_geo_from_postcode(self, postcode, areacode=None):
        #Set a default
        if areacode is None:
            areacode='district'
            
        codemap={ 'district':486 }

        if areacode in codemap:
            areacode=codemap[areacode]
        
        return 'POSTCODE|{postcode};{code}'.format(postcode=postcode,code=areacode)

    
    def _dimension_mapper(self,idx,dim,dims):
        ''' dims is a string of comma separated values for a particular dimension '''       
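        if dim is not None and dims is not None:
            sc=self._nomis_codes_dimension_grab(dim,idx,params=None)
            dimmap=dict(zip(sc['description'].astype(str),sc['value']))
            #Replace longer descriptions first so that, eg, 'Female' is handled before 'Male'
            for s in sorted(dimmap, key=len, reverse=True):
                #Escape the description so that characters such as ( or . are matched literally
                pattern = re.compile(re.escape(s), re.IGNORECASE)
                dims=pattern.sub(str(dimmap[s]), str(dims))
        return dims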
        
    def _sex_map(self,idx,sex):
        return self._dimension_mapper(idx,'sex',sex)
                
    def _get_geo_code_helper(self,helper):
        value=None
        desc=None

        #I am baking values in, but maybe they should be searched for and retrieved that way?
        if helper=='UK_WPC_2010':
            #UK Westminster Parliamentary Constituency
            value='2092957697TYPE460'
        elif helper=='LA_district':
            value='2092957697TYPE464'

        return value,desc

    def get_geo_code(self,value=None,desc=None, search=None, helper=None, chase=False):
        #The semantics of this are quite tricky
        #value is a code for a geography, the thing searched within
        #desc identifies a description within a geography - on a match it takes you to this lower geography
        #search is term to search (free text search) with the descriptions of areas returned
        #helper is in place for shortcuts

        #Given a local authority code, eg 1946157281, a report can be previewed at:
        ##https://www.nomisweb.co.uk/reports/lmp/la/1946157281/report.aspx
        #default
        if helper is not None:
            value,desc=self._get_geo_code_helper(helper)
        if chase:
            chaser= self.nomis_codes_geog(geography=value)
            if search is not None:
                chasecands=chaser[ chaser['description'].str.contains(search) ][['description','value']].values
            else:
                chasecands=chaser[['description','value']].values
            locs=[]
            for chasecand in chasecands:
                locs.append(chasecand[1])
            if len(locs): value=','.join(map(str,locs))

        geog=self.nomis_codes_geog(geography=value)
        if desc is not None:
            candidates=geog[['description','value']].values
            for candidate in candidates:
                if candidate[0]==desc:
                    geog=self.nomis_codes_geog(geography=candidate[1])

        if search is not None:
            retval=geog[ geog['description'].str.contains(search) ][['description','value']].values
        else:
            retval=geog[['description','value']].values

        return pd.DataFrame(retval,columns=['description','geog'])

    def _get_datasets(self,search=None):
        url='http://www.nomisweb.co.uk/api/v01/dataset/def.sdmx.json'
        if search is not None:
            url='{url}{params}'.format(url=url,params=self._url_encode({'search':search}))
        data=pd.read_json(url)
        return data

    def nomis_code_metadata(self,idx='NM_1_1',describe=None):
        if idx in self.metadata:
            metadata=self.metadata[idx]
        else:
            core=self.dataset_lookup(idx,dimensions=True)
            metadata={'core':core}
            for dim in core['concept'].str.lower():
                metadata[dim]=self._nomis_codes_dimension_grab(dim,idx,params=None)
        self.metadata[idx]=metadata       
        if describe=='all':
            #Build a list of dimension names, leaving out the 'core' entry
            keys=[k for k in metadata if k!='core']
            self._describe_metadata(idx,metadata,keys)
        elif isinstance(describe, str) and describe in metadata.keys():
            self._describe_metadata(idx,metadata,[describe])
        elif isinstance(describe, list):
            self._describe_metadata(idx,metadata,describe)
        else:
            return metadata
        
        
    def nomis_codes_datasets(self,search=None,dimensions=False):
        #TO DO - by default, use local dataset list and search in specified cols;
        #  add additional parameter to force a search on API
        
        df=self._get_datasets(search)

        keyfamilies=df.loc['keyfamilies']['structure']
        if keyfamilies is None: return pd.DataFrame()
        
        datasets=[]
        for keyfamily in keyfamilies['keyfamily']:
            kf={'agency':keyfamily['agencyid'],
                'idx':keyfamily['id'],
                'name':keyfamily['name']['value'],
                'description': keyfamily['description']['value'] if 'description' in keyfamily else ''
                #'dimensions':[dimensions['codelist'] for dimensions in keyfamily['components']['dimension']]
            }

            if dimensions:
                for _dimensions in keyfamily['components']['dimension']:
                    kf['dimension']= _dimensions['codelist']
                    kf['concept']= _dimensions['conceptref']
                    datasets.append(kf.copy())
            else:
                datasets.append(kf.copy())
                
        return pd.DataFrame(datasets)

    def _nomis_codes_parser(self,url):
        jdata=pd.read_json(url)
        cl=jdata.loc['codelists']['structure']
        if cl is None: return pd.DataFrame()
        
        codes_data=[]
        for codelist in cl['codelist']:
            code_data={'agencyid':codelist['agencyid'],
                       'dataset':jdata.loc['header']['structure']['id'],
                       'dimension':codelist['id'],
                       'name':codelist['name']['value']
                      }
            for code in codelist['code']:
                code_data['description']=code['description']['value']
                code_data['value']=code['value']
                codes_data.append(code_data.copy())
        return pd.DataFrame(codes_data)

    #Generic minimal constructor
    def _nomis_codes_url_constructor(self,dim,idx,params=None):
        #This doesn't cope properly with geography, which can insert an element into the path?
        return '{nomis}{idx}/{dim}.def.sdmx.json{params}'.format(nomis=self.url,
                                                                 idx=idx,
                                                                 dim=dim.lower(),
                                                                 params=self._url_encode(params))
    def _nomis_codes_dimension_grab(self,dim,idx,params=None):
        #Pass any supplied parameters through rather than discarding them
        url=self._nomis_codes_url_constructor(dim,idx,params=params)
        return self._nomis_codes_parser(url)
    
    #Set up shorthand functions to call particular dimensions
    #Select appropriate datasets as default to demo the call
    def nomis_codes_measures(self,idx='NM_1_1'):
        url=self._nomis_codes_url_constructor('measures',idx)
        return self._nomis_codes_parser(url)
 
    def nomis_codes_time(self,idx='NM_1_1'):
        url=self._nomis_codes_url_constructor('time',idx)
        return self._nomis_codes_parser(url)

    def nomis_codes_industry(self,idx='NM_21_1'):
        url=self._nomis_codes_url_constructor('industry',idx)
        return self._nomis_codes_parser(url)
    
    def nomis_codes_freq(self,idx='NM_1_1'):
        url=self._nomis_codes_url_constructor('freq',idx)
        return self._nomis_codes_parser(url)

    def nomis_codes_age_dur(self,idx='NM_7_1'):
        url=self._nomis_codes_url_constructor('age_dur',idx)
        return self._nomis_codes_parser(url)

    def nomis_codes_ethnicity(self,idx='NM_118_1'):
        url=self._nomis_codes_url_constructor('ethnicity',idx)
        return self._nomis_codes_parser(url)
    
    def nomis_codes_occupation(self,idx='NM_7_1'):
        url=self._nomis_codes_url_constructor('occupation',idx)
        return self._nomis_codes_parser(url)
    
    def nomis_codes_age(self,idx='NM_18_1'):
        url=self._nomis_codes_url_constructor('age',idx)
        return self._nomis_codes_parser(url)
    
    def nomis_codes_duration(self,idx='NM_18_1'):
        url=self._nomis_codes_url_constructor('duration',idx)
        return self._nomis_codes_parser(url)
    

    def nomis_codes_sex(self,idx='NM_1_1',geography=None):
        params={}
        if geography is not None:
            params['geography']=geography

        url='{nomis}{idx}/sex.def.sdmx.json{params}'.format(nomis=self.url,
                                                           idx=idx,
                                                           params=self._url_encode(params))

        return self._nomis_codes_parser(url)
    
    def nomis_codes_geog(self,idx='NM_1_1',geography=None,search=None):
        params={}
        if geography is not None:
            geog='/{geog}'.format(geog=geography)
        else:
            geog=''

        if search is not None:
            params['search']=search
        
        url='{nomis}{idx}/geography{geog}.def.sdmx.json{params}'.format(nomis=self.url,
                                                                       idx=idx,geog=geog,
                                                                       params=self._url_encode(params))
            
        return self._nomis_codes_parser(url)
    
    def nomis_codes_items(self,idx='NM_1_1',geography=None,sex=None):
        sex=self._sex_map(idx,sex)
        params={}

        if geography is not None:
            params['geography']=geography
        if sex is not None:
            params['sex']=sex

        url='{nomis}{idx}/item.def.sdmx.json{params}'.format(nomis=self.url,
                                                            idx=idx,
                                                            params=self._url_encode(params))

        return self._nomis_codes_parser(url)

    #TO DO have a dataset_explain(idx) function that will print a description of a dataset,
    #summarise what dimensions are available, and the value they can take,
    #and provide a stub function usage example (with eligible parameters) to call it

    def _nomis_data_url(self,idx='NM_1_1',postcode=None, areacode=None, **kwargs):

        #TO DO
        #Add an explain=True parameter that will print a natural language summary of what the command is calling
        
        
        ###---Time/date info from nomis API docs---
        #Useful time options:
        ##"latest" - the latest available data for this dataset
        ##"previous" - the date prior to "latest"
        ##"prevyear" - the date one year prior to "latest"
        ##"first" - the oldest available data for this dataset
        ##Using the "time" concept you are limited to entering two dates, 
        ##a start and end. All dates between these are returned.
        
        #date is more flexible for ranges
        ##With the "date" parameter you can specify relative dates, 
        ##so for example if you wanted the latest date, three months and six months prior to that
        ##you could specify "date=latest,latestMINUS3,latestMINUS6". 
        ##You can use ranges with the "date" parameter, 
        ##e.g. if you wanted data for 12 months ago, together with all dates in the last six months
        ##up to latest you could specify "date=prevyear,latestMINUS5-latest".
        
        ##To illustrate the difference between using "date" and "time";
        ##if you specified "time=first,latest" in your URI you would get all dates from first to latest inclusive,
        ##whereas with "date=first,latest" your output would contain only the first and latest dates.
 
        metadata=self.nomis_code_metadata(idx)
    
        #HELPERS
    
        #Find geography from postcode
        if 'geography' not in kwargs and postcode is not None:
            kwargs['geography']=self._get_geo_from_postcode(postcode, areacode)

        #Map natural language dimension values to corresponding codes
        for dim in set( metadata.keys() ).intersection( kwargs.keys() ):
            kwargs[dim]=self._dimension_mapper(idx,dim,kwargs[dim])
        
        #Set a default time period to be latest
        if 'date' not in kwargs and 'time' not in kwargs:
            kwargs['time']='latest'

        
        #Set up a default projection for the returned columns
        cols=['geography_code','geography_name','measures_name','measures','date_code','date_name','obs_value']

        for k in ['sex','age','item']:
            if k in kwargs: cols.insert(len(cols)-1,'{}_name'.format(k))
        
        if 'select' not in kwargs:
            kwargs['select']=','.join(cols)
        
        url='{nomis}{idx}.data.csv{params}'.format(nomis=self.url,
                                                  idx=idx,
                                                  params=self._url_encode(kwargs))
        return url
    
    def _nomis_data(self,idx='NM_1_1',postcode=None, areacode=None, **kwargs):
        url=self._nomis_data_url(idx,postcode, areacode, **kwargs)

        df=pd.read_csv(url)
        df['_Code']=idx
        return df
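
The label-to-code substitution that _dimension_mapper performs can be tried out offline. The following is a minimal sketch rather than the class method itself: the function name map_labels_to_codes is hypothetical, and the lookup table is typed in by hand from the sex codes shown elsewhere in this notebook (Male 5, Female 6, Total 7) instead of being fetched from the API.

```python
import re

def map_labels_to_codes(dims, dimmap):
    """Replace human readable labels in a comma separated string with codes."""
    # Longest labels first, so 'Female' is replaced before 'Male' can match inside it
    for label in sorted(dimmap, key=len, reverse=True):
        pattern = re.compile(re.escape(label), re.IGNORECASE)
        dims = pattern.sub(str(dimmap[label]), str(dims))
    return dims

# Hand-typed lookup table (assumption: stands in for the table the API returns)
sex_codes = {'Male': 5, 'Female': 6, 'Total': 7}

mapped = map_labels_to_codes('male,female,Total', sex_codes)
print(mapped)  # 5,6,7
```

The ordering matters: if 'Male' were substituted first, the 'male' inside 'female' would be replaced as well.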

Usage

To start with, we create a NOMIS_CONFIG() object. This doesn't really contain anything to start with except for a bunch of methods...

The first time we properly call on it, however, there is likely to be a delay as various bits get seeded into it...

In [459]:
nomis=NOMIS_CONFIG()

dataset_lookup( idx | [idx, ... ], describe = False )

One of the first things we might want to do is to look up some basic information about a particular dataset, such as its name and a brief description of it. The first time we call this, the object grabs a list of all the datasets, so it may take some time.

In [460]:
nm_1_1_info=nomis.dataset_lookup('NM_1_1')
nm_1_1_info
Out[460]:
agency description idx name
0 NOMIS JSA claimant count records the number of peopl... NM_1_1 claimant count with rates and proportions

Many of the function calls take a dataset identifier as the first parameter (idx). Rather than break functions that don't receive a dataset identifier, I have tried to put in a dummy value that returns an example result. It should be clear from the returned data which dataset it relates to (the dataset identifier value should be clearly visible in the response).

One exception is in the nomis.dataset_lookup() function - if we don't query a particular dataset here, we see the whole listing.

In [461]:
nomis.dataset_lookup().head()
Out[461]:
agency description idx name
0 NOMIS JSA claimant count records the number of peopl... NM_1_1 claimant count with rates and proportions
1 NOMIS A quartery count of claimants who were claimin... NM_2_1 claimant count - age and duration
2 NOMIS A monthly count of job seekers allowance (JSA)... NM_4_1 claimant count - age and duration
3 NOMIS A midyear estimate of the workforce (the denom... NM_5_1 claimant count denominators - historical workf...
4 NOMIS A quarterly count of job seekers allowance cl... NM_6_1 claimant count - occupation

We can also view a table summarising a single dataset, passed as a string, or several datasets, passed as a list, as in this example:

In [462]:
nomis.dataset_lookup(['NM_1_1','NM_7_1','NM_18_1','NM_31_1'])
Out[462]:
agency description idx name
0 NOMIS JSA claimant count records the number of peopl... NM_1_1 claimant count with rates and proportions
1 NOMIS A quarterly count of job seekers allowance cl... NM_7_1 claimant count - occupation, age and duration
2 NOMIS A monthly count of job seekers allowance (JSA)... NM_18_1 claimant count - age duration with proportions
3 NOMIS The midyear (30 June) estimates of population ... NM_31_1 mid-year population estimates

Alternatively, we can choose to print out the whole description by setting describe=True.

In [463]:
nomis.dataset_lookup(['NM_1_1','NM_7_1','NM_18_1','NM_31_1'],describe=True)
NM_1_1 - claimant count with rates and proportions: JSA claimant count records the number of people claiming Jobseekers Allowance (JSA) and National Insurance credits at Jobcentre Plus local offices. This is not an official measure of unemployment, but is the only indicative statistic available for areas smaller than Local Authorities.

NM_7_1 - claimant count - occupation, age and duration: A quarterly count of  job seekers allowance claimants analysed by their sought and usual occupation, their age and the duration of their claim.

NM_18_1 - claimant count - age duration with proportions: A monthly count of job seekers allowance (JSA) claimants broken down by age and duration of claim together with age based proportions. Totals exclude non-computerised clerical claims (approx. 1%). Available for Local Authorities.

NM_31_1 - mid-year population estimates: The midyear (30 June) estimates of population are based on results from the latest Census of Population with allowance for under-enumeration. Available at Local Authority level and above.

help_url( idx )

If you need help with what parameters to add to a URL:

In [464]:
nomis.help_url(idx='NM_7_1')
Dataset NM_7_1 (claimant count - occupation, age and duration) supports the following dimensions: measures, sex, item, age_dur, freq, geography, occupation.

dataset_lookup( idx, dimensions = False )

There is actually a more comprehensive view of a dataset available that contains some metadata columns (concept and dimension) describing which filter dimensions are available over the dataset.

In [465]:
nomis.dataset_lookup('NM_1_1',dimensions=True)
Out[465]:
agency concept description dimension idx name
0 NOMIS GEOGRAPHY JSA claimant count records the number of peopl... CL_1_1_GEOGRAPHY NM_1_1 claimant count with rates and proportions
1 NOMIS SEX JSA claimant count records the number of peopl... CL_1_1_SEX NM_1_1 claimant count with rates and proportions
2 NOMIS ITEM JSA claimant count records the number of peopl... CL_1_1_ITEM NM_1_1 claimant count with rates and proportions
3 NOMIS MEASURES JSA claimant count records the number of peopl... CL_1_1_MEASURES NM_1_1 claimant count with rates and proportions
4 NOMIS FREQ JSA claimant count records the number of peopl... CL_1_1_FREQ NM_1_1 claimant count with rates and proportions

nomis_code_metadata( idx, describe = 'all' | dimension | [dimension, ...] )

We can pull down a complete description of the levels available within each concept for a single selected dataset.

In [466]:
p=nomis.nomis_code_metadata() #a default idx is provided for demo purposes; set using eg idx='NM_7_1'
p
Out[466]:
{'core':   agency    concept                                        description  \
 0  NOMIS  GEOGRAPHY  JSA claimant count records the number of peopl...   
 1  NOMIS        SEX  JSA claimant count records the number of peopl...   
 2  NOMIS       ITEM  JSA claimant count records the number of peopl...   
 3  NOMIS   MEASURES  JSA claimant count records the number of peopl...   
 4  NOMIS       FREQ  JSA claimant count records the number of peopl...   
 
           dimension     idx                                       name  
 0  CL_1_1_GEOGRAPHY  NM_1_1  claimant count with rates and proportions  
 1        CL_1_1_SEX  NM_1_1  claimant count with rates and proportions  
 2       CL_1_1_ITEM  NM_1_1  claimant count with rates and proportions  
 3   CL_1_1_MEASURES  NM_1_1  claimant count with rates and proportions  
 4       CL_1_1_FREQ  NM_1_1  claimant count with rates and proportions  ,
 u'freq':   agencyid dataset            description    dimension                 name  \
 0    NOMIS  NM_1_1                Monthly  CL_1_1_FREQ  Frequency code list   
 1    NOMIS  NM_1_1              Quarterly  CL_1_1_FREQ  Frequency code list   
 2    NOMIS  NM_1_1  Half-yearly, semester  CL_1_1_FREQ  Frequency code list   
 3    NOMIS  NM_1_1               Annually  CL_1_1_FREQ  Frequency code list   
 
   value  
 0     M  
 1     Q  
 2     S  
 3     A  ,
 u'geography':   agencyid dataset        description         dimension       name       value
 0    NOMIS  NM_1_1     United Kingdom  CL_1_1_GEOGRAPHY  geography  2092957697
 1    NOMIS  NM_1_1      Great Britain  CL_1_1_GEOGRAPHY  geography  2092957698
 2    NOMIS  NM_1_1            England  CL_1_1_GEOGRAPHY  geography  2092957699
 3    NOMIS  NM_1_1              Wales  CL_1_1_GEOGRAPHY  geography  2092957700
 4    NOMIS  NM_1_1           Scotland  CL_1_1_GEOGRAPHY  geography  2092957701
 5    NOMIS  NM_1_1   Northern Ireland  CL_1_1_GEOGRAPHY  geography  2092957702
 6    NOMIS  NM_1_1  England and Wales  CL_1_1_GEOGRAPHY  geography  2092957703,
 u'item':   agencyid dataset               description    dimension  name  value
 0    NOMIS  NM_1_1           Total claimants  CL_1_1_ITEM  item      1
 1    NOMIS  NM_1_1      Students on vacation  CL_1_1_ITEM  item      2
 2    NOMIS  NM_1_1       Temporarily stopped  CL_1_1_ITEM  item      3
 3    NOMIS  NM_1_1  Claimants under 18 years  CL_1_1_ITEM  item      4
 4    NOMIS  NM_1_1           Married females  CL_1_1_ITEM  item      9,
 u'measures':   agencyid dataset description        dimension      name  value
 0    NOMIS  NM_1_1   claimants  CL_1_1_MEASURES  measures  20100
 1    NOMIS  NM_1_1   workforce  CL_1_1_MEASURES  measures  20201
 2    NOMIS  NM_1_1      active  CL_1_1_MEASURES  measures  20202
 3    NOMIS  NM_1_1   residence  CL_1_1_MEASURES  measures  20203,
 u'sex':   agencyid dataset description   dimension name  value
 0    NOMIS  NM_1_1        Male  CL_1_1_SEX  sex      5
 1    NOMIS  NM_1_1      Female  CL_1_1_SEX  sex      6
 2    NOMIS  NM_1_1       Total  CL_1_1_SEX  sex      7}

It's easy enough to view the levels for a particular dimension in its own table.

In [467]:
p['geography']
Out[467]:
agencyid dataset description dimension name value
0 NOMIS NM_1_1 United Kingdom CL_1_1_GEOGRAPHY geography 2092957697
1 NOMIS NM_1_1 Great Britain CL_1_1_GEOGRAPHY geography 2092957698
2 NOMIS NM_1_1 England CL_1_1_GEOGRAPHY geography 2092957699
3 NOMIS NM_1_1 Wales CL_1_1_GEOGRAPHY geography 2092957700
4 NOMIS NM_1_1 Scotland CL_1_1_GEOGRAPHY geography 2092957701
5 NOMIS NM_1_1 Northern Ireland CL_1_1_GEOGRAPHY geography 2092957702
6 NOMIS NM_1_1 England and Wales CL_1_1_GEOGRAPHY geography 2092957703
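
Each of these per-dimension tables is produced by flattening a nested code list into one row per code. Here is a minimal offline sketch of that flattening step. It assumes a hand-built dictionary shaped like the fields _nomis_codes_parser reads (it is not a real API response), the function name flatten_codelists is hypothetical, and the dataset column taken from the response header is left out for brevity.

```python
def flatten_codelists(structure):
    """Flatten nested code lists into one row (a dict) per code."""
    rows = []
    for codelist in structure['codelists']['codelist']:
        # Fields shared by every code in this code list
        base = {'agencyid': codelist['agencyid'],
                'dimension': codelist['id'],
                'name': codelist['name']['value']}
        for code in codelist['code']:
            row = dict(base)
            row['description'] = code['description']['value']
            row['value'] = code['value']
            rows.append(row)
    return rows

# Hand-built sample (assumption: mimics only the keys the parser uses)
sample = {
    'codelists': {
        'codelist': [{
            'agencyid': 'NOMIS',
            'id': 'CL_1_1_SEX',
            'name': {'value': 'sex'},
            'code': [
                {'description': {'value': 'Male'}, 'value': 5},
                {'description': {'value': 'Female'}, 'value': 6},
                {'description': {'value': 'Total'}, 'value': 7},
            ],
        }]
    }
}

rows = flatten_codelists(sample)
for row in rows:
    print(row['dimension'], row['description'], row['value'])
```

A list of dictionaries like this can be handed straight to pd.DataFrame(), which is what the class does with its own rows.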

We can also describe one, several, or all of the metadata elements associated with a dataset.

In [468]:
nomis.nomis_code_metadata(describe='all')
The following dimensions are available for NM_1_1 (claimant count with rates and proportions):

 - measures: claimants (20100), workforce (20201), active (20202), residence (20203)
 - sex: Male (5), Female (6), Total (7)
 - item: Total claimants (1), Students on vacation (2), Temporarily stopped (3), Claimants under 18 years (4), Married females (9)
 - freq: Monthly (M), Quarterly (Q), Half-yearly, semester (S), Annually (A)
 - geography: United Kingdom (2092957697), Great Britain (2092957698), England (2092957699), Wales (2092957700), Scotland (2092957701), Northern Ireland (2092957702), England and Wales (2092957703)
In [469]:
nomis.nomis_code_metadata(describe='geography')
The following dimensions are available for NM_1_1 (claimant count with rates and proportions):

 - geography: United Kingdom (2092957697), Great Britain (2092957698), England (2092957699), Wales (2092957700), Scotland (2092957701), Northern Ireland (2092957702), England and Wales (2092957703)
In [470]:
nomis.nomis_code_metadata(describe=['geography','freq'])
The following dimensions are available for NM_1_1 (claimant count with rates and proportions):

 - geography: United Kingdom (2092957697), Great Britain (2092957698), England (2092957699), Wales (2092957700), Scotland (2092957701), Northern Ireland (2092957702), England and Wales (2092957703)
 - freq: Monthly (M), Quarterly (Q), Half-yearly, semester (S), Annually (A)

We can also pull down example tables (or actual tables if we pass in a dataset id) for particular dimensions.

For example:

  • nomis.nomis_codes_age_dur()
  • nomis.nomis_codes_occupation()
  • nomis.nomis_codes_ethnicity()
  • nomis.nomis_codes_geog()
  • nomis.nomis_codes_items()
  • nomis.nomis_codes_measures()
  • nomis.nomis_codes_age()
  • nomis.nomis_codes_sex()
  • nomis.nomis_codes_duration()
  • nomis.nomis_codes_time()
  • nomis.nomis_codes_freq()
In [471]:
nomis.nomis_codes_geog()
Out[471]:
agencyid dataset description dimension name value
0 NOMIS NM_1_1 United Kingdom CL_1_1_GEOGRAPHY geography 2092957697
1 NOMIS NM_1_1 Great Britain CL_1_1_GEOGRAPHY geography 2092957698
2 NOMIS NM_1_1 England CL_1_1_GEOGRAPHY geography 2092957699
3 NOMIS NM_1_1 Wales CL_1_1_GEOGRAPHY geography 2092957700
4 NOMIS NM_1_1 Scotland CL_1_1_GEOGRAPHY geography 2092957701
5 NOMIS NM_1_1 Northern Ireland CL_1_1_GEOGRAPHY geography 2092957702
6 NOMIS NM_1_1 England and Wales CL_1_1_GEOGRAPHY geography 2092957703

The geography element also allows us to identify the geographies contained within a particular geography.

In [472]:
nomis.nomis_codes_geog(geography='2092957700').head()
Out[472]:
agencyid dataset description dimension name value
0 NOMIS NM_1_1 Wales CL_1_1_GEOGRAPHY geography 2092957700
1 NOMIS NM_1_1 1991 frozen wards within Wales CL_1_1_GEOGRAPHY geography 2092957700TYPE1
2 NOMIS NM_1_1 parliamentary constituencies 1983 revision wit... CL_1_1_GEOGRAPHY geography 2092957700TYPE8
3 NOMIS NM_1_1 tecs / lecs as of 1989 within Wales CL_1_1_GEOGRAPHY geography 2092957700TYPE18
4 NOMIS NM_1_1 1981 frozen wards within Wales CL_1_1_GEOGRAPHY geography 2092957700TYPE33
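
Behind the scenes, drilling into a geography just adds the parent geography code to the URL path. The sketch below mirrors the URL pattern used by nomis_codes_geog without making a request; it assumes Python 3, and the function name geography_codes_url is hypothetical.

```python
from urllib.parse import urlencode

NOMIS_STUB = 'https://www.nomisweb.co.uk/api/v01/dataset/'

def geography_codes_url(idx='NM_1_1', geography=None, search=None):
    """Build the code list URL for a dataset's geography dimension."""
    # A parent geography code becomes an extra path element
    path = '' if geography is None else '/{}'.format(geography)
    # A search term becomes a query string
    query = '' if search is None else '?' + urlencode({'search': search})
    return '{}{}/geography{}.def.sdmx.json{}'.format(NOMIS_STUB, idx, path, query)

top_level = geography_codes_url()
within_wales = geography_codes_url(geography='2092957700')
print(top_level)
print(within_wales)
```

Codes such as 2092957700TYPE1 from the table above can be passed as the geography in the same way.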

We can force a search of the nomis API in a non-caching way by calling nomis_codes_datasets(). Calling without an argument returns a table identifying all the datasets:

In [473]:
datasets=nomis.nomis_codes_datasets()
datasets.head(5)
Out[473]:
agency description idx name
0 NOMIS JSA claimant count records the number of peopl... NM_1_1 claimant count with rates and proportions
1 NOMIS A quartery count of claimants who were claimin... NM_2_1 claimant count - age and duration
2 NOMIS A monthly count of job seekers allowance (JSA)... NM_4_1 claimant count - age and duration
3 NOMIS A midyear estimate of the workforce (the denom... NM_5_1 claimant count denominators - historical workf...
4 NOMIS A quarterly count of job seekers allowance cl... NM_6_1 claimant count - occupation

We can also search the nomis API directly for keywords or keyphrases contained within the name and description columns.

In [474]:
datasets_dim=nomis.nomis_codes_datasets(search='claimant count with rates and proportions',dimensions=True)
datasets_dim.head(5)
Out[474]:
agency concept description dimension idx name
0 NOMIS GEOGRAPHY JSA claimant count records the number of peopl... CL_1_1_GEOGRAPHY NM_1_1 claimant count with rates and proportions
1 NOMIS SEX JSA claimant count records the number of peopl... CL_1_1_SEX NM_1_1 claimant count with rates and proportions
2 NOMIS ITEM JSA claimant count records the number of peopl... CL_1_1_ITEM NM_1_1 claimant count with rates and proportions
3 NOMIS MEASURES JSA claimant count records the number of peopl... CL_1_1_MEASURES NM_1_1 claimant count with rates and proportions
4 NOMIS FREQ JSA claimant count records the number of peopl... CL_1_1_FREQ NM_1_1 claimant count with rates and proportions

Wild-card operators are also possible in the search:

In [475]:
nomis.nomis_codes_datasets(search='*seasonally adjusted*')
Out[475]:
agency description idx name
0 NOMIS The seasonally adjusted series takes into acco... NM_11_1 claimant count - seasonally adjusted
1 NOMIS an analysis of seasonally adjusted jobcentre i... NM_19_1 vacancies - seasonally adjusted series
2 NOMIS this data set provides quarterly estimates of ... NM_26_1 employee job estimates - seasonally adjusted
3 NOMIS NM_39_1 claimant flows - seasonally adjusted
4 NOMIS The labour force survey (LFS) is a quarterly s... NM_87_1 labour force survey - quarterly: four quarter ...
5 NOMIS This dataset provides quarterly estimates of w... NM_130_1 workforce jobs by industry (SIC 2007) - season...
6 NOMIS This dataset provides quarterly estimates of w... NM_131_1 workforce jobs by industry (SIC 2007) and sex ...
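
How the nomis API evaluates wildcards on the server side is not shown here. As a rough local analogue, assuming a small hand-made table with idx and name columns like the ones above, shell-style patterns can be matched with the standard library fnmatch module. The wildcard_filter function is a hypothetical name used for illustration only:

```python
import fnmatch
import pandas as pd

# Hand-made sample of the dataset listing (copied from the tables above, not fetched)
sample = pd.DataFrame({
    'idx': ['NM_1_1', 'NM_11_1', 'NM_19_1'],
    'name': ['claimant count with rates and proportions',
             'claimant count - seasonally adjusted',
             'vacancies - seasonally adjusted series'],
})

def wildcard_filter(df, pattern, column='name'):
    """Return rows whose `column` matches a shell-style pattern, ignoring case."""
    pattern = pattern.lower()
    mask = df[column].apply(lambda s: fnmatch.fnmatchcase(s.lower(), pattern))
    return df[mask]

matches = wildcard_filter(sample, '*seasonally adjusted*')
print(matches['idx'].tolist())
```

Filtering a cached copy of the listing in this way would avoid a round trip to the API for repeated searches.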

Note that at some point nomis_codes_datasets() is likely to be pushed further inside the class, and dataset_lookup will become the preferred way of inspecting this information.

Grabbing Data

As well as grabbing information and metadata about datasets from the nomis API, we can also retrieve items from those datasets.

_nomis_data( idx, postcode, areacode, **kwargs )

We can actually grab a dataset using nomis._nomis_data(idx, ...). Depending on the dataset selected, different dimension arguments are possible (as identified using nomis.help_url(idx), for example).

In [476]:
testdata=nomis._nomis_data(geography='2038431803',sex='5,6,7',item=1,measures=20100)
testdata.head()
Out[476]:
GEOGRAPHY_CODE GEOGRAPHY_NAME MEASURES_NAME MEASURES DATE_CODE DATE_NAME SEX_NAME ITEM_NAME OBS_VALUE _Code
0 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 January 2015 Male Total claimants 1386 NM_1_1
1 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 January 2015 Female Total claimants 686 NM_1_1
2 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 January 2015 Total Total claimants 2072 NM_1_1

There's a postcode helper available for identifying a geography from a postcode. By default, the postcode is resolved to its district, and that district is used as the geography. (See the code for other alternatives.)

In [477]:
nomis._nomis_data(postcode='mk7 6AA',sex='5,6,7',item=1,measures=20100).head()
Out[477]:
GEOGRAPHY_CODE GEOGRAPHY_NAME MEASURES_NAME MEASURES DATE_CODE DATE_NAME SEX_NAME ITEM_NAME OBS_VALUE _Code
0 E06000042 Milton Keynes Persons claiming JSA 20100 2015-01 January 2015 Male Total claimants 1773 NM_1_1
1 E06000042 Milton Keynes Persons claiming JSA 20100 2015-01 January 2015 Female Total claimants 1044 NM_1_1
2 E06000042 Milton Keynes Persons claiming JSA 20100 2015-01 January 2015 Total Total claimants 2817 NM_1_1

As well as passing dimensions and their associated values into the **kwargs, we can also pass in a select parameter that identifies which columns to return from the nomis API.
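
To make the effect of select concrete at the URL level, here is a minimal sketch assuming Python 3's urllib.parse. The build_data_url helper is hypothetical and is not the wrapper's own implementation; it only shows how dimension values and a select list might be folded into a query string:

```python
from urllib.parse import urlencode

NOMIS_STUB = 'https://www.nomisweb.co.uk/api/v01/dataset/'

def build_data_url(idx, select=None, **dims):
    """Hypothetical helper: assemble a nomis CSV data URL from dimension values."""
    params = {k: str(v) for k, v in dims.items()}
    if select:
        # select is sent as a comma-separated list of column names
        params['select'] = ','.join(select)
    return '{}{}.data.csv?{}'.format(NOMIS_STUB, idx, urlencode(params))

url = build_data_url('NM_1_1',
                     select=['geography_name', 'date_name', 'obs_value'],
                     geography='2038431803', sex='5,6,7', item=1, measures=20100)
print(url)
```

The resulting URL could then be handed to pd.read_csv(), as in the opening example.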

_nomis_data_url( idx, **kwargs )

We can inspect the URL that gets generated from a particular set of parameters.

In [478]:
nomis._nomis_data_url(idx='NM_31_1',geography='1946157281',measures=20100)
Out[478]:
'https://www.nomisweb.co.uk/api/v01/dataset/NM_31_1.data.csv?measures=20100&time=latest&select=geography_code%2Cgeography_name%2Cmeasures_name%2Cmeasures%2Cdate_code%2Cdate_name%2Cobs_value&geography=1946157281'
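
The generated URL is percent-encoded, which makes the select list hard to read. As a small aid, assuming Python 3's urllib.parse, the query string can be decoded back into its parameters:

```python
from urllib.parse import urlsplit, parse_qs

url = ('https://www.nomisweb.co.uk/api/v01/dataset/NM_31_1.data.csv?'
       'measures=20100&time=latest&select=geography_code%2Cgeography_name%2C'
       'measures_name%2Cmeasures%2Cdate_code%2Cdate_name%2Cobs_value'
       '&geography=1946157281')

query = parse_qs(urlsplit(url).query)
# parse_qs returns a list of values per key; each parameter appears once here
params = {k: v[0] for k, v in query.items()}
columns = params['select'].split(',')

print(params['time'])   # latest
print(columns)
```

Note that time=latest and the select list were not passed in the call above, so they appear to be defaults supplied by the wrapper.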

Automatic Conversion of Dimension Parameter Values to Dimension Parameter Codes

A helper function is provided to convert dimension values to dimension codes.

In [479]:
testdata=nomis._nomis_data(geography='2038431803',sex='5,Female',item=1,measures=20100)
testdata.head()
Out[479]:
GEOGRAPHY_CODE GEOGRAPHY_NAME MEASURES_NAME MEASURES DATE_CODE DATE_NAME SEX_NAME ITEM_NAME OBS_VALUE _Code
0 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 January 2015 Male Total claimants 1386 NM_1_1
1 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 January 2015 Female Total claimants 686 NM_1_1

Conversions are based on look-ups into the metadata for the dataset.

In [480]:
testdata=nomis._nomis_data(geography='2038431803',sex='5,Female',item='Total Claimants',measures=20100)
testdata.head()
Out[480]:
GEOGRAPHY_CODE GEOGRAPHY_NAME MEASURES_NAME MEASURES DATE_CODE DATE_NAME SEX_NAME ITEM_NAME OBS_VALUE _Code
0 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 January 2015 Male Total claimants 1386 NM_1_1
1 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 January 2015 Female Total claimants 686 NM_1_1
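
The look-up itself is not shown above. A minimal sketch of the idea, assuming a hypothetical in-memory name-to-code table (the real wrapper would presumably build this from the dataset metadata, and the codes below are assumed for illustration):

```python
# Hypothetical name -> code table for the sex dimension.
# Codes are assumed for illustration, not taken from the nomis metadata.
SEX_CODES = {'male': '5', 'female': '6', 'total': '7'}

def to_codes(values, lookup):
    """Convert a comma-separated mix of codes and names into codes only."""
    out = []
    for item in str(values).split(','):
        item = item.strip()
        if item.isdigit():
            out.append(item)                   # already a code: pass through
        else:
            out.append(lookup[item.lower()])   # a name: case-insensitive look-up
    return ','.join(out)

print(to_codes('5,Female', SEX_CODES))   # 5,6
```

In this sketch an unknown name raises a KeyError, which is one way of surfacing a mistyped dimension value before any request is made.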

You can identify the dimension values or codes by using the nomis.nomis_code_metadata(idx, describe='all'|DIMENSION|DIMENSIONLIST) lookup.

Help With Finding Geography Codes

The following tools are provided in addition to the automatic discovery of a geography code from a postcode lookup.

In [481]:
nomis.get_geo_code()
Out[481]:
description geog
0 United Kingdom 2092957697
1 Great Britain 2092957698
2 England 2092957699
3 Wales 2092957700
4 Scotland 2092957701
5 Northern Ireland 2092957702
6 England and Wales 2092957703
In [482]:
nomis.get_geo_code(search='land')
Out[482]:
description geog
0 England 2092957699
1 Scotland 2092957701
2 Northern Ireland 2092957702
3 England and Wales 2092957703
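
Judging from this output, the search argument behaves like a substring match on the description column. A sketch of that behaviour, using a table hand-copied from the top-level codes shown above and the pandas str.contains method (search_geo is a hypothetical name, and the real matching rules may differ):

```python
import pandas as pd

# Top-level geographies copied from the get_geo_code() output above
geogs = pd.DataFrame({
    'description': ['United Kingdom', 'Great Britain', 'England', 'Wales',
                    'Scotland', 'Northern Ireland', 'England and Wales'],
    'geog': ['2092957697', '2092957698', '2092957699', '2092957700',
             '2092957701', '2092957702', '2092957703'],
})

def search_geo(df, search):
    """Keep rows whose description contains `search` as a literal substring."""
    mask = df['description'].str.contains(search, case=False, regex=False)
    return df[mask].reset_index(drop=True)

print(search_geo(geogs, 'land'))
```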
In [483]:
nomis.get_geo_code(value='1946157281')
Out[483]:
description geog
0 Isle of Wight 1946157281
1 2011 census frozen wards within Isle of Wight 1946157281TYPE236
2 2011 super output areas - middle layer within ... 1946157281TYPE297
3 2011 super output areas - lower layer within I... 1946157281TYPE298
4 super output areas - lower layer within Isle o... 1946157281TYPE304
5 super output areas - middle layer within Isle ... 1946157281TYPE305
6 2003 CAS wards within Isle of Wight 1946157281TYPE312
7 2009 statistical wards within Isle of Wight 1946157281TYPE337
8 2013 electoral ward within Isle of Wight 1946157281TYPE401
9 pre-2009 local authorities: district / unitary... 1946157281TYPE486
In [484]:
nomis.get_geo_code(value='2038431803')
Out[484]:
description geog
0 Isle of Wight 2038431803
1 1991 frozen wards within Isle of Wight 2038431803TYPE1
2 1981 frozen wards within Isle of Wight 2038431803TYPE33
3 super output areas - lower layer within Isle o... 2038431803TYPE304
4 super output areas - middle layer within Isle ... 2038431803TYPE305
5 2003 CAS wards within Isle of Wight 2038431803TYPE312
6 local authorities: district / unitary within I... 2038431803TYPE464
In [485]:
nomis.get_geo_code(desc='Wales',search='const')
Out[485]:
description geog
0 parliamentary constituencies 1983 revision wit... 2092957700TYPE8
1 parliamentary constituencies 1983 revision wit... 2092957700TYPE45
2 parliamentary constituencies 2010 within Wales 2092957700TYPE460
3 national assembly for wales constituencies wit... 2092957700TYPE466
4 parliamentary constituencies 2005 revision wit... 2092957700TYPE468
5 parliamentary constituencies 1995 revision wit... 2092957700TYPE484
In [486]:
nomis.get_geo_code(helper='LA_district',search='hampton')
Out[486]:
description geog
0 East Northamptonshire 1946157157
1 Northampton 1946157159
2 South Northamptonshire 1946157160
3 Wolverhampton 1946157192
4 Southampton 1946157287
In [487]:
nomis.get_geo_code(helper='LA_district',search='Isle of Wight')
Out[487]:
description geog
0 Isle of Wight 1946157281
In [488]:
nomis.get_geo_code(helper='LA_district',search='hampton',chase=True)
Out[488]:
description geog
0 East Northamptonshire 1946157157
1 2011 census frozen wards within East Northampt... 1946157157TYPE236
2 2011 super output areas - middle layer within ... 1946157157TYPE297
3 2011 super output areas - lower layer within E... 1946157157TYPE298
4 super output areas - lower layer within East N... 1946157157TYPE304
5 super output areas - middle layer within East ... 1946157157TYPE305
6 2003 CAS wards within East Northamptonshire 1946157157TYPE312
7 2009 statistical wards within East Northampton... 1946157157TYPE337
8 2013 electoral ward within East Northamptonshire 1946157157TYPE401
9 pre-2009 local authorities: district / unitary... 1946157157TYPE486
10 Northampton 1946157159
11 2011 census frozen wards within Northampton 1946157159TYPE236
12 2011 super output areas - middle layer within ... 1946157159TYPE297
13 2011 super output areas - lower layer within N... 1946157159TYPE298
14 super output areas - lower layer within Northa... 1946157159TYPE304
15 super output areas - middle layer within North... 1946157159TYPE305
16 2003 CAS wards within Northampton 1946157159TYPE312
17 2009 statistical wards within Northampton 1946157159TYPE337
18 2013 electoral ward within Northampton 1946157159TYPE401
19 pre-2009 local authorities: district / unitary... 1946157159TYPE486
20 South Northamptonshire 1946157160
21 2011 census frozen wards within South Northamp... 1946157160TYPE236
22 2011 super output areas - middle layer within ... 1946157160TYPE297
23 2011 super output areas - lower layer within S... 1946157160TYPE298
24 super output areas - lower layer within South ... 1946157160TYPE304
25 super output areas - middle layer within South... 1946157160TYPE305
26 2003 CAS wards within South Northamptonshire 1946157160TYPE312
27 2009 statistical wards within South Northampto... 1946157160TYPE337
28 2013 electoral ward within South Northamptonshire 1946157160TYPE401
29 pre-2009 local authorities: district / unitary... 1946157160TYPE486
30 Wolverhampton 1946157192
31 2011 census frozen wards within Wolverhampton 1946157192TYPE236
32 2011 super output areas - middle layer within ... 1946157192TYPE297
33 2011 super output areas - lower layer within W... 1946157192TYPE298
34 super output areas - lower layer within Wolver... 1946157192TYPE304
35 super output areas - middle layer within Wolve... 1946157192TYPE305
36 2003 CAS wards within Wolverhampton 1946157192TYPE312
37 2009 statistical wards within Wolverhampton 1946157192TYPE337
38 2013 electoral ward within Wolverhampton 1946157192TYPE401
39 pre-2009 local authorities: district / unitary... 1946157192TYPE486
40 Southampton 1946157287
41 2011 census frozen wards within Southampton 1946157287TYPE236
42 2011 super output areas - middle layer within ... 1946157287TYPE297
43 2011 super output areas - lower layer within S... 1946157287TYPE298
44 super output areas - lower layer within Southa... 1946157287TYPE304
45 super output areas - middle layer within South... 1946157287TYPE305
46 2003 CAS wards within Southampton 1946157287TYPE312
47 2009 statistical wards within Southampton 1946157287TYPE337
48 2013 electoral ward within Southampton 1946157287TYPE401
49 pre-2009 local authorities: district / unitary... 1946157287TYPE486

Further Scrappy Notes...

In [489]:
## Exploring the Isle of Wight JSA Figures

# Explore a local authority profile:
# https://www.nomisweb.co.uk/reports/lmp/la/1946157281/report.aspx
In [ ]:
nomis.get_geo_code(value='1946157281')
In [ ]:
Steps

Look up local authority profile: nomis.get_geo_code(helper='LA_district',search='Isle of Wight')
In [ ]:
Need to identify useful datasets - so for example, JSA by age, duration with proportions: NM_18_1

http://www.nomisweb.co.uk/api/v01/dataset/NM_18_1.data.csv?
    geography=1946157281,2013265928,2092957698&date=latest&age=0&duration=0&sex=7&measures=20100,20206
    &select=date_name,geography_name,geography_code,sex_name,age_name,duration_name,measures_name,obs_value,obs_status_name
    
    
    ??NM_7_1