Map Trove newspaper results by state

Version 2 of the Trove API adds the state facet to the newspaper zone. This means that, for any search query, we can easily find out how many of the matching articles were published in each state.

If you haven't used one of these notebooks before, they're basically web pages in which you can write, edit, and run live code. They're meant to encourage experimentation, so don't feel nervous. Just try running a few cells and see what happens!

Some tips:

  • Code cells have boxes around them.
  • To run a code cell click on the cell and then hit Shift+Enter. The Shift+Enter combo will also move you to the next cell, so it's a quick way to work through the notebook.
  • While a cell is running a * appears in the square brackets next to the cell. Once the cell has finished running the asterisk will be replaced with a number.
  • In most cases you'll want to start from the top of the notebook and work your way down, running each cell in turn. Later cells might depend on the results of earlier ones.
  • To edit a code cell, just click on it and type stuff. Remember to run the cell once you've finished editing.

Add your API key

In [ ]:
# This creates a variable called 'api_key'. Paste your key between the quotes
# <-- Then click the run icon 
api_key = 'YOUR API KEY'

# This displays a message with your key
print('Your API key is: {}'.format(api_key))

Setting things up

You don't need to edit anything here. Just run the cells to load the bits and pieces we need.

In [4]:
# Import the libraries we need
# <-- Click the run icon 
import requests
import pandas as pd
import os
import altair as alt
import json
In [5]:
# Set up default parameters for our API query
# <-- Click the run icon 
params = {
    'zone': 'newspaper',
    'encoding': 'json',
    'facet': 'state',
    'n': '1',
    'key': api_key
}

api_url = 'http://api.trove.nla.gov.au/v2/result'

This is where you set your search keywords. Change 'weather' in the cell below to anything you might enter in the Trove simple search box. For example:

params['q'] = 'weather AND wragge'

params['q'] = '"Clement Wragge"'

params['q'] = 'text:"White Australia Policy"'

params['q'] = 'weather AND date:[1890-01-01T00:00:00Z TO 1920-12-11T00:00:00Z]'

You can also limit the results to specific categories. To only search for articles, include this line:

params['l-category'] = 'Article'
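
The date range example above has a fiddly format, so it can help to build it from ordinary dates. The following is only a sketch: `date_range_query` is a hypothetical helper, not part of the Trove API or this notebook, and it simply reproduces the `date:[... TO ...]` pattern shown in the example.

```python
from datetime import date


def date_range_query(keywords, start, end):
    """Combine keywords with a date range, following the pattern in the example above.

    Hypothetical helper: 'start' and 'end' are datetime.date objects.
    """
    stamp = '{}T00:00:00Z'
    return '{} AND date:[{} TO {}]'.format(
        keywords,
        stamp.format(start.isoformat()),
        stamp.format(end.isoformat())
    )


# Rebuild the date-limited weather query from the examples
query = date_range_query('weather', date(1890, 1, 1), date(1920, 12, 11))
print(query)
```

The result could then be assigned with `params['q'] = query` in the cell below.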

In [6]:
# Enter your search parameters
# This can be anything you'd enter in the Trove simple search box
params['q'] = 'radio'

# The line below limits the results to the article category
# Add a "#" symbol at the start of the line to search all categories instead
params['l-category'] = 'Article'

Get the data from Trove

Everything's set up, so just run the cells!

Make an API request

In [7]:
# <-- Click the run icon 
response = requests.get(api_url, params=params)
# Stop here with an error message if the request failed
response.raise_for_status()
data = response.json()
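
If you're curious about what actually gets sent to Trove, the parameters end up as a query string on the end of the API url. The sketch below approximates this with Python's standard `urlencode` function rather than `requests`, so small details of the encoding may differ. The `example_params` dictionary is a stand-in that repeats the values set earlier, with a placeholder instead of a real key.

```python
from urllib.parse import urlencode

# Stand-in for the 'params' dictionary built earlier (placeholder key, not a real one)
example_params = {
    'zone': 'newspaper',
    'encoding': 'json',
    'facet': 'state',
    'n': '1',
    'key': 'YOUR API KEY',
    'q': 'radio',
    'l-category': 'Article'
}

api_url = 'http://api.trove.nla.gov.au/v2/result'

# Join the url and the encoded parameters with a '?'
request_url = '{}?{}'.format(api_url, urlencode(example_params))
print(request_url)
```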

Reformat the results

In [9]:
# <-- Click the run icon 
def format_facets(data):
    facets = data['response']['zone'][0]['facets']['facet']['term']
    df = pd.DataFrame(facets)
    df = df[['display', 'count']]
    df.columns = ['state', 'total']
    df['total'] = pd.to_numeric(df['total'], errors='coerce')
    df = df.replace('ACT', 'Australian Capital Territory')
    df = df[(df['state'] != 'National') & (df['state'] != 'International')]
    return df
df = format_facets(data)
df
Out[9]:
state total
0 New South Wales 524367
1 Queensland 332178
2 Western Australia 193398
3 Victoria 139507
4 South Australia 130366
5 Tasmania 89391
6 Australian Capital Territory 79916
9 Northern Territory 5876
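
The `format_facets` function assumes the facet data is always present and nested in the same way. A search with no results might not fit that pattern, so here is a more cautious sketch for pulling out the list of terms. The `get_state_terms` function is hypothetical, the sample data is hand-written using two rows from the table above, and the handling of a single term arriving as a dictionary rather than a list is a defensive assumption, not confirmed API behaviour.

```python
def get_state_terms(data):
    """Return the list of state facet terms, or an empty list if they're missing."""
    try:
        facet = data['response']['zone'][0]['facets']['facet']
    except (KeyError, IndexError, TypeError):
        # The expected nesting isn't there, so there's nothing to chart
        return []
    terms = facet.get('term', []) if isinstance(facet, dict) else []
    # Assumption: wrap a lone term in a list so callers always get a list
    if isinstance(terms, dict):
        terms = [terms]
    return terms


# Hand-written sample with the same nesting that format_facets expects
sample = {
    'response': {
        'zone': [{
            'facets': {
                'facet': {
                    'term': [
                        {'display': 'New South Wales', 'count': '524367'},
                        {'display': 'Queensland', 'count': '332178'}
                    ]
                }
            }
        }]
    }
}

for term in get_state_terms(sample):
    print(term['display'], term['count'])
```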

Make some charts!

Just run the cells!

In [10]:
# Create a bar chart
# <-- Click the run icon 
chart = alt.Chart(df).mark_bar(color='#084081').encode(
    x=alt.X('total', axis=alt.Axis(title='Total articles')),
    y=alt.Y('state', axis=alt.Axis(title='')),
    tooltip=[alt.Tooltip('total', title='Total articles')]
).properties(width=300, height=200)
In [11]:
# Make a choropleth map
# <-- Click the run icon 
with open('data/aus_state.geojson', "r") as geo_file:
    geo_data = json.load(geo_file)
map = alt.Chart(alt.Data(values=geo_data['features'])
        ).mark_geoshape(stroke='black', strokeWidth=0.2
        ).encode(color=alt.Color('total:Q', scale=alt.Scale(scheme='greenblue'), legend=alt.Legend(title='Total articles'))
        ).transform_lookup(lookup='properties.STATE_NAME', from_=alt.LookupData(df, 'state', ['total'])
        ).project(type='mercator'
        ).properties(width=400, height=400)
In [12]:
# Display the charts side by side
# <-- Click the run icon 
alt.hconcat(map, chart).resolve_legend(
    color="independent"
)
Out[12]:
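
The map joins the totals to the shapes by matching the values in `state` against `properties.STATE_NAME`, which is why 'ACT' was replaced with 'Australian Capital Territory' earlier. If a name doesn't match, that state just won't get a colour. The sketch below shows one way to check for mismatches. The `unmatched_names` function and the sample values are hypothetical, made up to illustrate the check.

```python
def unmatched_names(features, states):
    """Return names found only in the map shapes, and names found only in the totals."""
    geo_names = {feature['properties']['STATE_NAME'] for feature in features}
    state_names = set(states)
    return sorted(geo_names - state_names), sorted(state_names - geo_names)


# Made-up sample: 'ACT' hasn't been expanded, so it won't match the shape's name
sample_features = [
    {'properties': {'STATE_NAME': 'Australian Capital Territory'}},
    {'properties': {'STATE_NAME': 'Tasmania'}}
]
sample_states = ['ACT', 'Tasmania']

only_in_map, only_in_totals = unmatched_names(sample_features, sample_states)
print('Only in map:', only_in_map)
print('Only in totals:', only_in_totals)
```

In this notebook the same check could be run with `unmatched_names(geo_data['features'], df['state'])`.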