New to Jupyter notebooks? Try Using Jupyter notebooks for a quick introduction.
In this section we're going to learn how to send a request for information to the Trove API.
API requests are just like normal urls. However, instead of sending us back a web page, they deliver data in a form that computers can understand. We can then use that data in our own programs and pipelines.
We're going to use the Python Requests module to handle our API queries, so let's import it now with a few other things we'll need.
%%capture
import os
# We're going to use the Python Requests module to handle our API queries
import requests
from dotenv import load_dotenv
# We'll use this to display nicely formatted JSON results
from IPython.display import JSON
load_dotenv()
Any requests you make to the Trove API need to be authenticated with a 'key'. For non-commercial projects, you just fill out a simple form and your API key is generated instantly. Follow the instructions in the Trove Help to obtain your own Trove API Key.
Once you've created a key, you can access it at any time on the 'For developers' tab of your Trove user profile.
Copy your API key now, and paste it in the cell below, between the quotes.
# This creates a variable called 'API_KEY', paste your key between the quotes
API_KEY = "INSERT YOUR API KEY HERE"
# Leave these lines as they are.
# Use an API key value from environment variables if it is available (useful for testing)
if os.getenv("TROVE_API_KEY"):
    API_KEY = os.getenv("TROVE_API_KEY")
# This displays a message with your key
print("Your API key is: {}".format(API_KEY))
Your API key is: gq29l1g1h75pimh4
All search queries to the Trove API start with the same base url. We'll save it as a variable here.
# Create a variable called 'api_search_url' and give it a value
api_search_url = "https://api.trove.nla.gov.au/v3/result"
Trove API queries are constructed by adding parameters to the base url. All of the parameters are optional, except for:
category – which Trove category (or categories) you want to search; use all for everything
You'll often want to supply a search query:
q – 'q' for query, this is where search terms go; if you don't supply a q value you'll get everything
You might also want to specify the format in which the results are delivered. The default is xml, but for most applications you'll probably find it easier to work with json.
encoding – the format of the results, this can be set to either xml or json (xml is the default)
We'll meet some other parameters later, but for now let's create a Python dictionary to store our basic parameters. The requests library will take this dictionary, turn it into a string, and add it to the base url.
For our first API request we're going to search Trove's digitised newspapers, so we'll assign the value 'newspaper' to the category parameter. Feel free to edit the q value to search for something that interests you.
# This creates a dictionary called 'params' and sets values for our basic parameters
# (only 'category' is mandatory, 'q' and 'encoding' are optional)
params = {
    "q": "cyclone",  # Search for this keyword -- feel free to change!
    "category": "newspaper",  # Search in the newspaper category
    "encoding": "json",  # Ask for the results as JSON rather than the default XML
}
You supply your API key using headers. These are extra, hidden parameters that describe your request to the server.
# Add your API key to the request headers
headers = {"X-API-KEY": API_KEY}
Ok, we're now ready to make our first query!
# This sends our request to the Trove API and stores the result in a variable called 'response'
response = requests.get(api_search_url, params=params, headers=headers)
# This shows us the url that's sent to the API
print(f"API url: {response.url}")
API url: https://api.trove.nla.gov.au/v3/result?q=cyclone&category=newspaper&encoding=json
See how requests has taken our parameters and turned them into a string with '&' between each one?
The url above is live – try clicking on it to see the raw results from Trove.
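To get a feel for what requests is doing when it builds that url, here's a small sketch that uses Python's standard library instead. It illustrates the idea rather than reproducing requests' own code, and the 'tropical cyclone' query is just a made-up example to show what happens to spaces.

```python
from urllib.parse import urlencode

api_search_url = "https://api.trove.nla.gov.au/v3/result"
params = {
    "q": "cyclone",
    "category": "newspaper",
    "encoding": "json",
}

# urlencode() turns each item into key=value and joins them with '&'
query_string = urlencode(params)

# The query string is attached to the base url after a '?'
full_url = f"{api_search_url}?{query_string}"
print(full_url)

# Characters that aren't safe in urls, such as spaces, are encoded along the way
print(urlencode({"q": "tropical cyclone"}))
```

The first line printed should match the url that requests generated for us.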
The response variable contains all the data returned to us by the Trove API. Let's get it out in a usable form.
# Get the Trove API's JSON results and make them available as a Python variable called 'data'
data = response.json()
# Display the results as nicely-formatted JSON
JSON(data, expanded=True)
<IPython.core.display.JSON object>
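If the cell above raises an error, the request itself may not have succeeded, for example because the API key is missing or mistyped. Every HTTP response carries a numeric status code, which requests makes available as response.status_code, and response.raise_for_status() will turn an error code into a Python exception. As a rough sketch, the helper below (describe_status is a made-up name for illustration) translates a status code into something readable using the standard library. The codes shown are generic HTTP examples; which ones Trove returns for particular problems is an assumption you'd need to check against the Trove documentation.

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return a short, readable description of an HTTP status code."""
    status = HTTPStatus(status_code)
    # Codes below 400 mean the request was accepted; 400 and above signal an error
    outcome = "ok" if status_code < 400 else "error"
    return f"{status_code} {status.phrase} ({outcome})"


# Generic examples: with a real request you'd pass in response.status_code
for code in [200, 401, 404]:
    print(describe_status(code))
```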
How many results are there?
data["category"][0]["records"]["total"]
658101
As you can see, the API results are fairly complex. Individual item records are quite deeply nested. Let's run a simple script to display the basic details of each of our matching articles.
# Loop through all the newspaper articles
# The articles themselves are quite deeply nested, so we have to go down several levels to get them
for article in data["category"][0]["records"]["article"]:
    # Display a string containing the date, title, newspaper, and page for each article
    print(
        f'{article["date"]}, "{article["heading"]}", {article["title"]["title"]}, page {article["page"]}'
    )
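If a record happened to be missing one of these fields, the loop above would stop with a KeyError. Here's a sketch of a more forgiving version. The sample_data dictionary is invented for illustration: it only imitates the nesting we've just seen, and its values are not real Trove results. The format_article helper is hypothetical too.

```python
# Invented sample data that imitates the nesting of the API results (not real Trove records)
sample_data = {
    "category": [
        {
            "records": {
                "total": 2,
                "article": [
                    {
                        "date": "1900-01-01",
                        "heading": "Example heading",
                        "title": {"title": "Example Newspaper"},
                        "page": "3",
                    },
                    # This record is deliberately missing 'title' and 'page'
                    {"date": "1900-01-02", "heading": "Another example"},
                ],
            }
        }
    ]
}


def format_article(article):
    """Build a one-line summary, using placeholders for any missing fields."""
    date = article.get("date", "[no date]")
    heading = article.get("heading", "[no heading]")
    # 'title' is itself a dictionary, so fall back to an empty one if it's absent
    newspaper = article.get("title", {}).get("title", "[no newspaper]")
    page = article.get("page", "[no page]")
    return f'{date}, "{heading}", {newspaper}, page {page}'


# If there's no 'article' list at all, .get() gives us an empty list to loop over
records = sample_data["category"][0]["records"]
summaries = [format_article(article) for article in records.get("article", [])]
for summary in summaries:
    print(summary)
```

To try this on real results, swap sample_data for the data variable created from response.json().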
You've made your first Trove API request. Now let's move on to learn a bit about Trove's zones.
Created by Tim Sherratt for the GLAM Workbench. Support this project by becoming a GitHub sponsor.