Get a random work from Trove

Here's a way to get a random work from Trove's book, article, picture, map, music, or collection zones. It generates a random work id prefix and performs a wildcard search using the id index. If the prefix returns no results, a digit is sliced off the end. If the prefix returns more than 100 results, a digit is added to the end. This continues until the result set hits the sweet spot of between 1 and 100 results, and a work is then chosen at random from that set.
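To see the narrowing logic without hitting the API, here's a minimal offline sketch. It is a simplified version of the idea, not the function used below: `count_matches` and `find_prefix` are hypothetical helpers, a local list of made-up ids stands in for Trove's id index, and the cap of 1000 attempts is just a safety net for the sketch.

```python
import random

def count_matches(ids, prefix):
    """Stand-in for the wildcard search: count the ids that start with prefix."""
    return sum(1 for work_id in ids if work_id.startswith(prefix))

def find_prefix(ids, max_results=100, rng=random):
    """Shrink or grow a random numeric prefix until it matches 1..max_results ids."""
    prefix = str(rng.randrange(10000, 100000))
    for _ in range(1000):  # safety cap (an assumption of this sketch)
        total = count_matches(ids, prefix)
        if 1 <= total <= max_results:
            return prefix, total
        if total == 0:
            # Too specific: slice a digit off, or start again if only one digit is left.
            if len(prefix) > 1:
                prefix = prefix[:-1]
            else:
                prefix = str(rng.randrange(10000, 100000))
        else:
            # Too broad: add a random digit to the end.
            prefix += str(rng.randrange(10))
    raise RuntimeError('No suitable prefix found')

# Made-up ids, purely for illustration.
seeded = random.Random(1)
fake_ids = [str(n) for n in seeded.sample(range(1, 10_000_000), 50_000)]
prefix, total = find_prefix(fake_ids, rng=random.Random(2))
print(prefix, total)
```

The real function does the same thing, except that each call to `count_matches` is an API request and the matching works come back with the response.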

This method should also work OK with the format facet. However, the further you go down the format hierarchy, the smaller the slices, and the harder it will be to match a work id. But you should certainly be able to get random works with specific top-level formats without any drama, for example a random thesis from the book zone.

This method is probably not going to work for specific collections (i.e. with a NUC id), or in combination with other search queries. Basically, the more you limit the pool of potential resources, the harder it will be to match on random work ids. I'm working on alternatives for these situations.
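A rough simulation gives a feel for why a smaller pool is harder to hit. This is only a sketch under made-up assumptions: `hit_rate` is a hypothetical helper, ids are drawn uniformly from 1 to 10 million, and the pool sizes are arbitrary rather than real Trove figures.

```python
import random

def hit_rate(pool_size, id_space=10_000_000, prefix_len=5, trials=2000, seed=0):
    """Estimate how often a random prefix matches at least one id in the pool."""
    rng = random.Random(seed)
    ids = [str(n) for n in rng.sample(range(1, id_space), pool_size)]
    # Every prefix of the given length that occurs in the pool.
    prefixes = {work_id[:prefix_len] for work_id in ids if len(work_id) >= prefix_len}
    low, high = 10 ** (prefix_len - 1), 10 ** prefix_len
    hits = sum(str(rng.randrange(low, high)) in prefixes for _ in range(trials))
    return hits / trials

for size in (100_000, 10_000, 1_000):
    print(size, hit_rate(size))
```

As the pool shrinks, fewer random prefixes match anything, so each random work costs more requests.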

In [4]:
import requests
import random
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

s = requests.Session()
retries = Retry(total=5, backoff_factor=1, status_forcelist=[502, 503, 504])
s.mount('https://', HTTPAdapter(max_retries=retries))
s.mount('http://', HTTPAdapter(max_retries=retries))
In [5]:
API_KEY = 'YOUR API KEY'
API_URL = 'http://api.trove.nla.gov.au/v2/result'
In [25]:
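def get_random_work_from_zone(zone, work_format=None):
    '''
    Get a random work from the given zone, optionally limited by format.
    Searches on random work id prefixes until between 1 and 100 works match,
    then returns one of the matching works at random.
    '''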
    total = 0
    params = {
        'zone': zone,
        'encoding': 'json',
        'n': '100',
        'key': API_KEY
    }
    if work_format:
        params['l-format'] = work_format
    random_id = None
    random_sequence = list(range(0, 10))
    random.shuffle(random_sequence)
    pos = 0
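    # Keep querying until the prefix matches between 1 and 100 works.
    while total == 0 or total > 100:
        if total == 0 and random_id is None:
            # First pass: start with a random five-digit prefix.
            random_id = str(random.randrange(10000, 100000))
        elif total == 0:
            # No matches: slice a digit off, or start again if the prefix is already short.
            if len(random_id) >= 4:
                random_id = random_id[:-1]
            else:
                random_id = str(random.randrange(10000, 100000))
        if total > 100 and pos < 10:
            # Too many matches: add the next digit from the shuffled sequence.
            random_id = f'{random_id}{random_sequence[pos]}'
            pos += 1
        elif pos == 10:
            # Ten digits have been tried: start again with a new prefix.
            random_id = str(random.randrange(10000, 100000))
            pos = 0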
        params['q'] = f'id:{random_id}*'
        response = s.get(API_URL, params=params)
        data = response.json()
        total = int(data['response']['zone'][0]['records']['total'])
        # print(total)
        # print(response.url)
    return random.choice(data['response']['zone'][0]['records']['work'])

def get_random_work():
    zone = random.choice(['book', 'article', 'picture', 'map', 'music', 'collection'])
    work = get_random_work_from_zone(zone)
    return work

Get a random work from a random zone

In [19]:
get_random_work()
Out[19]:
{'id': '51635155',
 'url': '/work/51635155',
 'troveUrl': 'https://trove.nla.gov.au/work/51635155',
 'title': 'Prejudice: Its Psychology.(Book Review)(Brief Article)',
 'issued': 1996,
 'type': ['Article/Review', 'Article'],
 'isPartOf': 'CHOICE: Current Reviews for Academic Libraries',
 'holdingsCount': 0,
 'versionCount': 1,
 'relevance': {'score': '6.384695', 'value': 'very relevant'}}
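The result is a plain dictionary, and not every field is present in every record (the work above has no contributor, for example). Here's a small sketch of a helper for printing a one-line summary that tolerates missing fields. `summarise_work` is a hypothetical name, and the fallback labels are my own choices.

```python
def summarise_work(work):
    """Return a one-line summary of a work record, tolerating missing fields."""
    title = work.get('title', '[untitled]')
    # 'contributor' is a list when present, and absent from some records.
    creators = '; '.join(work.get('contributor', [])) or 'unknown contributor'
    issued = work.get('issued', 'n.d.')
    return f"{title} ({creators}, {issued}) {work.get('troveUrl', '')}".strip()
```

You could then call `summarise_work(get_random_work())` to get something more readable than the raw record.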

Get a random work from the picture zone

You can specify one of book, article, picture, map, music, or collection. For example:

In [20]:
get_random_work_from_zone('picture')
Out[20]:
{'id': '4058750',
 'url': '/work/4058750',
 'troveUrl': 'https://trove.nla.gov.au/work/4058750',
 'title': 'Peter Blasby',
 'contributor': ['Shorrock, Les'],
 'issued': 1982,
 'type': ['Photograph'],
 'holdingsCount': 1,
 'versionCount': 1,
 'relevance': {'score': '7.7999997', 'value': 'very relevant'},
 'identifier': [{'type': 'url',
   'linktype': 'fulltext',
   'value': 'http://hdl.handle.net/10536/DRO/DU:30016589'},
  {'type': 'url',
   'linktype': 'thumbnail',
   'value': 'http://dro.deakin.edu.au/eserv/DU:30016589/thumbnail_Peter_Blasby.jpg'}]}

Get a random thesis from the book zone

In [26]:
get_random_work_from_zone('book', work_format='Thesis')
Out[26]:
{'id': '6239233',
 'url': '/work/6239233',
 'troveUrl': 'https://trove.nla.gov.au/work/6239233',
 'title': 'A context-dependent approach to tense, mood and aspect in modern Greek / by Cornelia C. Paraskevas-Shepard',
 'contributor': ['Paraskevas-Shepard, Cornelia C'],
 'issued': '1987-1990',
 'type': ['Microform', 'Thesis', 'Book'],
 'holdingsCount': 2,
 'versionCount': 2,
 'relevance': {'score': '6.7510343', 'value': 'very relevant'}}

Speed test

In [14]:
%%timeit
get_random_work()
The slowest run took 4.89 times longer than the fastest. This could mean that an intermediate result is being cached.
359 ms ± 186 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)