Get a random work from Trove

Here's a way to get a random work from Trove's book, article, picture, map, music, or collection zones. It generates a random work id prefix and performs a wildcard search using the id index. If the prefix returns no results, a digit is sliced off the end to widen the search. If the prefix returns more than 100 results, a digit is added to the end to narrow it. This continues until the result set contains between 1 and 100 works, and one of those works is then chosen at random.
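The widening and narrowing steps can be tried offline before touching the API. The following is a minimal sketch only: `find_matches` and `random_id_by_prefix` are hypothetical names, the list of ids is made up, and a simple `startswith` check stands in for the wildcard search. It also adds a random digit when narrowing, which is a simplification of the shuffled digit sequence used in the function further down.

```python
import random

def find_matches(ids, prefix):
    """Stand-in for the wildcard search: all ids that start with the prefix."""
    return [i for i in ids if i.startswith(prefix)]

def random_id_by_prefix(ids, max_results=100, max_requests=1000, rng=random):
    """Pick a random id by widening or narrowing a random prefix."""
    prefix = str(rng.randrange(10000, 100000))
    for _ in range(max_requests):
        matches = find_matches(ids, prefix)
        if 0 < len(matches) <= max_results:
            # Between 1 and max_results hits: choose one of them
            return rng.choice(matches)
        if not matches:
            if len(prefix) >= 4:
                # No hits: slice a digit off the end to widen the search
                prefix = prefix[:-1]
            else:
                # Prefix is already short, so start again
                prefix = str(rng.randrange(10000, 100000))
        else:
            # Too many hits: add a digit to narrow the search
            prefix += str(rng.randrange(10))
    raise RuntimeError('No suitable prefix found')

# Made-up ids, sparsely scattered like real work ids might be
demo_rng = random.Random(42)
demo_ids = [str(n) for n in demo_rng.sample(range(1, 10_000_000), 5000)]
print(random_id_by_prefix(demo_ids, rng=demo_rng))
```

The `max_requests` cap is an addition for safety in this sketch, so that a very sparse pool of ids cannot keep the loop running forever.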

This method should also work with the format facet. However, the further you go down the format hierarchy, the smaller the slices, and therefore the harder it will be to match a work id. You should certainly be able to get random works with specific top-level formats without any drama, for example a random thesis from the book zone.

This method is probably not going to work for specific collections (i.e. with a NUC id), or in combination with other search queries. Basically, the more you limit the pool of potential resources, the harder it will be to match on random work ids. In that case you might want to try using facets.
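The facet approach isn't worked through here, but one possible reading is to ask for the facet counts of a limited search, pick a facet value at random, and then choose a record from within that smaller slice. The sketch below covers only the picking step, and it is an assumption rather than a tested recipe: `pick_facet_term` is a hypothetical helper and the decade counts are invented for illustration. Weighting by count means a value holding more records is proportionally more likely to be chosen.

```python
import random

def pick_facet_term(facet_counts, rng=random):
    """Choose one facet term, weighted by the number of records it holds."""
    terms = list(facet_counts)
    weights = [facet_counts[term] for term in terms]
    return rng.choices(terms, weights=weights, k=1)[0]

# Invented counts, for illustration only
decade_counts = {'1890': 120, '1900': 340, '1910': 40}
print(pick_facet_term(decade_counts))
```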

In [1]:
import requests
import random
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 5 times, with backoff, on connection errors or 502/503/504 responses
s = requests.Session()
retries = Retry(total=5, backoff_factor=1, status_forcelist=[502, 503, 504])
s.mount('https://', HTTPAdapter(max_retries=retries))
s.mount('http://', HTTPAdapter(max_retries=retries))
In [2]:
API_KEY = 'YOUR API KEY'
API_URL = 'http://api.trove.nla.gov.au/v2/result'
In [3]:
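def get_random_work_from_zone(zone, work_format=None):
    '''
    Get a random work from the given zone, optionally limited to a format.
    '''
    total = 0
    params = {
        'zone': zone,
        'encoding': 'json',
        'n': '100',
        'key': API_KEY
    }
    if work_format:
        params['l-format'] = work_format
    random_id = None
    # Digits to add, in random order, when a prefix needs to be narrowed
    random_sequence = list(range(0, 10))
    random.shuffle(random_sequence)
    pos = 0
    # Keep adjusting the prefix until it matches between 1 and 100 works
    while total == 0 or total > 100:
        if total == 0 and random_id is None:
            # First pass: start with a random five-digit prefix
            random_id = str(random.randrange(10000, 100000))
        elif total == 0:
            if len(random_id) >= 4:
                # No results: slice a digit off the end to widen the search
                random_id = random_id[:-1]
            else:
                # The prefix is already short, so start again with a new one
                random_id = str(random.randrange(10000, 100000))
        if total > 100 and pos < 10:
            # Too many results: add the next digit in the shuffled sequence
            random_id = f'{random_id}{random_sequence[pos]}'
            pos += 1
        elif pos == 10:
            # All ten digits have been used, so start again with a new prefix
            random_id = str(random.randrange(10000, 100000))
            pos = 0
        params['q'] = f'id:{random_id}*'
        response = s.get(API_URL, params=params)
        data = response.json()
        total = int(data['response']['zone'][0]['records']['total'])
        # print(total)
        # print(response.url)
    # All matches fit in a single page of 100, so choose one at random
    return random.choice(data['response']['zone'][0]['records']['work'])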

def get_random_work():
    zone = random.choice(['book', 'article', 'picture', 'map', 'music', 'collection'])
    work = get_random_work_from_zone(zone)
    return work

Get a random work from a random zone

In [4]:
get_random_work()
Out[4]:
{'id': '421149',
 'url': '/work/421149',
 'troveUrl': 'https://trove.nla.gov.au/work/421149',
 'title': 'The history of the 67th regiment Indiana infantry volunteers : war of the rebellion',
 'contributor': ['Scott, Reuben B. comp'],
 'issued': 1892,
 'type': ['Book'],
 'holdingsCount': 0,
 'versionCount': 1,
 'relevance': {'score': '6.0044403', 'value': 'very relevant'},
 'identifier': [{'type': 'url',
   'linktype': 'restricted',
   'linktext': 'Direct link to full text: http://openlibrary.org/details/historyof67threg00scot',
   'value': 'http://openlibrary.org/books/OL178491M'}]}
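
The result is a plain dictionary, so it's easy to pull out just the bits you want. Here's a small sketch that builds a one-line summary from a work record. Note that `summarise_work` is a hypothetical helper, not part of the code above, and it assumes only the fields visible in the sample output. It uses `.get()` because not every record seems to carry every field.

```python
def summarise_work(work):
    """Return a one-line summary of a work record (hypothetical helper)."""
    title = work.get('title', 'Untitled')
    issued = work.get('issued', 'n.d.')
    formats = ', '.join(work.get('type', []))
    url = work.get('troveUrl', '')
    return f'{title} ({issued}) [{formats}] {url}'.strip()

# Fields copied from the sample output above
sample = {
    'title': 'The history of the 67th regiment Indiana infantry volunteers : war of the rebellion',
    'issued': 1892,
    'type': ['Book'],
    'troveUrl': 'https://trove.nla.gov.au/work/421149',
}
print(summarise_work(sample))
```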

Get a random work from the picture zone

You can specify one of book, article, picture, map, music, or collection. For example:

In [5]:
get_random_work_from_zone('picture')
Out[5]:
{'id': '7245034',
 'url': '/work/7245034',
 'troveUrl': 'https://trove.nla.gov.au/work/7245034',
 'title': "The pilgrim's progress from this world to that which is to come. The second part : delivered under the similitude of a dream : wherein is set forth the manner of the setting out of Christian's wife and children, their dangerous journey, and safe arrival at the desired country / John Bunyan",
 'contributor': ['Bunyan, John, 1628-1688'],
 'issued': '1684-1986',
 'type': ['Book', 'Book/Illustrated', 'Photograph', 'Microform'],
 'holdingsCount': 4,
 'versionCount': 4,
 'relevance': {'score': '7.799999E-6', 'value': 'vaguely relevant'},
 'identifier': [{'type': 'url',
   'linktype': 'restricted',
   'value': 'http://gateway.proquest.com/openurl?ctx_ver=Z39.88-2003&res_id=xri:eebo&rft_val_fmt=&rft_id=xri:eebo:image:108572'},
  {'type': 'url',
   'linktype': 'thumbnail',
   'value': 'https://repository.monash.edu/files/thumbnails/e33d20c3f7436f29a420eed4a072f600.jpg'}]}

Get a random thesis from the book zone

In [6]:
get_random_work_from_zone('book', work_format='Thesis')
Out[6]:
{'id': '5874780',
 'url': '/work/5874780',
 'troveUrl': 'https://trove.nla.gov.au/work/5874780',
 'title': 'A study of certain components contributing to knowledge and understanding in physical education of fifth grade boys / Barry W. Clark',
 'contributor': ['Clark, Barry W'],
 'issued': 1975,
 'type': ['Thesis'],
 'holdingsCount': 0,
 'versionCount': 1,
 'relevance': {'score': '6.0044403', 'value': 'very relevant'}}

Speed test

In [7]:
%%timeit
get_random_work()
The slowest run took 4.35 times longer than the fastest. This could mean that an intermediate result is being cached.
537 ms ± 223 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Created by Tim Sherratt for the GLAM Workbench.