Getting higher resolution versions of photos from the State Library of South Australia online collection interface

The State Library of South Australia makes a fabulous collection of out-of-copyright photographs available online. However, while you can zoom in using their collection interface to examine the details of many of these images, the download option seems to provide copies at a much lower resolution. This limits their usefulness for many types of research.

This notebook simply takes the tiled versions of the images which are displayed in the collection interface and stitches them together to create higher resolution versions.

For example, the version of this photograph of Clement Wragge provided by the 'Download' button is 1024 × 787 pixels. The version created by this notebook is 5785 × 4337 pixels.
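
To put those two sizes in perspective, the short calculation below compares them. It only uses the pixel dimensions quoted above; the helper name `megapixels` is made up for this illustration.

```python
# Pixel dimensions quoted in the text above
download_size = (1024, 787)    # version from the 'Download' button
stitched_size = (5785, 4337)   # version created by this notebook

def megapixels(size):
    """Return the total pixel count of a (width, height) pair in megapixels."""
    width, height = size
    return width * height / 1_000_000

# How much wider the stitched image is, and how many more pixels it holds
linear_gain = stitched_size[0] / download_size[0]
pixel_gain = megapixels(stitched_size) / megapixels(download_size)

print('Download button: {:.1f} MP'.format(megapixels(download_size)))
print('Stitched tiles:  {:.1f} MP'.format(megapixels(stitched_size)))
print('About {:.1f}x wider, about {:.0f}x as many pixels'.format(linear_gain, pixel_gain))
```

In other words, the stitched version is between five and six times wider and holds roughly 31 times as many pixels.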

Note that images available for download from the SLSA's digital collections seem to be at a much higher resolution, so they don't need any special tricks to use.

Setting things up

Run these cells using Shift+Enter to get the code ready to use.

In [67]:
import requests
from PIL import Image
from io import BytesIO
from slugify import slugify
import re
from IPython.display import display, HTML, FileLink
In [68]:
def get_json(url):
    '''
    Get the JSON file that includes information about the zoom levels and tiles.
    '''
    json_url = '{}/{}'.format(url.rstrip('/'), 'tiles.json')
    response = requests.get(json_url)
    # Stop with a clear HTTP error if the tile data can't be retrieved
    response.raise_for_status()
    data = response.json()
    return data

def get_highest_level(data):
    '''
    Find the highest level of zoom -- ie the biggest version of the image -- in the JSON data.
    '''
    for level in data['levels']:
        if level['name'] == 'z0':
            return level
    # Fail with a clear message instead of an unbound variable error
    raise ValueError("No 'z0' zoom level found in the tile data")

def download_image(url):
    '''
    Provide the URL of a digitised photo, and get back the largest possible version for download.
    Gets information about available zoom levels and tiles, then stitches the tiles together.
    '''
    # Get data about levels
    data = get_json(url)
    # Get the highest zoom level
    level = get_highest_level(data)
    # Dimensions of the biggest image
    w = level['width']
    h = level['height']
    # Create an empty image to paste the tiles into
    img = Image.new('RGB', (w, h))
    # Loop through all the tiles
    for index, tile in enumerate(level['tiles']):
        # Get a tile and open as an image
        response = requests.get(tile['url'])
        tile_img = Image.open(BytesIO(response.content))
        # When we've got the first tile, grab the height and width
        if index == 0:
            tile_w, tile_h = tile_img.size
        # The tile data includes an x and y index value indicating the position of the tile
        # To calculate its coordinates, just multiply the index by the width/height
        x = tile['x'] * tile_w
        y = tile['y'] * tile_h
        # Paste the tile into the big image using the x/y coords to define the top left corner
        img.paste(tile_img, box=(x, y))
    # Extract the image ID from the URL (done once, outside the tile loop)
    image_id = re.search(r'resource\/(.*)', url).group(1)
    # Create file name that includes the image ID info
    image_name = 'slsa-{}.jpg'.format(slugify(image_id))
    # Save and display the image
    img.save(image_name)
    display(FileLink(image_name))
    display(HTML('<img src="{}">'.format(image_name)))
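
The placement logic in `download_image` can be tried out without any network access. The sketch below uses a made-up level dictionary with the same keys the function reads (`width`, `height`, and `tiles` carrying `x` and `y` index values). The values, the 256-pixel tile size, and the helper name `tile_boxes` are assumptions for illustration only and are not taken from the SLSA service.

```python
def tile_boxes(level, tile_w, tile_h):
    """
    Work out where each tile lands on the full-size canvas.
    Returns a list of (left, top, right, bottom) boxes, clipped to the canvas
    so that partial tiles on the right and bottom edges are handled.
    """
    boxes = []
    for tile in level['tiles']:
        # Same calculation as download_image(): index multiplied by tile size
        left = tile['x'] * tile_w
        top = tile['y'] * tile_h
        right = min(left + tile_w, level['width'])
        bottom = min(top + tile_h, level['height'])
        boxes.append((left, top, right, bottom))
    return boxes

# Hypothetical data: a 600 x 400 image cut into 256-pixel tiles (3 columns x 2 rows)
sample_level = {
    'name': 'z0',
    'width': 600,
    'height': 400,
    'tiles': [{'x': x, 'y': y} for y in range(2) for x in range(3)],
}

for tile, box in zip(sample_level['tiles'], tile_boxes(sample_level, 256, 256)):
    print(tile, '->', box)
```

If edge tiles are narrower or shorter than the rest, as in this made-up example, the top-left corner of every tile is still a multiple of the full tile size. That is why the function measures the tile size once and reuses it, which works as long as the first tile it receives is a full-size one.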

Supply the URL of the photo

Just paste the URL of the photo you want to download between the quotes in the cell below and run the cell using Shift+Enter. Once it has been created, the final image will be displayed below with a link for easy download.

In [69]:
download_image('https://collections.slsa.sa.gov.au/resource/B+43122')
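
If a URL doesn't work, it can help to check what the functions above derive from it. This sketch repeats the two string operations used in `get_json` and `download_image` without making any requests. The `if match else None` guard is an addition for illustration; the notebook's function assumes the URL always contains `resource/`.

```python
import re

photo_url = 'https://collections.slsa.sa.gov.au/resource/B+43122'

# Same construction as get_json(): strip any trailing slash, then add tiles.json
json_url = '{}/{}'.format(photo_url.rstrip('/'), 'tiles.json')

# Same idea as download_image(): everything after 'resource/' is the image ID
match = re.search(r'resource/(.*)', photo_url)
image_id = match.group(1) if match else None

print(json_url)
print(image_id)
```

A URL copied from a different kind of page, one without `resource/` in it, would give `None` here, and would cause `download_image` to fail when it tries to build the file name.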