The State Library of South Australia makes a fabulous collection of out-of-copyright photographs available online. However, while the collection interface lets you zoom in to examine the details of many of these images, the download option seems to provide copies at a much lower resolution. This limits their usefulness for many types of research.
This notebook simply takes the tiled versions of the images which are displayed in the collection interface and stitches them together to create higher resolution versions.
For example, the version of this photograph of Clement Wragge provided by the 'Download' button is 1024 × 787 pixels. The version created by this notebook is 5785 × 4337 pixels.
Note that images available for download from the SLSA's digital collections seem to be at a much higher resolution, so they don't need any special tricks to use.
Run these cells using Shift+Enter to get the code ready to use.
import requests
from PIL import Image
from io import BytesIO
from slugify import slugify
import re
from IPython.display import display, HTML, FileLink
def get_json(url):
    '''
    Get the JSON file that includes information about the zoom levels and tiles.
    '''
    json_url = '{}/{}'.format(url.rstrip('/'), 'tiles.json')
    response = requests.get(json_url)
    # Stop early with a clear HTTP error rather than failing later on bad JSON
    response.raise_for_status()
    data = response.json()
    return data
def get_highest_level(data):
    '''
    Find the highest level of zoom (i.e. the biggest version of the image) in the JSON data.
    '''
    for level in data['levels']:
        if level['name'] == 'z0':
            return level
    # Without this, a missing 'z0' level would surface as an unbound variable error
    raise ValueError("No zoom level named 'z0' found in the tile data")
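The helper functions in this notebook read only a handful of fields from `tiles.json`. As a rough illustration, the sketch below builds a hand-written sample with the same field names (`levels`, `name`, `width`, `height`, `tiles`, `x`, `y`, `url`) and looks up a level by name. The values, the `example.org` URLs, the extra `z1` level, and the `find_level` helper are all assumptions made up for this example; the real file may contain other fields.

```python
# Hypothetical, hand-written sample: the field names mirror what the notebook's
# code reads, but the values and URLs are invented for illustration.
sample_data = {
    'levels': [
        {
            'name': 'z1',
            'width': 2893,
            'height': 2169,
            'tiles': [],
        },
        {
            'name': 'z0',
            'width': 5785,
            'height': 4337,
            'tiles': [
                {'x': 0, 'y': 0, 'url': 'https://example.org/tiles/z0/0_0.jpg'},
                {'x': 1, 'y': 0, 'url': 'https://example.org/tiles/z0/1_0.jpg'},
            ],
        },
    ]
}


def find_level(data, name='z0'):
    """Return the level with the given name, or None if it is missing."""
    return next((level for level in data['levels'] if level['name'] == name), None)


level = find_level(sample_data)
print(level['width'], level['height'], len(level['tiles']))
```

Returning `None` for a missing level is just one option; raising an error would make the failure more visible when a record has no tiles.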
def download_image(url):
    '''
    Provide the URL of a digitised photo, and get back the largest possible version for download.
    Gets information about available zoom levels and tiles, then stitches the tiles together.
    '''
    # Get data about levels
    data = get_json(url)
    # Get the highest zoom level
    level = get_highest_level(data)
    # Dimensions of the biggest image
    w = level['width']
    h = level['height']
    # Create an empty image to paste the tiles into
    img = Image.new('RGB', (w, h))
    # Loop through all the tiles
    for index, tile in enumerate(level['tiles']):
        # Get a tile and open as an image
        response = requests.get(tile['url'])
        tile_img = Image.open(BytesIO(response.content))
        # When we've got the first tile, grab the height and width
        # (this assumes the first tile listed is a full-sized one, not a smaller edge tile)
        if index == 0:
            tile_w, tile_h = tile_img.size
        # The tile data includes an x and y index value indicating the position of the tile
        # To calculate its coordinates, just multiply the index by the width/height
        x = tile['x'] * tile_w
        y = tile['y'] * tile_h
        # Paste the tile into the big image using the x/y coords to define the top left corner
        img.paste(tile_img, box=(x, y))
    # Extract the image ID from the URL (named image_id to avoid shadowing the built-in id)
    image_id = re.search(r'resource/(.*)', url).group(1)
    # Create file name that includes the image ID info
    image_name = 'slsa-{}.jpg'.format(slugify(image_id))
    # Save and display the image
    img.save(image_name)
    display(FileLink(image_name))
    display(HTML('<img src="{}">'.format(image_name)))
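To make the coordinate arithmetic used when pasting tiles concrete, here is a minimal sketch that works out where each tile would land for an image of a given size. It needs no network access or image library. The 256-pixel tile size and the `tile_boxes` helper are assumptions for illustration only; the notebook itself reads the tile size from the first downloaded tile.

```python
import math


def tile_boxes(width, height, tile_w, tile_h):
    """Yield (x_index, y_index, left, top, right, bottom) for every tile in the grid."""
    cols = math.ceil(width / tile_w)
    rows = math.ceil(height / tile_h)
    for y_index in range(rows):
        for x_index in range(cols):
            # Top-left corner: tile index multiplied by the full tile size
            left = x_index * tile_w
            top = y_index * tile_h
            # Tiles on the right and bottom edges are clipped to the image bounds
            right = min(left + tile_w, width)
            bottom = min(top + tile_h, height)
            yield x_index, y_index, left, top, right, bottom


# 5785 x 4337 is the stitched size quoted earlier; 256 is an assumed tile size
boxes = list(tile_boxes(5785, 4337, 256, 256))
print(len(boxes))   # 23 columns x 17 rows = 391
print(boxes[0])     # (0, 0, 0, 0, 256, 256)
print(boxes[-1])    # (22, 16, 5632, 4096, 5785, 4337)
```

The last box shows why the tile size should be taken from a full-sized tile: edge tiles can be narrower or shorter, so multiplying an index by an edge tile's dimensions would put tiles in the wrong place.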
Paste the URL of the photo you want to download between the quotes in the cell below and run the cell using Shift+Enter. Once the image has been created, it will be displayed below with a link for easy download.
download_image('https://collections.slsa.sa.gov.au/resource/B+43122')