
Downloading SmugMug Captions with Python and Jupyter¶

Prerequisites¶

This notebook assumes you have set up your environment to use smugpyter.py. Refer to this notebook for details on how to do this.

Getting Ready to use the SmugMug API with Python and Jupyter

Why am I doing this?¶

My photo captions have evolved into a form of milliblogging. Milliposts (milliblog posts) are terse and tiny; many are single sentences or paragraphs. Taken one at a time, milliposts seldom impress, but when gathered in hundreds or thousands, accidental epics emerge. So, to prevent "epic loss," I want a simple way of downloading and archiving my captions off-line.

If you don't control it you cannot trust it!¶

When I started blogging I knew that you could not depend on blogging websites to archive and preserve your documents. We had already seen cases of websites mangling content, shutting down without warning, and even worse, censoring bloggers. It was a classic case of, "If you don't control it you cannot trust it." I resolved to maintain complete off-line version controlled copies of my blog posts.

Maintaining off-line copies was made easier by WordPress.com's excellent blog export utility. A simple button push downloads a large XML file that contains all your blog posts with embedded references to images and other inclusions. XML is not my preferred archive format. I am a huge fan of LaTeX and Markdown: two text formats that are directly supported in Jupyter Notebooks. I wrote a little system that parses the WordPress XML file and generates LaTeX and Markdown files. Yet, despite milliblogging long before blogging, I don't have a similar system for downloading and archiving SmugMug metadata. This Jupyter notebook addresses this omission and shows how you can use Python and the SmugMug API to extract gallery and image metadata and store it in version controlled local directories as CSV files.

smugpyter.py runs in the Python 3.6 to 3.9/Jupyter/Win64 environment.¶

A lot of the code in this notebook was derived from:

The originals did not run in the Python 3.6/Jupyter/Win64 environment and lacked some of the facilities I wanted so I adjusted, tweaked and modified the scripts. The result is incompatible with the originals so I renamed the main class SmugPyter to avoid confusion. Finally, being new to Python and Jupyter, I used the 2to3 tool to help make the changes.

In [ ]:

import os
import sys
import requests
sys.path.append(r'C:\mp\jupyter\smugpyter\notebooks')
import smugpyter

help(smugpyter)

Create the SmugPyter configuration file.¶

The SmugPyter class constructor reads a config file. If this file is missing you cannot create instances of the SmugPyter class or connect to your SmugMug account.

In [ ]:

# the SmugPyter class constructor reads a config file in this location.
os.path.join(os.path.expanduser("~"), '.smugpyter.cfg')

The following prompts for your SmugMug API keys. You can apply for SmugMug keys on your SmugMug account by browsing to the API KEYS section of your account settings.

# code from https://github.com/speedenator/smuploader/blob/master/bin/smregister
# modified for python 3.6/jupyter environment - modifications assisted by 2to3 tool
# NOTE: this code cell has been turned into Raw NBConvert to prevent accidental execution
# if you need to run this code turn this cell back into code.

from rauth.service import OAuth1Service
import requests
import http.client
import httplib2
import hashlib
import urllib.request, urllib.parse, urllib.error
import time
import sys
import os
import json
import configparser
import re
import shutil

# depends on previously run cells
#from smuploader import SmugMug

def write_config(configfile, params):
    # save the (key, value) pairs in the SMUGMUG section of the config file
    config = configparser.ConfigParser()
    config.add_section('SMUGMUG')
    for key, value in params:
        config.set('SMUGMUG', key, value)
    with open(configfile, 'w') as f:
        config.write(f)

if __name__ == '__main__':
    print("\n\n\n#######################################################")
    print("## Welcome! ")
    print("## We are going to go through some steps to set up this SmugMug photo manager and make it connect to the API.")
    print("## Step 0: What is your SmugMug username?")
    username = input("Username: ")
    print('## Step 1: Enter your local directory, e.g. c:/SmugMirror/')
    localdir = input("Directory: ")
    print("## Step 2: Go to https://api.smugmug.com/api/developer/apply and apply for an API key.")
    print("## This gives you unique identifiers for connecting to SmugMug.")
    print("## When done, you can find the API keys in your SmugMug profile.")
    print("## Account Settings -> Me -> API Keys")
    print(("## Enter them here and they will be saved to the config file (" + SmugMug.smugmug_config + ") for later use."))
    consumer_key = input("Key: ")
    consumer_secret = input("Secret: ")

    # first pass: save the API keys with empty access tokens
    write_config(SmugMug.smugmug_config, [("username", username),
                                          ("consumer_key", consumer_key),
                                          ("consumer_secret", consumer_secret),
                                          ("access_token", ''),
                                          ("access_token_secret", '')])

    smugmug = SmugMug()
    authorize_url = smugmug.get_authorize_url()
    print(("## Step 3: Visit this address in your browser to authenticate your new keys to access your SmugMug account: \n## " + authorize_url))
    print("## After that, enter the 6-digit key that SmugMug provided")
    verifier = input("6-digit key: ")
    access_token, access_token_secret = smugmug.get_access_token(verifier)

    # second pass: save the API keys together with the access tokens
    write_config(SmugMug.smugmug_config, [("username", username),
                                          ("consumer_key", consumer_key),
                                          ("consumer_secret", consumer_secret),
                                          ("access_token", access_token),
                                          ("access_token_secret", access_token_secret)])
    print("## Great! All done!")
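
Once the registration code has run, it can be handy to check that the config file really holds everything it should. The following is only a sketch: it assumes the file lives at ~/.smugpyter.cfg (the path computed earlier) and uses the SMUGMUG section and the five key names written by write_config. The helper name missing_config_keys is made up for this illustration and is not part of smugpyter.py.

```python
import configparser
import os

# Assumed location, matching the path computed earlier in this notebook.
CONFIG_PATH = os.path.join(os.path.expanduser("~"), ".smugpyter.cfg")

# Key names taken from the write_config calls in the registration code.
EXPECTED_KEYS = (
    "username",
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def missing_config_keys(path=CONFIG_PATH, section="SMUGMUG"):
    """Return the expected keys that are absent or empty in the config file.

    Hypothetical helper for illustration; not part of smugpyter.py.
    """
    # interpolation=None so a '%' inside a secret is read back literally
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)  # a missing file simply yields no sections
    if not config.has_section(section):
        return list(EXPECTED_KEYS)
    return [
        key
        for key in EXPECTED_KEYS
        if not config.get(section, key, fallback="").strip()
    ]


print(missing_config_keys())
```

An empty list suggests the registration completed. If access_token and access_token_secret are reported, the API keys were probably saved but the authorization step was not finished.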

Try out the SmugPyter class with credentials saved in the previous cell.¶

In [ ]:

smugmug = smugpyter.SmugPyter()
len(smugmug.get_album_names())

In [ ]:

help(smugmug)

In [ ]:

smugmug.get_albums()

In [ ]:

smugmug.get_folders()

In [ ]:

caught_my_eye = smugmug.get_album_id('Caught My Eye')
forebearers = smugmug.get_album_id('Great and Greater Forebearers')
idaho_instants = smugmug.get_album_id("Idaho Instants")
cell_phoning = smugmug.get_album_id("Cell Phoning It In")
[caught_my_eye, forebearers, idaho_instants, cell_phoning]

In [ ]:

smugmug.get_album_info(caught_my_eye)

In [ ]:

album_images = smugmug.get_album_images(forebearers)
len(album_images)

In [ ]:

album_captions = smugmug.get_album_image_captions(album_images)
album_latitude_longitude = smugmug.get_latitude_longitude_altitude(album_images)
album_latitude_longitude

In [ ]:

album_real_dates = smugmug.get_album_image_real_dates(album_images)
album_real_dates

Try out other "unsupported" version 2.0 API calls. Documentation for SmugMug Version 2.0 API calls is best obtained by hacking with SmugMug's live API browser tool at:

https://api.smugmug.com/api/v2

The live API tool is far more useful if you log into your SmugMug account and point at your own images.
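
For poking at version 2.0 endpoints outside the live browser, a request can be assembled by hand. This is a rough sketch under assumptions, not something taken from smugpyter.py: it assumes that public endpoints accept the consumer key as an APIKey query parameter and return JSON when asked with an Accept header, and the user/your-nickname path and YOUR_API_KEY value are placeholders. Confirm the endpoint paths against the live API browser. The helper build_v2_request is hypothetical, and the sketch only builds the request without sending it.

```python
import urllib.parse
import urllib.request

API_ROOT = "https://api.smugmug.com/api/v2"


def build_v2_request(path, api_key):
    """Build (but do not send) a GET request for a v2 endpoint.

    Hypothetical helper; the APIKey parameter and Accept header are
    assumptions about how public v2 endpoints are called.
    """
    query = urllib.parse.urlencode({"APIKey": api_key})
    url = f"{API_ROOT}/{path.lstrip('/')}?{query}"
    return urllib.request.Request(url, headers={"Accept": "application/json"})


# Placeholder nickname and key; substitute your own before sending.
request = build_v2_request("user/your-nickname", "YOUR_API_KEY")
print(request.full_url)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(response.read()[:200])
```

Private data still needs the OAuth tokens saved in the config file, so a key-only request like this is mainly useful for browsing public galleries.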

Walk SmugMug folders and albums and download coveted metadata.¶

The cell below calls download_smugmug_mirror, the main function that walks SmugMug folders and writes metadata to local directories. Metadata is saved in TAB delimited CSV manifest files, also called TSV files. The function writes one file per album. If local directories do not exist they are created. If manifest files already exist they are overwritten. The entire SmugMug tree is walked, so you might want to adjust where the walk starts if you have hundreds or thousands of albums.

Manifest files follow this naming convention.

manifest-<deblanked-album-name>-<smugmug-album-key>-<key-case-mask>.txt

Here are some examples:

manifest-ZambiaEclipseTrip-k65QRs-6.txt
manifest-FromHazelsAlbums-FZK4j4-1k.txt
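
The naming convention can be taken apart again when other notebooks need the album key. Here is a small sketch that assumes the album key and the case mask never contain hyphens, so the name is split from the right. The function parse_manifest_name is invented for this example and does not exist in smugpyter.py.

```python
def parse_manifest_name(filename):
    """Split a manifest file name into (album_name, album_key, case_mask).

    Hypothetical helper; assumes the key and mask contain no hyphens.
    """
    prefix, suffix = "manifest-", ".txt"
    if not (filename.startswith(prefix) and filename.endswith(suffix)):
        raise ValueError(f"not a manifest file name: {filename!r}")
    core = filename[len(prefix):-len(suffix)]
    # split from the right so hyphens inside the album name survive
    parts = core.rsplit("-", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected manifest file name: {filename!r}")
    album_name, album_key, case_mask = parts
    return album_name, album_key, case_mask


print(parse_manifest_name("manifest-ZambiaEclipseTrip-k65QRs-6.txt"))
# → ('ZambiaEclipseTrip', 'k65QRs', '6')
```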

In [ ]:

smugmug = smugpyter.SmugPyter()

smugmug.download_smugmug_mirror(func_album=smugmug.write_album_manifest)
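
After the walk finishes, the manifests can be read back with nothing but the standard library. The sketch below assumes each manifest starts with a header row, is UTF-8 encoded, and uses TAB as the delimiter; check one of your own files before relying on that. The directory c:/SmugMirror is the example path from the registration prompt, and find_manifests and read_manifest are hypothetical helpers, not smugpyter.py functions.

```python
import csv
from pathlib import Path


def find_manifests(root):
    """Return every manifest-*.txt file below root, sorted by path."""
    return sorted(Path(root).rglob("manifest-*.txt"))


def read_manifest(path):
    """Read one TAB delimited manifest into a list of row dictionaries.

    Assumes the first line is a header row and the file is UTF-8.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


if __name__ == "__main__":
    mirror_root = Path("c:/SmugMirror")  # hypothetical local directory
    if mirror_root.is_dir():
        for manifest in find_manifests(mirror_root):
            rows = read_manifest(manifest)
            print(manifest.name, len(rows))
```

Printing the row count per album is a quick sanity check that the walk wrote what you expected before the manifests go into version control.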

Next on the Agenda!¶

Remember that "no good deed goes unpunished." Well, running code will be "enhanced" whether it's necessary or not. Now that I have local directories that contain relevant SmugMug metadata in an easily consumed CSV form, other notebooks will use these directories to generate what I call "long duration documents." My preferred long duration sources are Markdown and LaTeX. Both of these text formats will be readable for centuries if printed on high quality acid free paper and stored in numerous "secure and undisclosed locations."

Remember, always Analyze the Data not the Drivel.

John Baker, Meridian Idaho, January 31, 2022
