Setting SmugMug Print Size and Geotag Keywords with Jupyter and Python

Prerequisites

This notebook assumes you have set up your environment to use smugpyter.py. Refer to this notebook for details on how to do this.

Getting Ready to use the SmugMug API with Python and Jupyter

Why am I doing this?

Many years ago I wrote a little J verb smugprintsizes that computed the largest standard SmugMug print sizes when given image dimensions and the desired DPI. I used the output of this verb to set aspect ratio keywords for my SmugMug pictures until changes to SmugMug, particularly the introduction of OAuth authentication, broke my little SmugMug API application that called smugprintsizes.

My print size keyword setter broke years ago but many of these keys still show up in my "top hundred" keywords.

10x15 4x5 4x6 5x5 5x6.7 5x7 ...

Print size keywords were very handy. They made it easy to select paper sizes for one or hundreds of pictures. This notebook will use the SmugMug API and Python to compute and set print size keywords.

The Print Sizes Table

smugprintsizes made use of the following table.

 ┌─────┬─────────┬──────────────┐
 │0.7  │17.5 70  │3.5x5 7x10    │
 ├─────┼─────────┼──────────────┤
 │0.8  │20 80    │4x5 8x10      │
 ├─────┼─────────┼──────────────┤
 │0.755│21.2 84.8│4x5.3 8x10.6  │
 ├─────┼─────────┼──────────────┤
 │0.665│24 96    │4x6 8x12      │
 ├─────┼─────────┼──────────────┤
 │0.5  │32 50 128│4x8 5x10 8x16 │
 ├─────┼─────────┼──────────────┤
 │1    │25 64 100│5x5 8x8 10x10 │
 ├─────┼─────────┼──────────────┤
 │0.745│33.5     │5x6.7         │
 ├─────┼─────────┼──────────────┤
 │0.715│35       │5x7           │
 ├─────┼─────────┼──────────────┤
 │0.165│150      │5x30          │
 ├─────┼─────────┼──────────────┤
 │0.4  │160      │8x20          │
 ├─────┼─────────┼──────────────┤
 │0.775│93.5     │8.5x11        │
 ├─────┼─────────┼──────────────┤
 │0.75 │108      │9x12          │
 ├─────┼─────────┼──────────────┤
 │0.77 │130      │10x13         │
 └─────┴─────────┴──────────────┘

The first column is the Short/Long image aspect ratio rounded to 0.005. The middle column lists areas in square inches of the corresponding print sizes in the last column.

This table uses inches but the algorithm doesn't care about units. You can easily use metric values.
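
As a small illustration of the unit independence, the sketch below computes the same print area in square inches and in square centimetres for an image of 2389 x 3344 pixels. The helper name `print_area` and the constant `CM_PER_INCH` are made up for this example; only the arithmetic comes from the text.

```python
CM_PER_INCH = 2.54

def print_area(height_px, width_px, dots_per_unit):
    """Print area in squared units for a resolution given in dots per unit."""
    return (height_px * width_px) / dots_per_unit ** 2

# the same image measured at 360 dots per inch and the equivalent dots per cm
area_in2 = print_area(2389, 3344, 360)                 # square inches
area_cm2 = print_area(2389, 3344, 360 / CM_PER_INCH)   # square centimetres
print(round(area_in2, 1), round(area_cm2, 1))
```

A metric table would list areas in square centimetres and sizes such as 13x18; the aspect ratio column would not change because ratios have no units.
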

Finding the largest DPI dependent print size is a simple matter of:

  1. Divide the short image dimension by the long image dimension and round to 0.005. This is the aspect ratio.

  2. Search for an aspect ratio match in the first column. Many images will not match. Quit and return 0z1 for no aspect match. The 0zN codes are similar to the NxM print size codes. This will be important in later notebooks.

  3. If a match is found compute the print area required for a given DPI and round to 0.5.

  4. Find the largest area in the second column that is less than or equal to the area computed in the previous step. If there are not enough pixels no area will meet this criterion. Quit and return 0z0 for not enough pixels.

  5. If an area is found select and return the corresponding print size in the last column. Finally, if the DPI area exceeds all areas for an aspect ratio return the largest print size.
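
The five steps can be sketched in a few lines. This is only an illustration under stated assumptions: `TABLE` is a hypothetical, abbreviated copy of three table rows, ratios are stored as integer counts of 0.005 steps to avoid comparing floats, and `largest_print_size` is an illustrative name, not the function built later in this notebook.

```python
from bisect import bisect_right

# Hypothetical, abbreviated table: ratio in 0.005 steps -> [(area, size), ...]
# Each list is sorted by ascending area.
TABLE = {
    133: [(24.0, '4x6'), (96.0, '8x12')],                     # 0.665
    143: [(35.0, '5x7')],                                     # 0.715
    200: [(25.0, '5x5'), (64.0, '8x8'), (100.0, '10x10')],    # 1.0
}

def largest_print_size(height, width, dpi=360):
    short, long_ = sorted((height, width))
    ratio_key = int(short / long_ / 0.005 + 0.5)               # step 1
    sizes = TABLE.get(ratio_key)                               # step 2
    if sizes is None:
        return '0z1'                                           # no aspect match
    area = int(height * width / dpi ** 2 / 0.5 + 0.5) * 0.5    # step 3
    areas = [a for a, _ in sizes]
    count = bisect_right(areas, area)                          # step 4
    if count == 0:
        return '0z0'                                           # not enough pixels
    return sizes[count - 1][1]                                 # step 5

print(largest_print_size(2389, 3344))           # 5x7
print(largest_print_size(2389, 3344, dpi=720))  # 0z0
print(largest_print_size(3255, 4119))           # 0z1
```

`bisect_right` counts the table areas that are less than or equal to the image area, so the last of those is the largest print that fits, and an area beyond the whole list falls through to the largest size.
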

An image with dimensions of 2389 x 3344 has enough pixels to make a standard 5x7 inch 360 DPI print. It does not have enough pixels to make a 5x7 inch 720 DPI print.

Print resolution is a hot button issue for photographers. How many dots (DPI) or pixels (PPI) are required depends on many factors: viewing distance, illumination, image colors, paper gloss, and so on. Human vision tests have demonstrated that young people with excellent eyesight can tell the difference between 500 DPI and 600 DPI prints. Resolutions beyond 600 DPI are mostly wasted unless you are using loupes or microscopes. According to Dr. Optoglass:

If the average reading distance is 1 foot (12 inches = 305 mm), p @0.4 arc minute is 35.5 microns or about 720 ppi/dpi. p @1 arc minute is 89 microns or about 300 dpi/ppi. This is why magazines are printed at 300 dpi – it’s good enough for most people. Fine art printers aim for 720, and that’s the best it need be. Very few people stick their heads closer than 1 foot away from a painting or photograph.
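
The arithmetic behind those figures can be checked with a little trigonometry: the width subtended by a visual angle at a viewing distance is the distance times the tangent of the angle. The sketch below assumes the 305 mm reading distance from the quote; the function name `resolvable_pitch_microns` is made up for this example.

```python
import math

MICRONS_PER_INCH = 25400.0

def resolvable_pitch_microns(distance_mm, arc_minutes):
    """Width subtended by a visual angle at a viewing distance, in microns."""
    angle = math.radians(arc_minutes / 60.0)
    return distance_mm * math.tan(angle) * 1000.0

for arc_minutes in (1.0, 0.4):
    pitch = resolvable_pitch_microns(305.0, arc_minutes)
    print('%.1f arc minute: %.1f microns, about %.0f ppi'
          % (arc_minutes, pitch, MICRONS_PER_INCH / pitch))
```

This gives roughly 286 ppi at 1 arc minute and 716 ppi at 0.4 arc minute, in line with the rounded 300 and 720 figures quoted above.
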

Digital printers complicate DPI issues by applying sophisticated resizing algorithms that can turn low resolution originals into plausible higher resolution copies. I've found that 360 DPI is a good starting point for SmugMug prints. For exceptional images you can simply divide the 360 DPI image dimensions by two for 720 DPI printing.

Computing DPI Dependent Print Area

The use of the print size table is clear with the exception of computing the print area required for a given DPI. dpi_area computes DPI dependent print area.

In [1]:
def round_to(n, precision):
    correction = 0.5 if n >= 0 else -0.5
    return int( n/precision+correction ) * precision

def aspect_ratio(height, width, *, precision=0.005):
    return round_to( min(height, width) / max(height, width), precision )

def dpi_area(height, width, *, dpi=360, precision=0.5):
    return round_to( (height * width) / dpi ** 2, precision )

# image pixel dimensions - order is immaterial
height, width = 2389 , 3344

print('aspect ratio %s' % aspect_ratio(height, width))
print('area at 360 dpi %s' % dpi_area(height, width))
print('area at 720 dpi %s' % dpi_area(height, width, dpi=720))
aspect ratio 0.715
area at 360 dpi 61.5
area at 720 dpi 15.5

Representing the Print Size table

There are many ways to encode the print size table. I am starting with the simplest possible representation: three lists, one for each column.

The lists must have the same number of items. Eventually, these details will be hidden within a SmugPyter subclass that manages the details of creating and using print size tables. For now let's build the lists from a simple string.

In [2]:
import smugpyter
smugmug = smugpyter.SmugPyter()
In [3]:
# list of all known small to medium SmugMug print sizes
smug_print_sizes = """
 3.5x5  4x5    4x5.3  4x6    4x8    
 5x5    5x6.7  5x7    5x10   5x30   
 7x10   8x8    8x10   8x10.6 8x12   
 8x16   8x20   8.5x11 9x12   10x10  
 10x13  10x15  10x16  10x20  10x30  
 11x14  11x16  11x28  12x12  12x18  
 12x20  12x24  12x30  16x20  16x24  
 18x24  20x20  20x24  20x30 
"""

# clean up the usual suspects
smug_print_sizes = smugmug.purify_smugmug_text(smug_print_sizes).split()
print(smug_print_sizes)
['3.5x5', '4x5', '4x5.3', '4x6', '4x8', '5x5', '5x6.7', '5x7', '5x10', '5x30', '7x10', '8x8', '8x10', '8x10.6', '8x12', '8x16', '8x20', '8.5x11', '9x12', '10x10', '10x13', '10x15', '10x16', '10x20', '10x30', '11x14', '11x16', '11x28', '12x12', '12x18', '12x20', '12x24', '12x30', '16x20', '16x24', '18x24', '20x20', '20x24', '20x30']
In [4]:
all_aspect_ratios = []
all_print_areas = []

for size in smug_print_sizes:
    height , width = size.split('x')
    height = float(height) 
    width = float(width)
    ratio = aspect_ratio(height, width)
    area = height * width
    all_aspect_ratios.append(ratio)
    all_print_areas.append(area)
    
aspect_ratios = list(set(all_aspect_ratios))
print(aspect_ratios)
print(all_print_areas)
[0.7000000000000001, 0.8, 0.755, 0.665, 0.5, 1.0, 0.745, 0.715, 0.165, 0.4, 0.775, 0.75, 0.625, 0.335, 0.6900000000000001, 0.77, 0.395, 0.6, 0.835, 0.785]
[17.5, 20.0, 21.2, 24.0, 32.0, 25.0, 33.5, 35.0, 50.0, 150.0, 70.0, 64.0, 80.0, 84.8, 96.0, 128.0, 160.0, 93.5, 108.0, 100.0, 130.0, 150.0, 160.0, 200.0, 300.0, 154.0, 176.0, 308.0, 144.0, 216.0, 240.0, 288.0, 360.0, 320.0, 384.0, 432.0, 400.0, 480.0, 600.0]
In [5]:
def dualsort(a, b):
    """
    Sort lists (a) and (b) using (a) to grade (b).
    """
    temp = sorted(zip(a, b), key=lambda x: x[0])
    return list(map(list, zip(*temp)))

# group areas and keys by ratios
gpa = []
gsk = []
for ur in aspect_ratios:
    gp = []
    gk = []
    for ar, pa, sk in zip(all_aspect_ratios, all_print_areas, smug_print_sizes):
        if ur == ar:
            gp.append(pa)
            gk.append(sk)
    # ensure sublists are sorted by ascending area
    gp , gk = dualsort(gp, gk)
    gpa.append(gp)
    gsk.append(gk)

print_areas = gpa
size_keywords = gsk
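
The same grouping can be sketched more compactly with a dictionary keyed by aspect ratio. This is an alternative illustration, not the representation used in the rest of this notebook: the `sizes` list is a small made-up subset, and `ratio_key` is a hypothetical helper that stores the ratio as an integer count of 0.005 steps so that keys compare exactly.

```python
from collections import defaultdict

sizes = ['4x6', '5x5', '8x12', '10x10', '8x8', '5x7']   # illustrative subset

def ratio_key(size, precision=0.005):
    """Aspect ratio of an 'AxB' size as an integer count of (precision) steps."""
    a, b = (float(s) for s in size.split('x'))
    return int(min(a, b) / max(a, b) / precision + 0.5)

# collect (area, size) pairs under their aspect ratio
groups = defaultdict(list)
for size in sizes:
    a, b = (float(s) for s in size.split('x'))
    groups[ratio_key(size)].append((a * b, size))

# sort each group by ascending area
table = {key: sorted(rows) for key, rows in groups.items()}
print(table)
```

Keeping each area next to its size keyword removes the need to keep parallel lists in step, at the cost of a structure that differs from the three-list layout used below.
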
In [6]:
#aspect_ratios = [0.7, 0.8, 0.755, 0.665, 0.5, 1, 0.745, 0.715, 
#                 0.165, 0.4, 0.775, 0.75, 0.77]
print(aspect_ratios)
print(len(aspect_ratios))
[0.7000000000000001, 0.8, 0.755, 0.665, 0.5, 1.0, 0.745, 0.715, 0.165, 0.4, 0.775, 0.75, 0.625, 0.335, 0.6900000000000001, 0.77, 0.395, 0.6, 0.835, 0.785]
20
In [7]:
#print_areas = [[17.5,70],[20,80],[21.2,84.8],[24,96],[32,50,128],
#               [25,64,100],[33.5],[35],[150],[160],[93.5],[108],[130]]
print(print_areas)
print(len(print_areas))
[[17.5, 70.0], [20.0, 80.0, 320.0], [21.2, 84.8], [24.0, 96.0, 150.0, 216.0, 384.0, 600.0], [32.0, 50.0, 128.0, 200.0, 288.0], [25.0, 64.0, 100.0, 144.0, 400.0], [33.5], [35.0], [150.0], [160.0, 360.0], [93.5], [108.0, 432.0], [160.0], [300.0], [176.0], [130.0], [308.0], [240.0], [480.0], [154.0]]
20

Minimum Print Size Area

Any image with a dpi_area below the minimum print size table area does not have enough pixels to print. It's useful to know this value. The following flatten function from Recipe 4.14, Python Cookbook 3rd Ed makes it easy to extract this value.

In [8]:
from collections.abc import Iterable  # the plain collections alias was removed in Python 3.10

def flatten(items):
    """Yield items from any nested iterable; see Recipe 4.14, Python Cookbook 3rd Ed."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x
            
min_print_area = min(list(flatten(print_areas)))
print(min_print_area)
17.5
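
Because the print areas list is only nested one level deep, a general recursive flatten is more than this job strictly needs. A shorter sketch using the standard library's `itertools.chain.from_iterable` is shown below; the `areas` list is an illustrative subset of the real table.

```python
from itertools import chain

# illustrative subset of the nested print areas list
areas = [[17.5, 70.0], [20.0, 80.0, 320.0], [33.5]]

# chain.from_iterable lazily concatenates the sublists
smallest = min(chain.from_iterable(areas))
print(smallest)
```

The recursive version remains the safer choice if the nesting depth is ever allowed to vary.
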
In [9]:
#size_keywords = [['3.5x5','7x10'],['4x5','8x10'],['4x5.3','8x10.6'],
#                 ['4x6','8x12'],['4x8','5x10', '8x16'],['5x5','8x8','10x10'],['5x6.7'],
#                 ['5x7'],['5x30'],['8x20'],['8.5x11'],['9x12'],['10x13']]
print(size_keywords)
print(len(size_keywords))
[['3.5x5', '7x10'], ['4x5', '8x10', '16x20'], ['4x5.3', '8x10.6'], ['4x6', '8x12', '10x15', '12x18', '16x24', '20x30'], ['4x8', '5x10', '8x16', '10x20', '12x24'], ['5x5', '8x8', '10x10', '12x12', '20x20'], ['5x6.7'], ['5x7'], ['5x30'], ['8x20', '12x30'], ['8.5x11'], ['9x12', '18x24'], ['10x16'], ['10x30'], ['11x16'], ['10x13'], ['11x28'], ['12x20'], ['20x24'], ['11x14']]
20
In [10]:
def print_size_key(height, width, *, no_ratio='0z1', no_pixels='0z0', 
                   min_area=17.5, ppi=360, tolerance=0.000005):
    """
    Compute print size keyword from image dimensions. 
    The result is a character string.
    
      key360 = print_size_key(2000, 3000)
      
      # (ppi) is identical to dpi here
      key720 = print_size_key(2000, 3000, ppi=720) 
    """
    
    # basic argument check
    error_message = '(height), (width) must be positive integers'
    if not (isinstance(height, int) and isinstance(width, int)):
        raise TypeError(error_message)
    elif height <= 0 or width <= 0:
        raise ValueError(error_message)
    
    # area must exceed a minimum size
    print_area = dpi_area(height, width, dpi=ppi)
    if print_area < min_area:
        return no_pixels
    
    print_ratio = aspect_ratio(height, width)
    print_key = no_ratio
    for i, ratio in enumerate(aspect_ratios):
        if abs(print_ratio - ratio) <= tolerance:
            print_key = no_pixels
            
            # not enough or more than enough area
            if print_area < print_areas[i][0]:
                break
            elif print_area > print_areas[i][-1]:
                print_key = size_keywords[i][-1]
                break     
            
            for j, area in enumerate(print_areas[i]):
                if area >= print_area and 0 < j:
                    print_key = size_keywords[i][j - 1]
                    break
                    
    return print_key
    
# many sizes available for aspect ratio 1.0
print('3800x3800 at 360 DPI = %s' % print_size_key(3800, 3800))
print('3800x3800 at 720 DPI = %s' % print_size_key(3800, 3800, ppi=720))
print('3000x3000 at 360 DPI = %s' % print_size_key(3000, 3000))
print('2000x2000 at 360 DPI = %s' % print_size_key(2000, 2000))

# not enough pixels
print('500x500 at 360 DPI = %s' % print_size_key(500,500))
print('10x10 at 360 DPI = %s' % print_size_key(10,10)) 

# no ratio 
print('3255x4119 at 360 DPI = %s' % print_size_key(3255, 4119))
3800x3800 at 360 DPI = 10x10
3800x3800 at 720 DPI = 5x5
3000x3000 at 360 DPI = 8x8
2000x2000 at 360 DPI = 5x5
500x500 at 360 DPI = 0z0
10x10 at 360 DPI = 0z0
3255x4119 at 360 DPI = 0z1

Testing print_size_key

The print_size_key function seems simple enough but when I see three break statements in a loop I set my bullshit detector to eleven and start looking for bugs.

In [11]:
# exception throwing blocks rerunning all notebook cells
# print_size_key('not', 'even_wrong') # throw exception
In [12]:
# print_size_key(-2, -3) # throw exception
In [13]:
# print_size_key(0, 50) # throw exception
In [14]:
print('0z0' == print_size_key(1,1))      # not enough pixels
print('0z0' == print_size_key(20,20))    # not enough pixels
print('0z0' == print_size_key(500,500))  # not enough pixels
True
True
True
In [15]:
print('0z1' == print_size_key(2000,2100))  # ratio not in table
print('0z1' == print_size_key(4000,3500))  # ratio not in table
print('0z1' == print_size_key(1000,5000))  # ratio not in table
True
True
True

As print_size_key rounds ratios and areas you need slightly more pixels than you might expect for a given print size. In practice this is not an issue as digital images usually have more than enough pixels for small standard size prints.

In [16]:
print('0z0' == print_size_key(int(3.5 * 350), 5 * 350))           # 3.5x5 not enough pixels
print('3.5x5' == print_size_key(int(3.5 * 362), 5 * 362))         # 3.5x5
print('7x10' == print_size_key(7 * 362, 10 * 362))                # 7x10
print('5x6.7' == print_size_key(5 * 362, int(6.7 * 362)))         # 5x6.7
print('8.5x11' == print_size_key(int(8.5 * 362), 11 * 362))       # 8.5x11
print('10x10' == print_size_key(10 * 362, 10 * 362))              # 10x10
print('10x10' == print_size_key(10 * 722, 10 * 722, ppi=720))     # 10x10 at 720 DPI
print('5x30' == print_size_key(5 * 362, 30 * 362))                # 5x30
print('5x10' == print_size_key(5 * 722, 10 * 722, ppi=720))       # 5x10 at 720 DPI
True
True
True
True
True
True
True
True
True
In [17]:
# selected actual SmugMug image dimensions
print(print_size_key(2396,1991))  
print(print_size_key(2585,1736))
print(print_size_key(4573,3259))
print(print_size_key(2800,1999))
0z1
0z1
5x7
5x7

Calculating Print Size Keys for SmugMug Album Manifest Files

In the first notebook of this series I used the SmugMug API to generate folders and files containing SmugMug image metadata stored in CSV TAB delimited files. Now I will read these manifest files and compute print size keys.

In [18]:
import csv

with open(r'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt', 'r') as f:
    reader = csv.DictReader(f, dialect='excel', delimiter='\t')                     
    for row in reader:
        key = row['ImageKey']
        height , width = int(row['OriginalHeight']), int(row['OriginalWidth'])
        size_key = print_size_key(height, width)
        print(key, size_key, height, width)
4wqd5Hr 4x6 3021 2014
K7JKbs8 0z1 2036 3122
nFRxBh2 5x7 2665 3731
xCdD7V8 0z1 2585 1736
sTXnpLm 4x6 2192 3289
VG2s4WG 5x7 3659 2613
kNRs3X8 4x6 1694 2543
Qjs2hr6 4x6 3848 2559
qbXqVgC 4x6 2633 3949
ZdzNXm3 0z1 1162 2506
vF4Bwpg 5x7 2531 3542
7WbqpMj 4x5 3211 2566
2cCVDMK 0z0 1846 2398
36kBgrv 0z1 2396 1991
2FzVqjP 0z0 1887 2398

The print size keys computed by the Python print_size_key function match the keys computed by the following J verb printsizekey.

 printsizekey=:3 : 0

 NB.*printsizekey v-- j version of python (print_size_key).
 NB.
 NB. monad:  st =. printsizekey btclManifest
 NB.
 NB.   mf0=. readtd2 'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt'
 NB.   mf1=. readtd2 'c:\SmugMirror\Themes\Diaries\CellPhoningItIn\manifest-CellPhoningItIn-PfCsJz-16.txt'
 NB.   printsizekey mf0
 NB.   printsizekey mf1
 NB.
 NB. dyad:  st =. iaDpi printsizekey btclManifest
 NB.
 NB.   720 printsizekey mf1

 SMUGPRINTDPI printsizekey y
 :
 NB. image keys and dimensions 
 d=. y {"1~ (0{y) i. ;:'ImageKey OriginalHeight OriginalWidth'
 f=. |: _1&".&> d=. 1 2 {"1 }. d
 'invalid image dimensions' assert 0 < ,f

 NB. default print size keys
 'area ratio'=. (SMUGASPECTROUND,SMUGAREAROUND,x) dpiarearatio f 
 keys=. (#ratio) # s: <NORATIOKEY

 NB. print sizes for image ratios
 pst=.  SMUGASPECTROUND printsizestable SMUGPYTERSIZES
 ast=.  ;0{"1 pst
 m0=.   ratio e. ast
 idx=.  (ast i. ratio) -. #ast
 pst=.  idx { pst

 NB. images without enough pixels
 area=. <"0 m0 # area
 m1=.   (1 {"1 pst) <&.> area
 m2=.   +./&> m1
 keys=. (s: <NOPIXELSKEY) (I. m0 #^:_1 -. m2)} keys

 NB. largest print sizes for enough pixels
 sizes=. ,([email protected]&.> m2#m1) {&> 2 {"1 m2#pst
 keys=. sizes(I. m0 #^:_1 m2)} keys

 NB. image keys, print size keys, pixels
 NB. smoutput (<"0  m0 # keys) ,. area ,. pst 
 (s: }.0 {"1 y) , keys , |: s: d 
 )

Invoking J within Jupyter

Using a J addon we can run the J verb and compare its output to the Python result. The next cell assumes jcore.py and jbase.py are on Python's sys.path.

In [19]:
import jcore as j

j.init(True)     # start j
j.dor('i. 2 7')  # ping j
0 1 2  3  4  5  6
7 8 9 10 11 12 13

Open the JOD Dictionary that contains printsizekey and fetch the words required to run it.

In [20]:
j.dor("require 'general/jod'")                     # load JOD addon
j.dor("od ;:'smugpyter smugdev image smug utils' [ 3 od '' ") # open image dictionaries
j.dor("getrx ;:'printsizekey fmtkeys'")            # get everything required to execute
+-+--------------------------+---------+-------+-----+----+-----+
|1|opened (rw/ro/ro/ro/ro) ->|smugpyter|smugdev|image|smug|utils|
+-+--------------------------+---------+-------+-----+----+-----+
+-+------------------------------+
|1|(16) words loaded into -> base|
+-+------------------------------+
In [21]:
j.dor('35 list SMUGPYTERSIZES') # show printsizes table in J 
3.5x5  4x5    4x5.3  4x6    4x8    
5x5    5x6.7  5x7    5x10   5x30   
7x10   8x8    8x10   8x10.6 8x12   
8x16   8x20   8.5x11 9x12   10x10  
10x13  10x15  10x16  10x20  10x30  
11x14  11x16  11x28  12x12  12x18  
12x20  12x24  12x30  16x20  16x24  
18x24  20x20  20x24  20x30         

Read the manifest file into J and compute the print size keys.

In [22]:
j.dor(r"mf0=. readtd2 'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt'")
j.dor('fmtkeys printsizekey mf0')
4wqd5Hr  4x6  3021  2014
K7JKbs8  0z1  2036  3122
nFRxBh2  5x7  2665  3731
xCdD7V8  0z1  2585  1736
sTXnpLm  4x6  2192  3289
VG2s4WG  5x7  3659  2613
kNRs3X8  4x6  1694  2543
Qjs2hr6  4x6  3848  2559
qbXqVgC  4x6  2633  3949
ZdzNXm3  0z1  1162  2506
vF4Bwpg  5x7  2531  3542
7WbqpMj  4x5  3211  2566
2cCVDMK  0z0  1846  2398
36kBgrv  0z1  2396  1991
2FzVqjP  0z0  1887  2398

The J verb and the Python function use completely different approaches but arrive at the same result. If you really care about the answer do it more than once and practice relentless verification!

The following functions generalize setting print size keywords for manifest files.

In [23]:
test0 = 'go;ahead;test me;boo    hoo  ; you     are   so; 0x0; utterly  wrong ; united states'
test1 = 'all_right; alll_right; allll_right'

def standard_keywords(keywords, *, blank_fill='_', 
                      split_delimiter=';',
                      substitutions=[('united_states','usa')]):
    """
    Return a list of keywords in standard form.
    
    Reduces multiple blanks to one, converts to lower case, and replaces
    any remaining blanks with (blank_fill). This ensures keywords are contiguous
    lower case or hyphenated lower case character runs.
    
    Note: the odd choice of '_' for the blank fill is because hyphens appear
    to be stripped from keywords on SmugMug.
    
        standard_keywords('go;ahead;test me;boo    hoo  ; you   are   so; 0x0; united   states')
    """
    
    # basic argument check
    error_message = '(keywords) must be a string'
    if not isinstance(keywords, str):
        raise TypeError(error_message)
        
    if len(keywords.strip(' ')) == 0:
        return []
    else:
        keys = ' '.join(keywords.split())                         
        keys = split_delimiter.join([s.strip().lower() for s in keys.split(split_delimiter)])
        keys = ''.join(blank_fill if c == ' ' else c for c in keys)
        # replace some keywords with others
        for k, s in substitutions:
            keys = keys.replace(k, s)
        # return sorted list - move size keys to front     
        keylist = [s for s in keys.split(split_delimiter)]
        return sorted(keylist)

print(standard_keywords(test0))
print(standard_keywords(''))
print(standard_keywords('    '))
print(standard_keywords(test1))
['0x0', 'ahead', 'boo_hoo', 'go', 'test_me', 'usa', 'utterly_wrong', 'you_are_so']
[]
[]
['all_right', 'alll_right', 'allll_right']
In [24]:
import re

def update_size_keyword(size_keyword, keywords, split_delimiter=';'):
    """
    Update the print size keyword for a single image
    and standardize the format of any remaining keywords.
    Result is a (boolean, string) tuple.
    """
    # basic argument check
    error_message = '(size_keyword), (keywords) must be nonempty strings'
    if not (isinstance(size_keyword, str) and isinstance(keywords, str)):
        raise TypeError(error_message)
    elif len(size_keyword.strip(' ')) == 0:
        raise ValueError(error_message)
    
    if len(keywords.strip(' ')) == 0:
        return (False, size_keyword)
    
    inkeys = [s.strip().lower() for s in keywords.split(split_delimiter)]
    if 0 == len(inkeys):
        return (False, size_keyword)
    
    outkeys = [size_keyword]
    for inword in inkeys:
        # remove any existing print size keys
        if re.match(r"\d+(\.\d+)?[xz]\d+(\.\d+)?", inword) is not None:
            continue
        else:
            outkeys.append(inword)
            
    # return standard unique sorted keys
    outkeys = sorted(list(set(outkeys)))
    outkeys = standard_keywords(split_delimiter.join(outkeys))
    return (set(outkeys) == set(inkeys), (split_delimiter+' ').join(outkeys))

def print_keywords(manifest_file):
    """
    Set print size keywords for images in album manifest file.
    Result is a tuple (image_count, change_count, changed_keywords).
    (changed_keywords) is a list of dictionaries in (csv.DictWriter) format.
    """
    changed_keywords = []
    image_count , change_count = 0 , 0
    with open(manifest_file, 'r') as f:
        reader = csv.DictReader(f, dialect='excel', delimiter='\t')                     
        for row in reader:
            image_count += 1
            key = row['ImageKey']
            height , width = int(row['OriginalHeight']), int(row['OriginalWidth'])
            size_key = print_size_key(height, width)
            same, keywords = update_size_keyword(size_key, row['Keywords'])
            if not same:
                change_count += 1
                changed_keywords.append({'ImageKey': key, 'AlbumKey': row['AlbumKey'],
                                       'FileName': row['FileName'], 'Keywords': keywords,
                                       'OldKeywords': row['Keywords']})
                
    # when no images are changed return a header place holder row
    if change_count == 0:
        changed_keywords.append({'ImageKey': None, 'AlbumKey': None, 'FileName': None, 
                                 'Keywords': None, 'OldKeywords': None})
        
    return (image_count, change_count, changed_keywords)
In [25]:
print_keywords(r'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt')
Out[25]:
(15,
 0,
 [{'AlbumKey': None,
   'FileName': None,
   'ImageKey': None,
   'Keywords': None,
   'OldKeywords': None}])

Testing update_size_keyword

In [26]:
# update_size_keyword('4x5', 3)  # throw exception
In [27]:
# update_size_keyword('', ' ok; but; size; key; bad')  # throw exception
In [28]:
print('4x6' == update_size_keyword('4x6', '     ')[1])
print('4x6; boo' == update_size_keyword('4x6', 'boo')[1]) 
print('4x6; aha; boo; boys' == update_size_keyword('4x6', 'aha; boo; BOO; boo; boys')[1])
print('4x6' == update_size_keyword('4x6', '5x7; 8x12; 3x4; 3.5x5')[1]) 
print('4x6; boo; home; yo' == update_size_keyword('4x6', '5x7; 8x12; 3x4; 3.5x5; yo; yo; home; BOO')[1])
print(update_size_keyword('4x6', '4x6; boo; hoo; too')[0]) # no keyword changes
True
True
True
True
True
True
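
One thing to watch: `re.match` only anchors at the start of the string, so a keyword that merely begins like a size key (the made-up `4x4_truck`, for example) would also be dropped by update_size_keyword. If that is not wanted, a whole-word test is one option. The sketch below reuses the same pattern; `is_size_key` is a hypothetical helper, not part of smugpyter.

```python
import re

SIZE_KEY = re.compile(r"\d+(\.\d+)?[xz]\d+(\.\d+)?")

def is_size_key(word):
    """True only when the whole keyword is a print size key."""
    return SIZE_KEY.fullmatch(word) is not None

for word in ('4x6', '3.5x5', '0z1', '4x4_truck', 'boo'):
    print(word, is_size_key(word), SIZE_KEY.match(word) is not None)
```

Only the `4x4_truck` row differs between the two tests: `fullmatch` rejects it while `match` accepts it.
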

Posting SmugMug Print Size Keywords

The next step is to post the computed print size keywords to SmugMug. For this, we need an API call that sets keywords. The SmugPyter class does not have a keyword setting function. We will have to fake it.

In case you are wondering, faking it is a fundamental skill that all programmers must master. Remember how Scotty in the original Star Trek series constantly told Kirk that he couldn't sustain high warp without wrecking the Enterprise but somehow always managed to do it and walk away intact? Sure, the Enterprise wasn't designed for the stresses it was forced to endure, but Scotty hacked it on the fly.

A lot of programming is like that. You're working with half-baked buggy tools that will not sustain warp but you have to pull it off. Be grateful you're not dodging photon torpedoes.

In [29]:
import os 

def album_id_from_file(filename):
    r"""
    Extracts the (album_id, name, mask) from file names. 
    Depends on file naming conventions.
    
        album_id_from_file(r'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt')    
    """
    mask, album_id, name = filename.split('-')[::-1][:3]
    mask = mask.split('.')[0]
    return (smugmug.case_mask_decode(album_id, mask), name, mask)

manifest_file = r'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt'
album_id_from_file(manifest_file)
Out[29]:
('Kng6tg', 'Ghana1970s', 'w')
In [30]:
def changes_filename(manifest_file):
    """
    Changes file name from manifest file name.
    """
    album_id, name, mask = album_id_from_file(manifest_file)
    path = os.path.dirname(manifest_file)
    changes_name = "changes-%s-%s-%s" % (name, album_id, mask)
    changes_file = path + "/" + changes_name + '.txt'
    return changes_file
    
def write_size_keyword_changes(manifest_file):
    r"""
    Write TAB delimited file of changed metadata.
    Return album and keyword (image_count, change_count) tuple.

        manifest_file = r'c:\SmugMirror\Places\Overseas\Ghana1970s\manifest-Ghana1970s-Kng6tg-w.txt'
        write_size_keyword_changes(manifest_file)
    """
    image_count, change_count, keyword_changes = print_keywords(manifest_file)
    changes_file = changes_filename(manifest_file)
    keys = keyword_changes[0].keys()
    with open(changes_file, 'w', newline='') as output_file:
        dict_writer = csv.DictWriter(output_file, keys, dialect='excel-tab')
        dict_writer.writeheader()
        # for no changes write header only
        if change_count > 0:
            dict_writer.writerows(keyword_changes)
    return (image_count, change_count)
            
write_size_keyword_changes(manifest_file)
Out[30]:
(15, 0)
In [31]:
def update_all_keyword_changes_files(root):
    """
    Scan all manifest files in local directories and
    generate TAB delimited CSV keyword changes files.
    """
    total_images , total_changes = 0 , 0
    pattern = "manifest-"
    alist_filter = ['txt'] 
    for r,d,f in os.walk(root):
        for file in f:
            if file[-3:] in alist_filter and pattern in file:
                # (r) from os.walk already starts with (root)
                file_name = os.path.join(r, file)
                image_count, change_count = write_size_keyword_changes(file_name)
                if change_count > 0:
                    print(file_name)
                total_images += image_count
                total_changes += change_count
    print('image count %s, change count %s' % (total_images, total_changes))
In [32]:
# %timeit update_all_keyword_changes_files('c:\SmugMirror')
update_all_keyword_changes_files(r'c:\SmugMirror')
image count 4254, change count 0
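
The directory scan could also be sketched with `pathlib`, which folds the walk and the name filter into one call. This is an alternative illustration rather than the code used here: `find_manifests` is a made-up name, and the glob requires file names to start with `manifest-`, which is slightly stricter than the substring test above.

```python
import tempfile
from pathlib import Path

def find_manifests(root):
    """Return sorted paths of manifest-*.txt files anywhere under (root)."""
    return sorted(Path(root).rglob('manifest-*.txt'))

# demonstrate on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    album = Path(tmp) / 'Places' / 'Overseas'
    album.mkdir(parents=True)
    (album / 'manifest-Example-abc123-w.txt').write_text('')
    (album / 'changes-Example-abc123-w.txt').write_text('')
    found = [p.name for p in find_manifests(tmp)]
    print(found)
```

Only the manifest file is returned; the changes file in the same folder is ignored.
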

Issuing SmugMug API PATCH Requests

Now that the CSV change files are ready the next step is to read them and reset keywords. You can do this with a SmugMug PATCH request.

My attempts to issue PATCH requests did not meet with much success until I traded a few emails with the SmugMug API support team. They advised me to turn off redirects. It was a simple parameter setting but it would have taken me days to figure it out on my own. Kudos to the excellent API support at SmugMug.

In [33]:
import requests
import json
from requests_oauthlib import OAuth1
In [34]:
auth = OAuth1(smugmug.consumer_key, smugmug.consumer_secret, 
              smugmug.access_token, smugmug.access_token_secret, smugmug.username)
In [35]:
# attempt to set keywords
r = requests.patch(url='https://api.smugmug.com/api/v2/image/8rjZsTB',
                   auth=auth,
                   data=json.dumps({"Keywords": "these; are; brand; spanking; new; keywords"}),
                   headers={'Accept':'application/json','Content-Type':'application/json'},
                   allow_redirects=False)
In [36]:
def change_image_keywords(image_id, keywords):
    r = requests.patch(url='https://api.smugmug.com/api/v2/image/' + image_id,
                       auth=auth,
                       data=json.dumps({"Keywords": keywords}),
                       headers={'Accept':'application/json','Content-Type':'application/json'},
                       allow_redirects=False)
    if r.status_code != 301:
        raise Exception("Not what the doctor ordered")
    
    return 'changed'
        
In [37]:
change_image_keywords('8rjZsTB', 'more; new; keywords; ehh')
Out[37]:
'changed'
In [38]:
def change_keywords(changes_file):
    """
    Change keywords for images in album changes file.
    """
    change_count = 0
    with open(changes_file, 'r') as f:
        reader = csv.DictReader(f, dialect='excel', delimiter='\t')                     
        for row in reader:
            change_count += 1
            image_key = row['ImageKey']
            keywords = row['Keywords']
            #print(key, keywords)
            change_image_keywords(image_key, keywords)
    return change_count

change_keywords('c:/SmugMirror/Other/utilimages/changes-utilimages-GMLn9k-1k.txt')
Out[38]:
0

Once an album's print size keywords have been changed, regenerating the print size keyword changes file should result in a file with no pending changes.

Note: posted keyword changes appear to become immediately active on SmugMug but immediately re-pulling them returns the prior keyword list. This may be a SmugMug server update issue. I will check later.

P.S. it takes a day or two for all keyword changes to percolate through SmugMug's servers. When I rescanned keywords a day or so after a mass update all my change files were emptied. This is exactly what I was expecting.

In [39]:
write_size_keyword_changes('c:/SmugMirror/Other/utilimages/manifest-utilimages-GMLn9k-1k.txt')
Out[39]:
(107, 0)
In [40]:
def update_all_keyword_changes(root):
    """
    Scan all changes files in local directories
    and apply keyword changes.
    """
    total_changes = 0
    pattern = "changes-"
    alist_filter = ['txt'] 
    for r,d,f in os.walk(root):
        for file in f:
            if file[-3:] in alist_filter and pattern in file:
                # (r) from os.walk already starts with (root)
                change_count = change_keywords(os.path.join(r, file))
                total_changes += change_count
    print('change count %s' % total_changes)
In [41]:
# takes a while to plow through thousands of updates
update_all_keyword_changes(r'c:\SmugMirror')
change count 0

Setting a geotagged Keyword

Now that we can easily set keywords, it's a simple matter to scan the manifest files and set a geotagged keyword for all images that have a nonzero latitude or longitude. The most common latitude, longitude and altitude value in the manifest files is the default (0,0,0). If you look at a map you'll see this coordinate is in the Atlantic Ocean off the west coast of Africa. I have taken exactly zero pictures at this location.
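The per-image decision can be sketched on its own, without touching SmugMug. This is a minimal illustration, not the notebook's code: the name add_geotag_keyword is hypothetical, and it skips the standard_keywords cleanup that the real cell applies.

```python
def add_geotag_keyword(keywords, latitude, longitude, *,
                       split_delimiter=';', geotag_key='geotagged'):
    """
    Return a new keyword string with geotag_key added, or None
    when the image needs no change. Hypothetical helper for illustration.
    """
    # (0, 0) is the manifest default and means "no location recorded"
    if latitude == 0.0 and longitude == 0.0:
        return None

    # normalize existing keywords, dropping empty entries
    inkeys = [s.strip().lower() for s in keywords.split(split_delimiter) if s.strip()]

    # already geotagged - nothing to do
    if geotag_key in inkeys:
        return None

    outkeys = sorted(set(inkeys) | {geotag_key})
    return (split_delimiter + ' ').join(outkeys)

print(add_geotag_keyword('Idaho; 4x6', 45.39584, -113.98174))
```

Returning None for "no change" keeps the caller simple: only rows that produce a string need a call to change_image_keywords.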

In [42]:
def geotag_images(manifest_file, *, split_delimiter=';', geotag_key='geotagged'):
    """
    Sets a geotagged keyword for nongeotagged images with nonzero latitude or longitude.
    """
    change_count = 0
    with open(manifest_file, 'r') as f:
        reader = csv.DictReader(f, dialect='excel', delimiter='\t')                     
        for row in reader:
            key = row['ImageKey']
            latitude = float(row['Latitude'])
            longitude = float(row['Longitude'])
            if latitude != 0.0 or longitude != 0.0:
                keywords = row['Keywords']
                inkeys = [s.strip().lower() for s in keywords.split(split_delimiter)]
                
                # if an image is already geotagged skip it 
                if geotag_key in inkeys:
                    continue
                    
                outkeys = sorted(list(set(inkeys)))
                outkeys.append(geotag_key)
                new_keywords = (split_delimiter+' ').join(outkeys)
                outkeys = standard_keywords(new_keywords, split_delimiter=split_delimiter) 
                same, new_keywords = (set(outkeys) == set(inkeys), (split_delimiter+' ').join(outkeys))
                if not same:
                    change_count += 1   
                    #print(manifest_file)
                    #print(key, new_keywords)
                    change_image_keywords(key, new_keywords)
    return change_count

geotag_images('c:/SmugMirror/Places/Overseas/Ghana1970s/manifest-Ghana1970s-Kng6tg-w.txt')
Out[42]:
0
In [43]:
def set_all_geotags(root):
    """
    Scan all manifest files in local directories and set
    geotags for images with nonzero latitude or longitude
    that are not geotagged.
    """
    total_changes = 0
    pattern = "manifest-"
    alist_filter = ['txt'] 
    for r,d,f in os.walk(root):
        for file in f:
            if file[-3:] in alist_filter and pattern in file:
                # os.walk already yields directory paths that include root
                file_name = os.path.join(r, file)
                change_count = geotag_images(file_name)
                total_changes += change_count
    print('change count %s' % total_changes)
    
set_all_geotags('c:/SmugMirror')
change count 0

Setting Reverse Geocode Keywords

As a final example of setting SmugMug keywords, let's reverse geocode images with nonzero latitude and longitude. Reverse geocoding is the dark art of taking a latitude and longitude and turning them into a standard place name. That evil SJW infested warren of privacy invading weasels known as Google has a free, request-limited API that reverse geocodes. You can ping this API a few times without an API key, but to do anything remotely serious you need one. API keys come in two flavors: free and not free. Let's try free first.

If you obtain a free Google Maps API key you can make 2,500 API calls per day. I currently have roughly a thousand geotagged images on SmugMug. With a little care I should be able to reverse geocode my images in a day or two.

Google provides a Python Google maps API. I looked over the code and decided it was overkill. I poked around and found a blog post, Batch CSV Geocoding in Python with Google Maps API, that basically outlines what I want to do here. Shane's post describes geocoding. When geocoding, you supply a place name and turn it into a latitude and longitude. I want the reverse, hence the name "reverse geocoding."
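A reverse geocoding request is just the geocode endpoint with a latlng parameter instead of an address. Here is a small sketch of building that URL with the standard library. The function name reverse_geocode_url is my own invention for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"

def reverse_geocode_url(latitude, longitude, api_key=None):
    """
    Build a reverse geocoding request URL. Hypothetical helper;
    the key parameter is left off when no API key is supplied.
    """
    query = {'latlng': '%s,%s' % (latitude, longitude)}
    if api_key:
        query['key'] = api_key
    # keep the comma between latitude and longitude unescaped
    return GEOCODE_ENDPOINT + '?' + urlencode(query, safe=',')

print(reverse_geocode_url(45.39584, -113.98174))
```

Using urlencode instead of string formatting means an API key with odd characters gets escaped properly.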

Add Your Maps API Key to the SmugPyter Config File

After getting my Google Maps API key I added it to the SmugPyter configuration under a new [GOOGLEMAPS] section.
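For readers setting this up themselves, the following sketch shows one way such a section could be read with configparser. The option name api_key is an assumption for illustration. The name smugpyter.py actually expects is not shown in this notebook, so check the SmugPyter source before copying it.

```python
import configparser

def read_google_maps_key(config_text, section='GOOGLEMAPS', option='api_key'):
    """
    Return the Maps API key from config file text, or None if missing.
    The option name 'api_key' is hypothetical.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback=None covers both a missing section and a missing option
    return parser.get(section, option, fallback=None)

sample = "[GOOGLEMAPS]\napi_key = not-a-real-key\n"
print(read_google_maps_key(sample))
```

Returning None when the section is absent matches how the cells below treat smugmug.google_maps_key: a missing key simply means requests go out without one.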

In [44]:
smugmug.smugmug_config
Out[44]:
'C:\\Users\\john\\.smugpyter.cfg'
In [45]:
_ = smugmug.google_maps_key

Make a Maps API Request

In [46]:
# set up reverse geocoding test urls
latlng0 = '45.39584,-113.98174'  # Idaho
latlng1 = '9.39672,-0.81673'     # Ghana
latlng2 = '45.35997,-75.71876'   # Canada
reverse_geocode_url0 = "https://maps.googleapis.com/maps/api/geocode/json?latlng={}".format(latlng0)
reverse_geocode_url1 = "https://maps.googleapis.com/maps/api/geocode/json?latlng={}".format(latlng1)
reverse_geocode_url2 = "https://maps.googleapis.com/maps/api/geocode/json?latlng={}".format(latlng2)
if smugmug.google_maps_key is not None:
    reverse_geocode_url0 = reverse_geocode_url0 + "&key={}".format(smugmug.google_maps_key)
    reverse_geocode_url1 = reverse_geocode_url1 + "&key={}".format(smugmug.google_maps_key)
    reverse_geocode_url2 = reverse_geocode_url2 + "&key={}".format(smugmug.google_maps_key)

#print(reverse_geocode_url0)
In [47]:
# ping google - remember you only get 2,500 freebies per day.
results0 = requests.get(reverse_geocode_url0)
results0 = results0.json()
results1 = requests.get(reverse_geocode_url1)
results1 = results1.json()
results2 = requests.get(reverse_geocode_url2)
results2 = results2.json()
In [48]:
results0["results"]
Out[48]:
[{'address_components': [{'long_name': '1952-1982',
    'short_name': '1952-1982',
    'types': ['street_number']},
   {'long_name': 'Casey Road', 'short_name': 'US-93', 'types': ['route']},
   {'long_name': 'Salmon',
    'short_name': 'Salmon',
    'types': ['locality', 'political']},
   {'long_name': 'Lemhi County',
    'short_name': 'Lemhi County',
    'types': ['administrative_area_level_2', 'political']},
   {'long_name': 'Idaho',
    'short_name': 'ID',
    'types': ['administrative_area_level_1', 'political']},
   {'long_name': 'United States',
    'short_name': 'US',
    'types': ['country', 'political']},
   {'long_name': '83467', 'short_name': '83467', 'types': ['postal_code']}],
  'formatted_address': '1952-1982 US-93, Salmon, ID 83467, USA',
  'geometry': {'bounds': {'northeast': {'lat': 45.3987156,
     'lng': -113.9803244},
    'southwest': {'lat': 45.39577569999999, 'lng': -113.9847687}},
   'location': {'lat': 45.396023, 'lng': -113.9815654},
   'location_type': 'RANGE_INTERPOLATED',
   'viewport': {'northeast': {'lat': 45.3987156, 'lng': -113.9803244},
    'southwest': {'lat': 45.39577569999999, 'lng': -113.9847687}}},
  'place_id': 'EikxOTUyLTE5ODIgQ2FzZXkgUmQsIFNhbG1vbiwgSUQgODM0NjcsIFVTQQ',
  'types': ['street_address']},
 {'address_components': [{'long_name': '83467',
    'short_name': '83467',
    'types': ['postal_code']},
   {'long_name': 'Salmon',
    'short_name': 'Salmon',
    'types': ['locality', 'political']},
   {'long_name': 'Lemhi County',
    'short_name': 'Lemhi County',
    'types': ['administrative_area_level_2', 'political']},
   {'long_name': 'Idaho',
    'short_name': 'ID',
    'types': ['administrative_area_level_1', 'political']},
   {'long_name': 'United States',
    'short_name': 'US',
    'types': ['country', 'political']}],
  'formatted_address': 'Salmon, ID 83467, USA',
  'geometry': {'bounds': {'northeast': {'lat': 45.4116059,
     'lng': -113.4491199},
    'southwest': {'lat': 44.69231389999999, 'lng': -114.2212051}},
   'location': {'lat': 44.9479845, 'lng': -113.9660111},
   'location_type': 'APPROXIMATE',
   'viewport': {'northeast': {'lat': 45.4116059, 'lng': -113.4491199},
    'southwest': {'lat': 44.69231389999999, 'lng': -114.2212051}}},
  'place_id': 'ChIJORdJq_krWFMRJ_DosdS1_KQ',
  'types': ['postal_code']},
 {'address_components': [{'long_name': 'Lemhi County',
    'short_name': 'Lemhi County',
    'types': ['administrative_area_level_2', 'political']},
   {'long_name': 'Idaho',
    'short_name': 'ID',
    'types': ['administrative_area_level_1', 'political']},
   {'long_name': 'United States',
    'short_name': 'US',
    'types': ['country', 'political']}],
  'formatted_address': 'Lemhi County, ID, USA',
  'geometry': {'bounds': {'northeast': {'lat': 45.705883, 'lng': -112.813604},
    'southwest': {'lat': 44.230235, 'lng': -114.8201151}},
   'location': {'lat': 45.0364592, 'lng': -113.9230554},
   'location_type': 'APPROXIMATE',
   'viewport': {'northeast': {'lat': 45.705883, 'lng': -112.813604},
    'southwest': {'lat': 44.230235, 'lng': -114.8201151}}},
  'place_id': 'ChIJ0792dKEnWFMR9q9wjunaUDo',
  'types': ['administrative_area_level_2', 'political']},
 {'address_components': [{'long_name': 'Idaho',
    'short_name': 'ID',
    'types': ['administrative_area_level_1', 'political']},
   {'long_name': 'United States',
    'short_name': 'US',
    'types': ['country', 'political']}],
  'formatted_address': 'Idaho, USA',
  'geometry': {'bounds': {'northeast': {'lat': 49.0011461, 'lng': -111.043495},
    'southwest': {'lat': 41.9880051, 'lng': -117.243027}},
   'location': {'lat': 44.0682019, 'lng': -114.7420408},
   'location_type': 'APPROXIMATE',
   'viewport': {'northeast': {'lat': 49.0011461, 'lng': -111.043495},
    'southwest': {'lat': 41.9880051, 'lng': -117.243027}}},
  'place_id': 'ChIJ6Znkhaj_WFMRWIf3FQUwa9A',
  'types': ['administrative_area_level_1', 'political']},
 {'address_components': [{'long_name': 'United States',
    'short_name': 'US',
    'types': ['country', 'political']}],
  'formatted_address': 'United States',
  'geometry': {'bounds': {'northeast': {'lat': 71.5388001, 'lng': -66.885417},
    'southwest': {'lat': 18.7763, 'lng': 170.5957}},
   'location': {'lat': 37.09024, 'lng': -95.712891},
   'location_type': 'APPROXIMATE',
   'viewport': {'northeast': {'lat': 49.38, 'lng': -66.94},
    'southwest': {'lat': 25.82, 'lng': -124.39}}},
  'place_id': 'ChIJCzYy5IS16lQRQrfeQ5K5Oxw',
  'types': ['country', 'political']}]
In [49]:
# extract only state or province (administrative level 1) and country
if results0["status"] == "OK" and results1["status"] == "OK" and results2["status"] == "OK":
    print(results0["results"][-2]['formatted_address'])
    print(results1["results"][-2]['formatted_address'])
    print(results2["results"][-2]['formatted_address'])
Idaho, USA
Northern Region, Ghana
Ontario, Canada
In [50]:
state_country = results2["results"][-2]['formatted_address']
reverse_keys = [s.strip().lower() for s in state_country.split(',')]
print(reverse_keys)
['ontario', 'canada']
In [51]:
def reverse_geocode(latitude, longitude):
    """
    Returns state or province and country keywords from latitude and longitude.
    """
    count_reverse_codes = (0, [])
    latlng = '%s,%s' % (latitude, longitude)
    reverse_geocode_url = "https://maps.googleapis.com/maps/api/geocode/json?latlng=%s&key=%s"
    reverse_geocode_url = reverse_geocode_url % (latlng, smugmug.google_maps_key)
    results = requests.get(reverse_geocode_url)
    results = results.json()
    
    if results["status"] == "OK":
        try:
            state_country = results["results"][-2]['formatted_address']
            reverse_keys = standard_keywords(state_country, split_delimiter=',')
            count_reverse_codes = (len(reverse_keys), reverse_keys)
        except Exception as e:
            # ignore any errors - no reverse geocodes for you
            count_reverse_codes = (0, [])
            print('unable to reverse geocode %s' % latlng)
    
    return count_reverse_codes

print(reverse_geocode(45.39584,-113.98174))
print(reverse_geocode(40.76814,-111.88988))  # some usa locations report united_states - remap to usa
(2, ['idaho', 'usa'])
(1, ['usa'])
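The second call above returns only the country because the next to last result is not always the state or province. A possible alternative, sketched here as an assumption rather than what the notebook does, is to pick address components by their types field instead of by position. The function name state_and_country is hypothetical.

```python
def state_and_country(results):
    """
    Return (state, country) long names from a geocode "results" list,
    using the component types rather than list position.
    Either value is None when no matching component is present.
    """
    state = country = None
    for result in results:
        for component in result.get('address_components', []):
            types = component.get('types', [])
            if state is None and 'administrative_area_level_1' in types:
                state = component['long_name']
            if country is None and 'country' in types:
                country = component['long_name']
    return state, country

# trimmed-down version of the Idaho response shown earlier
sample = [{'address_components': [
    {'long_name': 'Idaho', 'short_name': 'ID',
     'types': ['administrative_area_level_1', 'political']},
    {'long_name': 'United States', 'short_name': 'US',
     'types': ['country', 'political']}]}]
print(state_and_country(sample))
```

Note that this yields the long name "United States" rather than the "USA" that appears in formatted_address, so a remap would still be needed to keep keywords consistent.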
In [52]:
def reverse_geocode_images(manifest_file, *, split_delimiter=';', geotag_key='geotagged'):
    """
    Reverse geocode images with nonzero latitude and longitude.
    """
    change_count = 0
    with open(manifest_file, 'r') as f:
        reader = csv.DictReader(f, dialect='excel', delimiter='\t')                     
        for row in reader:
            key = row['ImageKey']
            latitude = float(row['Latitude'])
            longitude = float(row['Longitude'])
            if latitude != 0.0 or longitude != 0.0:
                keywords = row['Keywords']
                inkeys = [s.strip().lower() for s in keywords.split(split_delimiter)]
                
                # if an image is already geotagged skip it - edit the
                # changes file and strip (geotag_key) to reprocess
                if geotag_key in inkeys:
                    continue
                    
                reverse_count , reverse_keywords = reverse_geocode(latitude, longitude)
                if reverse_count == 0:
                    continue
                else:     
                    outkeys = inkeys + reverse_keywords
                    outkeys.append(geotag_key)
                    outkeys = sorted(list(set(outkeys)))
                    new_keywords = (split_delimiter+' ').join(outkeys)
                    outkeys = standard_keywords(new_keywords, split_delimiter=split_delimiter) 
                    same, new_keywords = (set(outkeys) == set(inkeys), (split_delimiter+' ').join(outkeys))
                    if not same:
                        print(reverse_keywords)
                        change_count += 1   
                        change_image_keywords(key, new_keywords)
    return change_count

#reverse_geocode_images('c:/SmugMirror/Places/Overseas/Ghana1970s/manifest-Ghana1970s-Kng6tg-w.txt')
In [53]:
def set_all_reverse_geocodes(root):
    """
    Scan all manifest files in local directories and set
    reverse geocode keys for nongeotagged images with nonzero
    latitude or longitude.
    
    Note: limited to 2,500 free Google geocode API calls per day.
    """
    total_changes = 0
    pattern = "manifest-"
    alist_filter = ['txt'] 
    for r,d,f in os.walk(root):
        for file in f:
            if file[-3:] in alist_filter and pattern in file:
                # os.walk already yields directory paths that include root
                change_count = reverse_geocode_images(os.path.join(r, file))
                total_changes += change_count
    print('change count %s' % total_changes)
    
set_all_reverse_geocodes('c:/SmugMirror')
change count 0
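With a daily request limit, it may help to avoid asking about the same spot twice, since pictures from one outing often share nearly identical coordinates. The following is a rough sketch of a cache wrapped around any function with the same signature as reverse_geocode. The name make_cached_geocoder and the rounding precision are assumptions of mine, not part of SmugPyter.

```python
def make_cached_geocoder(geocode, precision=3):
    """
    Wrap a geocode(latitude, longitude) function so that coordinates
    that round to the same value reuse the first answer.
    Three decimal places is roughly a hundred meters.
    """
    cache = {}

    def cached(latitude, longitude):
        key = (round(latitude, precision), round(longitude, precision))
        if key not in cache:
            cache[key] = geocode(latitude, longitude)
        return cache[key]

    return cached

# demonstrate with a stand-in geocoder that counts its calls
calls = []

def fake_geocode(latitude, longitude):
    calls.append((latitude, longitude))
    return (2, ['idaho', 'usa'])

cached_geocode = make_cached_geocoder(fake_geocode)
cached_geocode(45.39584, -113.98174)
cached_geocode(45.39581, -113.98169)   # rounds to the same key
print(len(calls))
```

Failed lookups are cached too, so a coordinate that cannot be geocoded costs only one request per run.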

Next on the Agenda!

Now that I have worked through a proof of concept, the next notebook will condense and refine the code in this notebook into a SmugPyter print size keyword setting subclass.

Remember, always Analyze the Data not the Drivel.

John Baker, Meridian Idaho