This notebook is an example session that launches a sample IPython cluster in the Joyent Public Cloud.

It accompanies a blog post on automatically deploying an IPython parallel compute cluster on Joyent; see that post for the context.

You should install the Python SmartDC libraries first with:

pip install smartdc
In [1]:
import time, sys
import json
from smartdc import DataCenter
import paramiko

Select your datacenter, username, and key identifier:

In [2]:
sdc = DataCenter('eu-ams-1', key_id='/username/keys/keyname', verbose=True)

I chose the most up-to-date base 64-bit image and the smallest package available.

The boot script is available on this gist.

In [3]:
ipc = sdc.create_machine(dataset='sdc:sdc:base64:', package='Extra Small 512 MB', 
  tags={'cluster': 'ipython', 'role': 'controller'},
  boot_script='./', name='ipcontroller')
2013-03-07T10:07:49.372832	POST
2013-03-07T10:07:58.617559	GET
2013-03-07T10:08:02.459808	GET
2013-03-07T10:08:05.942330	GET
2013-03-07T10:08:09.668697	GET
2013-03-07T10:08:13.213666	GET
2013-03-07T10:08:16.890286	GET
2013-03-07T10:08:20.323838	GET
2013-03-07T10:08:23.840497	GET
2013-03-07T10:08:27.553048	GET
2013-03-07T10:08:31.015873	GET

This function acts as a barrier: it blocks until the boot script has finished running, so that we can be reasonably confident that the controller’s engine configuration has been written.

In [4]:
def wait_for_svc(connection, fmri, interval=3, timeout=315):
    # STA is the current SMF state, NSTA the next state ('-' if none is pending)
    SERVICE_POLL = 'svcs -H -o STA,NSTA %s' % fmri
    for _ in xrange(timeout//interval):
        states = tuple(connection.exec_command(SERVICE_POLL)[1].read().strip().split())
        if states == ('ON', '-'):
            print >>sys.stderr
            return True
        elif states == ('MNT', '-'):
            raise StandardError('Bootscript failed: now in maintenance mode')
        elif states == ('OFF', 'ON'):
            # heartbeat
            print >>sys.stderr, '.',
        else:
            # slightly unusual state
            print >>sys.stderr, '?',
        # pause between polls so the loop spans roughly `timeout` seconds
        time.sleep(interval)
    raise StandardError('Timeout')
In [6]:
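ssh_conn = paramiko.SSHClient()
# a freshly provisioned machine has no known_hosts entry yet, so accept its host key
ssh_conn.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh_conn.connect(ipc.public_ips[0], username='root')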
In [7]:
wait_for_svc(ssh_conn, 'svc:/smartdc/mdata:execute')
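
To make the state handling in wait_for_svc easier to follow, here is a minimal sketch that separates the decision logic from the SSH call. It is written for Python 3, and the names classify_states and wait_until_online are my own, not part of smartdc or paramiko. The poll argument is an assumed stand-in for whatever returns the text printed by the svcs command.

```python
import time


def classify_states(output):
    """Map the text printed by `svcs -H -o STA,NSTA <fmri>` to a verdict."""
    states = tuple(output.strip().split())
    if states == ('ON', '-'):
        return 'done'       # online, no transition pending
    if states == ('MNT', '-'):
        return 'failed'     # maintenance mode: the boot script failed
    if states == ('OFF', 'ON'):
        return 'starting'   # offline now, online next: still booting
    return 'unknown'        # anything else, including empty output


def wait_until_online(poll, interval=3, timeout=315, sleep=time.sleep):
    """Call poll() until the service is online; raise on failure or timeout."""
    for _ in range(timeout // interval):
        verdict = classify_states(poll())
        if verdict == 'done':
            return True
        if verdict == 'failed':
            raise RuntimeError('Bootscript failed: now in maintenance mode')
        sleep(interval)     # 'starting' and 'unknown' both mean: try again
    raise RuntimeError('Timeout')


# Usage with canned replies instead of a live SSH connection.
replies = iter(['OFF ON\n', 'OFF ON\n', 'ON -\n'])
result = wait_until_online(lambda: next(replies), sleep=lambda seconds: None)
print(result)
```

Passing sleep and poll as arguments is only a convenience for trying the logic without a running machine. With paramiko under Python 3, the command output arrives as bytes and would need decoding before it is classified.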


The key command in this sequence is the following, where we retrieve the connection details that the engines need in order to connect to the controller.

In [8]:
_, rout, _ = ssh_conn.exec_command('cat /opt/local/share/ipython/profile_default/security/ipcontroller-engine.json')
ipcontroller = json.load(rout)
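
As a rough illustration of what is being read here, the sketch below parses a made-up sample and picks out the two fields this notebook goes on to use, url and exec_key. The sample values and the helper name engine_metadata are hypothetical, and the real file may well contain more fields than these two.

```python
import json

# Made-up sample in the shape of ipcontroller-engine.json; only the
# 'url' and 'exec_key' key names are taken from this notebook.
SAMPLE = '''
{
  "url": "tcp://10.0.0.5:55555",
  "exec_key": "00000000-0000-0000-0000-000000000000"
}
'''


def engine_metadata(connection_info):
    """Build the metadata dict handed to each engine at creation time."""
    return {
        'ipython.url': connection_info['url'],
        'ipython.key': connection_info['exec_key'],
    }


ipcontroller_sample = json.loads(SAMPLE)
metadata = engine_metadata(ipcontroller_sample)
print(metadata)
```

Indexing with square brackets rather than .get() means a missing field fails immediately with a KeyError, before any engine is provisioned with incomplete metadata.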

Provisioning the engines is similar to provisioning the controller, except that it uses the engine boot script from the gist linked above, sets a different role tag, passes the controller's URL and key as metadata, and doesn't bother naming the engines.

In [9]:
for _ in xrange(4):
    sdc.create_machine(dataset='sdc:sdc:base64:',
      package='Extra Small 512 MB',
      metadata={'ipython.url': ipcontroller['url'],
                'ipython.key': ipcontroller['exec_key']},
      tags={'cluster': 'ipython', 'role': 'engine'},
      boot_script='./')  # path to the engine boot script
2013-03-07T10:11:45.336342	POST
2013-03-07T10:12:13.032310	POST
2013-03-07T10:12:19.690627	POST
2013-03-07T10:12:26.242430	POST

At this point you can SSH into the IPython controller node and test your cluster. Note that the IPYTHONDIR environment variable should be set to /opt/local/share/ipython.

Once you have finished with the cluster, you will want to shut it down. Here is a convenient way to delete the entire cluster:

In [12]:
cluster = sdc.machines(tags={'cluster': 'ipython'})
cluster
2013-03-07T10:12:33.045438	GET
[<smartdc.machine.Machine: <ipcontroller> in <DataCenter: eu-ams-1>>,
 <smartdc.machine.Machine: <4eac1ad> in <DataCenter: eu-ams-1>>,
 <smartdc.machine.Machine: <d8f2ae6> in <DataCenter: eu-ams-1>>,
 <smartdc.machine.Machine: <19666c2> in <DataCenter: eu-ams-1>>,
 <smartdc.machine.Machine: <873fde1> in <DataCenter: eu-ams-1>>]
In [13]:
from operator import methodcaller
In [14]:
map(methodcaller('stop'), cluster)
2013-03-07T10:19:18.615269	POST
2013-03-07T10:19:22.007255	POST
2013-03-07T10:19:23.819118	POST
2013-03-07T10:19:25.664591	POST
2013-03-07T10:19:27.198584	POST
[None, None, None, None, None]
In [15]:
map(methodcaller('poll_until', 'stopped'), cluster)
2013-03-07T10:19:28.741289	GET
2013-03-07T10:19:32.274923	GET
2013-03-07T10:19:35.843260	GET
2013-03-07T10:19:37.647634	GET
2013-03-07T10:19:39.183008	GET
2013-03-07T10:19:40.722421	GET
2013-03-07T10:19:42.255855	GET
[None, None, None, None, None]
In [16]:
map(methodcaller('delete'), cluster)
2013-03-07T10:19:43.798491	DELETE
2013-03-07T10:19:45.330310	DELETE
2013-03-07T10:19:46.698550	DELETE
2013-03-07T10:19:48.107172	DELETE
2013-03-07T10:19:49.630606	DELETE
[None, None, None, None, None]