This tutorial shows how the automated functions of Simple Azure set up an IPython Cluster on Windows Azure in a few steps.
We use the Azure Data Science Core (ADSC) image here to deploy virtual machines with IPython preinstalled on Windows Azure.
from simpleazure import SimpleAzure as saz
azure = saz()
azure.asm.get_config()
A previous tutorial showed how to deploy ADSC with Simple Azure. We use the same steps here to create the nodes for the IPython Cluster.
If you want more or fewer nodes, simply change the number passed to the create_cluster() function.
adsc = azure.asm.get_registered_image(name="Azure-Data-Science-Core")
azure.asm.set_image(image=adsc)
azure.asm.set_location("West Europe")
azure.asm.create_cluster(3)
Simple Azure loads IPython Cluster support through its plugin. The plugin/
directory contains plugins for external software such as IPython.
from simpleazure.plugin import ipython
ipy = ipython.IPython()
IPython Cluster uses SSH tunneling for communication between the master and the engine node(s), so the SSH settings must be configured first.
ipy.set_username(azure.get_username())
ipy.set_private_key(azure.get_pkey())
Next, the master and engine node(s) need to be defined.
We get this information from the azure object that created the cluster.
from simpleazure import config
master = config.get_azure_domain(azure.results['master'])
engines = [config.get_azure_domain(x) for x in azure.results.keys()]
Then, we assign the names to the ipython plugin.
ipy.set_master(master)
ipy.set_engines(engines)
Now we are ready to initialize the IPython Cluster over SSH. The plugin provides several functions for this task.
ipy.init_ssh()
init_ssh() above creates the paramiko objects used to establish SSH sessions.
connect_nodes() actually makes the connections to the nodes.
ipy.connect_nodes()
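The split between preparing SSH objects and connecting them can be sketched with paramiko directly. This is only an illustration of the two-phase pattern, not the plugin's actual code: the function names init_clients, connect_kwargs and connect_all are hypothetical, and the automatic acceptance of unknown host keys is an assumption made for freshly created nodes.

```python
def init_clients(hosts):
    """Phase 1: create one unconnected SSH client object per host."""
    import paramiko  # third-party package, imported only when needed

    clients = {}
    for host in hosts:
        client = paramiko.SSHClient()
        # Assumption: accept unknown host keys automatically. This is
        # convenient for new VMs but skips host key verification.
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        clients[host] = client
    return clients


def connect_kwargs(host, username, key_path, timeout=10):
    """Build the keyword arguments handed to SSHClient.connect()."""
    return {
        "hostname": host,
        "username": username,
        "key_filename": key_path,
        "timeout": timeout,
    }


def connect_all(clients, username, key_path):
    """Phase 2: open the network connection for every prepared client."""
    for host, client in clients.items():
        client.connect(**connect_kwargs(host, username, key_path))
    return clients
```

Keeping the two phases apart means the node list and credentials can be checked before any network connection is attempted.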
We will use a new profile for this cluster.
ipy.create_profile()
Once you have created the profile, you can run ipcontroller on the master node.
ipy.run_ipcontroller()
You need to let engine nodes know who the master is.
The ipcontroller-engine.json file on the master node provides this information.
We will copy the file to each node.
ipy.copy_pkey_to_nodes() # <- Temporary function to distribute id_rsa private key to node(s)
ipy.copy_json2engines()
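To see what kind of information the engines receive, the connection file can be read as ordinary JSON. The sketch below writes a stand-in file to a temporary directory and reads it back. The field values are made up, and the exact set of fields differs between IPython versions, so treat the sample contents as an assumption rather than the real file format.

```python
import json
import os
import tempfile

# Illustrative contents only. The real file is written by ipcontroller,
# and its fields vary between IPython versions.
SAMPLE = {"location": "10.0.0.4", "ssh": ""}


def load_connection_info(path):
    """Read a controller connection file and return it as a dict."""
    with open(path) as handle:
        return json.load(handle)


def describe(info):
    """Summarize where the engines are told to find the controller."""
    return "controller at {0}".format(info.get("location", "unknown"))


# Write the stand-in file, then read it back the way an engine node could.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "ipcontroller-engine.json")
with open(path, "w") as handle:
    json.dump(SAMPLE, handle)

info = load_connection_info(path)
print(describe(info))
```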
We are almost done. The last step is to execute ipengine on each engine node so that the engines can communicate with the master.
ipy.run_ipengine()
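The controller and engine steps boil down to running one command on the master and one on every engine node. The sketch below only builds the (host, command) pairs and does not run anything. The helper cluster_commands, the host names and the profile name "azure" are hypothetical, and the plugin may pass different options to ipcontroller and ipengine.

```python
def cluster_commands(master, engines, profile="azure"):
    """Return (host, command) pairs in the order they would be run."""
    # The controller has to be started first, on the master node.
    steps = [(master, "ipcontroller --profile={0}".format(profile))]
    # Every engine node then starts one ipengine process.
    for host in engines:
        steps.append((host, "ipengine --profile={0}".format(profile)))
    return steps


# Hypothetical host names, used for illustration only.
plan = cluster_commands(
    "master.example.net",
    ["node1.example.net", "node2.example.net"],
)
for host, command in plan:
    print("{0}: {1}".format(host, command))
```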
The cluster is now ready. You can access the master node and use the IPython.parallel.Client class.
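As a quick check on the master node, a client can be connected to the running controller and asked to do some work on all engines. This is a minimal sketch, assuming an IPython version that still ships IPython.parallel and assuming you know the profile name used for the cluster. The helper names connect_view and square_all are hypothetical.

```python
def connect_view(profile):
    """Connect to the running controller and return a view on all engines."""
    from IPython.parallel import Client  # needs a running IPython Cluster

    client = Client(profile=profile)
    return client[:]  # a view covering every registered engine


def square_all(view, numbers):
    """Square each number, spreading the work across the engines."""
    return view.map_sync(lambda x: x * x, numbers)
```

On the master node this could be used as square_all(connect_view(profile), range(10)), where profile is the name of the profile created for the cluster.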
Note: the steps above can be replaced with a single wrapper function, apply_ipcluster().
ipy.apply_ipcluster(azure)
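The idea of a single wrapper can be sketched as a function that calls the individual methods in the order used in this tutorial. This is not the implementation of apply_ipcluster(). The function setup_cluster and the RecordingPlugin stand-in are hypothetical and exist only to make the call order visible without an Azure account.

```python
# Method names as used in this tutorial, in the order they were called.
STEPS = [
    "init_ssh",
    "connect_nodes",
    "create_profile",
    "run_ipcontroller",
    "copy_pkey_to_nodes",
    "copy_json2engines",
    "run_ipengine",
]


def setup_cluster(ipy):
    """Call each setup method on the plugin object, one after another."""
    for name in STEPS:
        getattr(ipy, name)()


class RecordingPlugin(object):
    """Stand-in for the plugin object that only records which calls happen."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method that is not defined is replaced by a recorder.
        def method():
            self.calls.append(name)
        return method


plugin = RecordingPlugin()
setup_cluster(plugin)
print(plugin.calls)
```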