Using cloudknot to perform matrix-vector multiplication of random matrices

This example uses cloudknot to perform matrix-vector multiplication of some random matrices with varying standard deviations.

In [1]:
import cloudknot as ck

First, we write the Python function that we want to run on AWS Batch. Note that we import the necessary Python packages within the function random_mv_prod.

In [2]:
def random_mv_prod(b):
    import numpy as np
    
    x = np.random.normal(0, b, 1024)
    A = np.random.normal(0, b, (1024, 1024))
    
    return np.dot(A, x)
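Before submitting anything to AWS Batch, it can help to call the function once locally and confirm it returns what you expect. The following is a minimal sketch, assuming NumPy is installed on your local machine; the function is repeated so the snippet runs on its own, and the variable name `y` and the input value 0.5 are illustrative choices, not part of the original example.

```python
def random_mv_prod(b):
    import numpy as np

    x = np.random.normal(0, b, 1024)
    A = np.random.normal(0, b, (1024, 1024))

    return np.dot(A, x)


# Local sanity check, no AWS involved:
# a (1024, 1024) matrix times a length-1024 vector gives a length-1024 vector.
y = random_mv_prod(0.5)
print(y.shape)  # (1024,)
```

Each entry of the result is a sum of 1024 products of independent zero-mean normal draws with standard deviation b, so its standard deviation is 32 * b**2 (about 0.32 for b = 0.1).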

Create a knot using the random_mv_prod function, with a job definition memory of 128 MiB and retries set to 3.

In [3]:
knot = ck.Knot(name='random_mv_prod', func=random_mv_prod, memory=128, retries=3)

Submit 20 batch jobs to the knot. The map() method returns a list of futures for the results of each batch job. You can optionally supply a list of environment variables to each job.

In [5]:
# import numpy since it was only imported in the `random_mv_prod` function above
import numpy as np
In [4]:
# Submit the jobs
result_futures = knot.map(np.linspace(0.1, 100, 20), env_vars=[{'name': 'MY_ENV_VAR', 'value': 'foo'}])
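To see where the 20 jobs come from, it may help to look at the input sequence on its own. This small sketch only inspects the NumPy array passed to map() and does not touch AWS; the name `b_values` is introduced here for illustration.

```python
import numpy as np

# 20 evenly spaced standard deviations from 0.1 to 100 (both endpoints included);
# each element is the argument for one batch job.
b_values = np.linspace(0.1, 100, 20)

print(len(b_values))              # 20
print(b_values[0], b_values[-1])  # 0.1 100.0
```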

We can query the jobs associated with this knot by calling knot.view_jobs(), which prints the ID, name, and status of each job, giving a concise summary of job statuses.

In [15]:
# Rerun this cell as often as you like to update your job status info
knot.view_jobs()
Job ID                                      Name                        Status
---------------------------------------------------------------------------------
57e9d725-3aab-4e1c-859a-53b06ff18ec8        random_mv_prod-12           SUCCEEDED
c9b01618-a20a-4794-96aa-7b6900d39a43        random_mv_prod-18           SUCCEEDED
ae852c96-cc1b-48fe-911e-f1d7197ff341        random_mv_prod-0            SUCCEEDED
ca0f4589-e5bc-42b6-af5a-119c46644c24        random_mv_prod-15           SUCCEEDED
a8a941fe-12bf-4ac7-a4f7-70e84c577321        random_mv_prod-3            SUCCEEDED
0e7af839-3a91-49c2-b8d4-3d990f697b36        random_mv_prod-6            SUCCEEDED
acbdf052-dcc8-4df1-9095-2ed4f52e66dc        random_mv_prod-10           SUCCEEDED
2c063597-36de-4273-b074-6e89d61e48e5        random_mv_prod-16           SUCCEEDED
95857b21-ed97-4fcc-9fbc-66069906a9c3        random_mv_prod-2            SUCCEEDED
cdd72224-6368-4a71-8bdd-d732212c67b7        random_mv_prod-7            SUCCEEDED
d07f1371-53c7-47d1-a84e-2ad9caffcf10        random_mv_prod-17           SUCCEEDED
aa82fe66-dabb-49ca-a758-f9de7657c287        random_mv_prod-8            SUCCEEDED
700d9759-8515-4ca9-b059-7b92f44bade8        random_mv_prod-4            SUCCEEDED
7b31aa16-8404-47e0-9e8f-62ebffe5b6f9        random_mv_prod-5            SUCCEEDED
b873b0b5-4bdc-46b4-91f5-b6e6266f56b2        random_mv_prod-19           SUCCEEDED
ff34fc24-c1b2-4b16-a2c4-802a97693129        random_mv_prod-13           SUCCEEDED
50f7a005-d16f-4bdc-9ed4-468e3d67fccf        random_mv_prod-14           SUCCEEDED
6f7f2747-06ba-435f-927d-7379a2de824c        random_mv_prod-1            SUCCEEDED
8b415268-309b-4a41-a7b8-270c1ad70d03        random_mv_prod-9            SUCCEEDED
710cddc4-894f-4b18-8f1f-59e1b1656f40        random_mv_prod-11           SUCCEEDED

We can also inspect individual jobs through knot.jobs, which returns a list of BatchJob instances, one for each submitted job, e.g.:

In [16]:
last_job = knot.jobs[-1]
In [8]:
print(last_job.done)
print(last_job.result(timeout=5))
False
---------------------------------------------------------------------------
CKTimeoutError                            Traceback (most recent call last)
<ipython-input-8-eedc99836a06> in <module>()
      1 print(last_job.done)
----> 2 print(last_job.result(timeout=5))

/Users/Adam/code/projects/cloudknot/cloudknot/aws/batch.py in result(self, timeout)
   1811 
   1812         if not self.done:
-> 1813             raise CKTimeoutError(self.job_id)
   1814 
   1815         status = self.status

CKTimeoutError: The job with job-id b873b0b5-4bdc-46b4-91f5-b6e6266f56b2 did not finish within the requested timeout period

Knot.map() returns a list of futures, so you can use any of the futures methods to query the results, e.g. done() or result().

In [17]:
print(result_futures[0].done())
print(result_futures[0].result())
True
[ 0.1824538   0.13255699 -0.32016405 ..., -0.48144142  0.18720769
  0.09038733]
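The done() and result() pattern can be tried out locally without any AWS resources. The sketch below uses Python's standard concurrent.futures module as a stand-in; that cloudknot's futures behave like standard-library futures is an assumption here, and `slow_square` is a hypothetical placeholder for a batch job.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def slow_square(v):
    # Hypothetical stand-in for a long-running batch job
    time.sleep(0.5)
    return v * v


with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(slow_square, 3)

    # Asking for the result with a short timeout raises TimeoutError
    # if the work has not finished yet.
    try:
        future.result(timeout=0.05)
        timed_out = False
    except TimeoutError:
        timed_out = True

    # Without a timeout, result() blocks until the work is finished.
    value = future.result()

print(timed_out, future.done(), value)  # True True 9
```

Catching the timeout and retrying later, rather than blocking indefinitely, keeps a notebook responsive while jobs are still running.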

Once you're all done, clobber the knot, including the underlying PARS, the remote repo, and the image.

In [19]:
knot.clobber(clobber_pars=True, clobber_repo=True, clobber_image=True)
WARNING:cloudknot.aws.ec2:Deleted dependent EC2 instances: ['i-0bdc8d35ac1cbbd43']