ISB-CGC Community Notebooks

Title:   How to use a workflow execution service (WES)
Author:  David L Gibbs
Created: 2019-11-16
Purpose: Introduction to using a workflow execution service, GA4GH style
Repo: https://github.com/isb-cgc/Community-Notebooks/blob/master/Notebooks/How_to_use_a_GA4GH_tool_using_WES.ipynb
Notes: Does not work on Google Colab.

How to use a WES service

This notebook is a quick introduction to using a workflow execution service (WES). It follows up on an earlier notebook about searching for tools with a tool registry service (TRS), "How to find a tool using TRS". This notebook must be run in an environment capable of running Docker, so Google Colab notebooks will be extremely difficult to use. We advise starting a JupyterLab environment from the Google Cloud Console (AI Platform).

Software used:

wes-service, a client and server implementation of the GA4GH Workflow Execution Service 1.0.0 API.

https://github.com/common-workflow-language/workflow-service
https://pypi.org/project/wes-service/

cwltool, the Common Workflow Language tool description reference implementation: https://github.com/common-workflow-language/cwltool

In [ ]:
import subprocess as sp
In [ ]:
!sudo pip install wes-service
!sudo pip install cwltool
!sudo pip install cwlref-runner
In [ ]:
!wes-client --version

We're going to use the subprocess library to start the wes-server in the background. We then submit jobs to the wes-server, which in turn runs them on a backend executor, in this case cwltool.

In [ ]:
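# Start wes-server in the background; keep the handle so the server can be stopped later with wes_server.terminate()
wes_server = sp.Popen(['wes-server', '--port', '8885'])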
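
Because the server starts in the background, a command issued right away may run before the port is open. The following is a minimal sketch of one way to wait for it, assuming the server listens on localhost port 8885 as above. The helper name `wait_for_port` and its defaults are illustrative and not part of wes-service.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", 8885)` before the first wes-client command would then block until the server accepts connections, or give up after 30 seconds.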
In [5]:
# check for jobs... not yet!

!wes-client --host localhost:8885 --proto http --list

Let's get some workflow test files to use...

In [ ]:
!git clone https://github.com/common-workflow-language/workflow-service
In [2]:
cd workflow-service/testdata/
In [7]:
ls -lha

Now, let's use the TRS to search for a tool called 'md5sum'.

In [ ]:
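import requests
# Query the Dockstore TRS endpoint for tools named "md5sum"
response = requests.get('https://dockstore.org:8443/api/ga4gh/v1/tools/', params={"name": "md5sum"})
response.raise_for_status()  # stop here if the registry returned an error
tool = response.json()[0]    # first matching tool
# NOTE: n is derived from the number of fields in the first version record, which is fragile (n was just 0 for version 1.0.0)
n = len(tool['versions'][0]) - 2
md5sum_url = tool['versions'][n]['url'] + '/plain-CWL/descriptor/%2FDockstore.cwl'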
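
The version index computed above depends on how many fields a version record happens to contain. Selecting a version by name may be more robust. The sketch below assumes that each entry in `versions` carries a `name` field alongside `url`. The record shown is hypothetical and its URLs are placeholders, not real Dockstore values.

```python
def pick_version(tool, name=None):
    """Return one entry from a TRS tool record's 'versions' list.

    If name is given, return the version whose 'name' field matches;
    otherwise return the first version listed.
    """
    versions = tool["versions"]
    if not versions:
        raise ValueError("tool record lists no versions")
    if name is None:
        return versions[0]
    for version in versions:
        if version.get("name") == name:
            return version
    raise KeyError(f"no version named {name!r}")


# Hypothetical record shaped like one element of response.json();
# the URLs are placeholders for illustration only.
example_tool = {
    "versions": [
        {"name": "1.0.0", "url": "https://example.org/tools/md5sum/versions/1.0.0"},
        {"name": "master", "url": "https://example.org/tools/md5sum/versions/master"},
    ]
}

chosen = pick_version(example_tool, name="master")
print(chosen["url"] + "/plain-CWL/descriptor/%2FDockstore.cwl")
```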
In [ ]:
# we have a url to the CWL.

print(md5sum_url)

To submit a job to the wes-server, we use the wes-client, passing it the workflow URL and a small JSON file that describes the input. It's surprisingly easy.

Here's the file describing the input:

In [ ]:
cat md5sum.json

Now we'll submit the job.

In [ ]:
!wes-client --host localhost:8885 --proto http $md5sum_url md5sum.json
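
For readers curious about what a submission looks like at the HTTP level, here is a rough sketch of the request described by the GA4GH WES 1.0 specification: a POST to the `runs` endpoint with the workflow URL, type, type version, and JSON-encoded parameters as form fields. The helper `build_run_request`, the example workflow URL, and the example input object are all assumptions for illustration. This is not necessarily how wes-client builds its request internally.

```python
import json


def build_run_request(host, workflow_url, params, proto="http"):
    """Return (endpoint, form_fields) for a WES 1.0 'run workflow' POST."""
    endpoint = f"{proto}://{host}/ga4gh/wes/v1/runs"
    form_fields = {
        "workflow_url": workflow_url,
        "workflow_type": "CWL",
        "workflow_type_version": "v1.0",
        # WES expects the input object as a JSON-encoded string
        "workflow_params": json.dumps(params),
    }
    return endpoint, form_fields


# Hypothetical inputs, for illustration only
endpoint, fields = build_run_request(
    "localhost:8885",
    "https://example.org/Dockstore.cwl",
    {"input_file": {"class": "File", "path": "md5sum.input"}},
)
print(endpoint)
```

The specification calls for a multipart/form-data body. With the `requests` library, one way to produce that is to pass the fields as `files={k: (None, v) for k, v in fields.items()}`.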

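And we'll view the output. The directory name under workflows/ is the run ID, so it will differ for each submission; substitute the ID of your own run.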

In [ ]:
!cat /home/jupyter/workflows/48ea8e524ae848b58bcead5eaae35052/outdir/md5sum.txt
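
The path above contains a run-specific directory name, so it will not match on another machine or a later run. As a sketch, the most recent output file can be located without typing that name, assuming the `<workflows dir>/<run>/outdir/` layout seen in the path above. The helper name `latest_output` is illustrative.

```python
from pathlib import Path


def latest_output(workflows_dir, filename="md5sum.txt"):
    """Return the newest <workflows_dir>/<run>/outdir/<filename>, or None."""
    candidates = list(Path(workflows_dir).glob(f"*/outdir/{filename}"))
    if not candidates:
        return None
    # The most recently modified file belongs to the most recent run
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For example, `print(latest_output("/home/jupyter/workflows").read_text())` would show the result of the latest run, provided at least one run has finished.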

Let's compare that result to simply running md5sum.

In [ ]:
!md5sum md5sum.input

Confirmed: the two checksums match.
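
The same comparison can be made programmatically rather than by eye. This is a minimal sketch using Python's standard `hashlib`. The helper names are illustrative, and `first_field` assumes the checksum is the first whitespace-separated token in the workflow's output file, which has not been verified against the tool's actual output format.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_field(path):
    """Return the first whitespace-separated token in a text file."""
    with open(path) as handle:
        return handle.read().split()[0]
```

With these, `md5_of_file("md5sum.input") == first_field(path_to_md5sum_txt)` should evaluate to True when the workflow result matches the local checksum, where `path_to_md5sum_txt` stands for the output file of your own run.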

Now, what's in that URL?

In [ ]:
!curl https://dockstore.org/api/api/ga4gh/v1/tools/quay.io%2Fbriandoconnor%2Fdockstore-tool-md5sum/versions/master/plain-CWL/descriptor/%2FDockstore.cwl
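
The `%2F` sequences in that URL are percent-encoded slashes: both the tool ID and the descriptor path contain `/`, so they must be encoded to fit into single URL path segments. The sketch below shows how such a URL could be assembled with the standard library. The function name `descriptor_url` is illustrative, and the base URL and tool ID are taken from the curl command above.

```python
from urllib.parse import quote


def descriptor_url(base, tool_id, version, descriptor_path, kind="plain-CWL"):
    """Assemble a TRS descriptor URL, percent-encoding the tool ID and file path."""
    return "/".join([
        base.rstrip("/"),
        "tools",
        quote(tool_id, safe=""),          # safe="" makes quote() encode "/" as %2F
        "versions",
        quote(version, safe=""),
        kind,
        "descriptor",
        quote(descriptor_path, safe=""),
    ])


print(descriptor_url(
    "https://dockstore.org/api/api/ga4gh/v1",
    "quay.io/briandoconnor/dockstore-tool-md5sum",
    "master",
    "/Dockstore.cwl",
))
```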

Now, for comparison's sake, we'll execute the workflow using the command given in the GitHub README: https://github.com/common-workflow-language/workflow-service

In [ ]:
!wes-client --host localhost:8885 --proto http --attachments="dockstore-tool-md5sum.cwl,md5sum.input" md5sum.cwl md5sum.cwl.json
In [ ]:
!cat /home/jupyter/workflows/0fc518dfd1fd480999315ec9499e6f69/outdir/md5sum.txt
In [ ]:
!md5sum md5sum.input

Confirmed again!

Using the wes-service was actually fairly easy, provided you have a nice CWL tool description!

Please let us know if you have any questions!