import traceback
This notebook showcases the new API. Inlined are some comments on design decisions.
import vistrails as vt
There are few reasons for the API not to be under the top-level vistrails
package. They are:
Vistrails and Pipelines
Vistrail objects are currently obtained through load_vistrail(), although they can be constructed from an existing Pipeline or internal VistrailController.
Question: should we allow Vistrail('path/to/file') as well?
vistrail = vt.load_vistrail('examples/simplemath.vt')
A Vistrail is basically a controller. From it we can get Pipelines, but it is also stateful (i.e. it has a current version); this is useful for editing (creating new versions from the current one). It also provides the same interface as Pipeline, implicitly acting on the current_pipeline.
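The delegation described above (a stateful object that exposes the pipeline interface by acting on its current pipeline) can be sketched in a few lines. This is only an illustration: SketchPipeline and SketchVistrail are hypothetical names, and nothing here is taken from the actual VisTrails implementation.

```python
class SketchPipeline(object):
    """Hypothetical stand-in for a pipeline; not the VisTrails class."""

    def __init__(self, version):
        self.version = version

    def execute(self):
        return "executed version %d" % self.version


class SketchVistrail(object):
    """Hypothetical stateful wrapper that forwards to its current pipeline."""

    def __init__(self, pipelines):
        self._pipelines = pipelines   # maps version number -> pipeline
        self.current_version = -1     # -1 means no version selected yet

    def select_version(self, version):
        if version not in self._pipelines:
            raise KeyError(version)
        self.current_version = version

    @property
    def current_pipeline(self):
        return self._pipelines[self.current_version]

    def __getattr__(self, name):
        # Only called when normal lookup fails: forward to the current pipeline.
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self.current_pipeline, name)


vistrail_sketch = SketchVistrail({1: SketchPipeline(1), 2: SketchPipeline(2)})
vistrail_sketch.select_version(2)
message = vistrail_sketch.execute()   # forwarded to the version 2 pipeline
```

Forwarding through __getattr__ keeps the wrapper small, at the cost of making the forwarded methods invisible to introspection; explicit wrapper methods would be the more verbose alternative.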
Problem: there are issues with upgrades; get_pipeline() can return non-upgraded pipelines, which is bad. In particular, vistrail.get_pipeline(vistrail.current_version) will return the non-upgraded pipeline, which is unexpected.
vistrail
<Vistrail: simplemath.vt, version -1, not changed>
vistrail.select_latest_version()
vistrail
<Vistrail: simplemath.vt, version 28, not changed>
vistrail.get_pipeline(2)
<Pipeline: 1 modules, 0 connections>
Only basic_modules (and abstractions?) are loaded on initialization, so that using the API stays fast. A package might be auto-enabled when it is requested, which is efficient and convenient.
load_package() only uses package identifiers (although we could add version specifiers?); I don't think we want to worry about names/codepaths.
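As a rough illustration of loading a package only when it is first requested, the sketch below caches packages by identifier. SketchPackageRegistry and load_tabledata are hypothetical names made up for this example; the real load_package() may work quite differently.

```python
class SketchPackageRegistry(object):
    """Hypothetical registry that loads a package the first time it is requested."""

    def __init__(self, loaders):
        self._loaders = loaders   # identifier -> zero-argument loader function
        self._loaded = {}         # identifier -> already-loaded package

    def load_package(self, identifier):
        if identifier not in self._loaded:
            # First request: run the (possibly slow) loader and cache the result.
            self._loaded[identifier] = self._loaders[identifier]()
        return self._loaded[identifier]


load_calls = []


def load_tabledata():
    # Stand-in for the expensive work of enabling a package.
    load_calls.append('tabledata')
    return {'identifier': 'org.vistrails.vistrails.tabledata'}


registry = SketchPackageRegistry(
    {'org.vistrails.vistrails.tabledata': load_tabledata})
first = registry.load_package('org.vistrails.vistrails.tabledata')
second = registry.load_package('org.vistrails.vistrails.tabledata')
```

Nothing is loaded when the registry is created, and the second request reuses the cached package instead of running the loader again.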
tabledata = vt.load_package('org.vistrails.vistrails.tabledata')
tabledata
<Package: org.vistrails.vistrails.tabledata, 23 modules>
You can get Modules from the package using the dot or bracket syntax. These modules are "dangling" modules, not yet instantiated in a specific pipeline/vistrail.
I chose not to make a distinction between module descriptors and pipeline modules (module descriptors are just modules that are not yet connected to a pipeline) to simplify things and keep the number of concepts low.
tabledata.convert
<Namespace convert of package org.vistrails.vistrails.tabledata>
from vistrails.core.modules.module_registry import MissingModule
try:
    tabledata['convert']  # can't get namespaces this way, use a dot
except MissingModule:
    pass
else:
    assert False
tabledata.BuildTable, tabledata['BuildTable']
(vistrails.core.api.BuildTable, vistrails.core.api.BuildTable)
tabledata.read.CSVFile, tabledata['read|CSVFile']
(vistrails.core.api.CSVFile, vistrails.core.api.CSVFile)
(note: IPython bug 6709 causes the 'vistrails.core.api.' prefixes above)
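The lookups above (dot syntax for namespaces and modules, bracket syntax with a '|' separator for modules only) could be implemented along the following lines. This is a minimal sketch under assumptions: SketchNamespace and SketchMissingModule are hypothetical names, and the flat dictionary keyed by 'namespace|name' is a storage choice made for the example, not necessarily how VisTrails stores modules.

```python
class SketchMissingModule(KeyError):
    """Hypothetical error for a name that does not refer to a module."""


class SketchNamespace(object):
    """Hypothetical package or namespace view over a flat module table."""

    def __init__(self, modules, prefix=''):
        self._modules = modules   # flat dict, e.g. 'read|CSVFile' -> module
        self._prefix = prefix     # '' for the package, 'read|' for a namespace

    def __getitem__(self, name):
        # Bracket syntax: resolves modules only, never namespaces.
        try:
            return self._modules[self._prefix + name]
        except KeyError:
            raise SketchMissingModule(name)

    def __getattr__(self, name):
        # Dot syntax: a module if one exists, otherwise a nested namespace.
        if name.startswith('_'):
            raise AttributeError(name)
        full = self._prefix + name
        if full in self._modules:
            return self._modules[full]
        nested = full + '|'
        if any(key.startswith(nested) for key in self._modules):
            return SketchNamespace(self._modules, nested)
        raise AttributeError(name)


package = SketchNamespace({
    'BuildTable': 'BuildTable module',
    'read|CSVFile': 'CSVFile module',
})
via_dot = package.read.CSVFile
via_bracket = package['read|CSVFile']
```

Here package.read yields a namespace object while package['read'] raises SketchMissingModule, mirroring the behavior shown in the cells above.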
Work in progress...
In addition to executing a Pipeline or Vistrail, I want to be able to easily pass values in on InputPort modules (to use subworkflows as Python functions) and get results out (either on OutputPort modules or any port of any module).
Execution returns a Results object from which you can get all of this, and which would be integrated with IPython to inline images and objects that support it (matplotlib, ...).
outputs = vt.load_vistrail('examples/outputs.vt')
outputs.select_version(1)
outputs
<Vistrail: outputs.vt, version 1, not changed>
# Errors
try:
    result = outputs.execute()
except vt.ExecutionErrors:
    traceback.print_exc()
else:
    assert False
Traceback (most recent call last):
  File "<ipython-input-14-979bf6416e43>", line 3, in <module>
    result = outputs.execute()
  File "vistrails\core\api.py", line 205, in execute
    return self.current_pipeline.execute(*args, **kwargs)
  File "vistrails\core\api.py", line 424, in execute
    raise ExecutionErrors(self, result)
ExecutionErrors: Pipeline execution failed: 1 error:
0: Missing value from port value
# Results
outputs.select_latest_version()
result = outputs.execute()
result
<ExecutionResult: 2 modules>
outputs
<Vistrail: outputs.vt, version 5, changed>
outputs.current_pipeline
<Pipeline: 2 modules, 1 connections; outputs: msg>
result.module_output(0)
{'self': <vistrails.core.modules.basic_modules.String at 0x657b8b0>, 'value': 'Hello, world', 'value_as_string': 'Hello, world'}
result.output_port('msg')
'Hello, world'
pipeline = vistrail.current_pipeline
pipeline
<Pipeline: 6 modules, 6 connections; inputs: in_a, in_b; outputs: out_times, out_plus>
in_a = pipeline.get_input('in_a')
assert (in_a == pipeline.get_module('First input')) is True
in_a
<Module 'InputPort' from org.vistrails.vistrails.basic, id 1, label "First input">
result = pipeline.execute(in_a == 2, in_b=4)
result.output_port('out_times'), result.output_port('out_plus')
(8.0, 6.0)
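The call pipeline.execute(in_a == 2, in_b=4) above relies on == producing something other than a boolean when an input module is compared with a plain value, while a comparison between two modules still yields True or False. One way such an overload could work is sketched below; SketchInput, SketchBinding and sketch_collect_inputs are hypothetical names, and this is not claimed to be how VisTrails implements it.

```python
class SketchBinding(object):
    """Hypothetical (input, value) pair produced by `input == value`."""

    def __init__(self, module, value):
        self.module = module
        self.value = value


class SketchInput(object):
    """Hypothetical input module identified by its name."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if isinstance(other, SketchInput):
            # Comparing two modules is an ordinary equality test.
            return self.name == other.name
        # Comparing with any other value builds a binding instead.
        return SketchBinding(self, other)

    def __hash__(self):
        # Needed because defining __eq__ disables the default hash in Python 3.
        return hash(self.name)


def sketch_collect_inputs(*bindings, **named):
    """Merge positional bindings and keyword arguments into one dict."""
    values = {}
    for binding in bindings:
        if not isinstance(binding, SketchBinding):
            raise TypeError("expected `input == value`, got %r" % (binding,))
        values[binding.module.name] = binding.value
    values.update(named)
    return values


first_input = SketchInput('in_a')
values = sketch_collect_inputs(first_input == 2, in_b=4)
```

The positional form is useful when the caller holds a module object rather than its name; the keyword form is shorter when the input name is a valid Python identifier.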
im = vt.load_vistrail('examples/imagemagick.vt')
im.select_version('read')
im
<Vistrail: imagemagick.vt, version 9 (tag read), not changed>
im.execute().output_port('result')
im.select_version('blur')
im
<Vistrail: imagemagick.vt, version 16 (tag blur), changed>
im.execute().output_port('result')
im.select_version('edges')
im.execute().output_port('result')