Let Us Do the Bookkeeping For You

In this notebook you will:

  • Run some simulated experiments and then access the metadata about them.
  • Use that metadata to generate a summary report.
  • Use it to filter search results.
  • Explore some of Python's string formatting features in detail.

Configuration

Below, we will connect to EPICS IOC(s) controlling simulated hardware in lieu of actual motors and detectors. The IOCs should already be running in the background. Run this command to verify that they are running: it should produce output with RUNNING on each line. In the event of a problem, edit this command to replace status with restart all and run it again.

In [ ]:
!supervisorctl -c supervisor/supervisord.conf status
In [ ]:
%run scripts/beamline_configuration.py

Generate a Summary Report

Acquire some data like so. The details of what we are doing here are not important for what follows. If you want to know more about data acquisition, start with Hello Bluesky.

In [ ]:
RE(count([ph]))
RE(count([ph, edge, slit], 3))
RE(scan([edge], motor_edge, -10, 10, 15))
RE(scan([edge], motor_edge, -1, 3, 5))
RE(scan([ph], motor_ph, -1, 3, 5))
RE(scan([slit], motor_slit, -10, 10, 15))

Here is some code that prints a summary of some of the metadata automatically captured by Bluesky. Note the time filter added to the databroker object (db); see Filtering for more about this feature.

In [ ]:
import time
from datetime import datetime

now = time.time()
an_hour_ago = now - 60 * 60  # 60 seconds * 60 minutes = one hour
print("HH:MM  plan_name  detectors      motors")
for h in db(since=an_hour_ago):
    md = h.start
    print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
          f"{md['plan_name']:11}"
          f"{','.join(md.get('detectors', [])):15}"
          f"{','.join(md.get('motors', [])):15}")

Let's add one more example: a run that failed because of a user error.

In [ ]:
# THIS IS EXPECTED TO CREATE AN ERROR.

RE(scan([motor_ph], ph, -1, 1, 3))  # oops I tried to use a detector as a motor

We'll add one more column to extract the 'exit_status' reported by RE before it errored out.

In [ ]:
print("HH:MM  plan_name  detectors      motors         exit_status")
for h in db(since=an_hour_ago):
    md = h.start
    print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
          f"{md['plan_name']:11}"
          f"{','.join(md.get('detectors', [])):15}"
          f"{','.join(md.get('motors', [])):15}"
          f"{h.stop['exit_status']}")

Let's make it easier to reuse this code block by formulating it as a function.

In [ ]:
def summarize_runs(headers):
    print("HH:MM  plan_name  detectors      motors         exit_status")
    for h in headers:
        md = h.start
        print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
              f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']}")
        
        
summarize_runs(db(since=an_hour_ago))

Getting a little fancy (Too fancy? Maybe....) you can print by default but optionally write to a text file instead.

In [ ]:
import functools

def summarize_runs(headers, write=functools.partial(print, end='')):
    write("HH:MM  plan_name  detectors      motors         exit_status\n")
    for h in headers:
        md = h.start
        write(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
              f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']}\n")
        
        
summarize_runs(db(since=an_hour_ago))  # prints as before
In [ ]:
# Use a with-block so the file is flushed and closed before we read it back.
with open('summary.txt', 'w') as f:
    summarize_runs(db(since=an_hour_ago), write=f.write)  # writes to 'summary.txt'
In [ ]:
# cat is a UNIX command for reading a text file. We could also just go open the file like normal people.
!cat summary.txt
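
Because write is just a callable that accepts a string, anything with that shape can receive the report. As a minimal sketch, the same pattern can collect the text in memory with io.StringIO. The fake_headers list and the trimmed-down summarize function below are hypothetical stand-ins (not part of Bluesky or databroker) so that the example runs on its own.

```python
import functools
import io
from types import SimpleNamespace

# Hypothetical stand-ins for databroker headers: each has .start and .stop dicts.
fake_headers = [
    SimpleNamespace(start={'plan_name': 'count', 'detectors': ['ph']},
                    stop={'exit_status': 'success'}),
    SimpleNamespace(start={'plan_name': 'scan', 'detectors': ['edge'],
                           'motors': ['motor_edge']},
                    stop={'exit_status': 'fail'}),
]

def summarize(headers, write=functools.partial(print, end='')):
    # Trimmed-down version of summarize_runs, without the time column.
    for h in headers:
        md = h.start
        write(f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']}\n")

buffer = io.StringIO()
summarize(fake_headers, write=buffer.write)  # nothing is printed here
report = buffer.getvalue()                   # the whole report as one string
print(report, end='')
```

Keeping the destination as a parameter means the formatting logic never needs to know whether it is talking to the screen, a file, or a buffer.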

If the user tells us more, our report can get richer

In [ ]:
RE.md['operator'] = 'Dan'

This applies to all future runs until deleted:

del RE.md['operator']

replaced

RE.md['operator'] = 'Tom'

or superseded

RE(count([ph]), operator='Maksim')

In that last example, 'Maksim' takes precedence over whatever is in RE.md, but just for this execution. If next we did

RE(count([ph]))

the operator would revert back to 'Tom'.
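
The precedence rule described above can be pictured with plain dictionaries. This is only an analogy, not Bluesky's implementation: persistent_md stands in for RE.md, the 'beamline' key is a made-up example, and effective_metadata is a hypothetical helper.

```python
# Illustrative only: plain dicts standing in for RE.md and per-run keyword arguments.
persistent_md = {'operator': 'Tom', 'beamline': 'demo'}

def effective_metadata(persistent, **per_run):
    # Later entries win, so per-run values override persistent ones
    # without modifying the persistent dictionary.
    return {**persistent, **per_run}

one_off = effective_metadata(persistent_md, operator='Maksim')
next_run = effective_metadata(persistent_md)

print(one_off['operator'])   # Maksim
print(next_run['operator'])  # Tom
```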

In [ ]:
# User reports the run's "purpose". (That isn't a special name... you can use any terms you want here...)
RE(count([ph]), purpose='test')
RE(count([ph, edge, slit], 3), purpose='test')
RE(scan([edge], motor_edge, -10, 10, 15), purpose='find edge')
RE(scan([edge], motor_edge, -1, 3, 5), purpose='find edge')
RE.md['operator'] = 'Tom'  # Tom takes over.
RE(scan([ph], motor_ph, -1, 3, 5), purpose='data')
RE(scan([slit], motor_slit, -10, 10, 15), purpose='data')
In [ ]:
def summarize_runs(headers):
    print("HH:MM  plan_name  detectors      motors         exit_status    purpose")
    for h in headers:
        md = h.start
        print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
              f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']:15}"
              f"{md.get('purpose', '?')}")
        
        
summarize_runs(db(since=an_hour_ago))

Filtering

We have been filtering based on time. We can filter on user-provided metadata like purpose or automatically-captured metadata like detectors. And we can apply multiple filters at the same time.

In [ ]:
summarize_runs(db(since=an_hour_ago, purpose='data'))
In [ ]:
summarize_runs(db(since=an_hour_ago, detectors='ph', purpose='test'))

There is a rich query language available here; we are just exercising the basics.
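
One way to picture what these keyword filters do is sketched below. It assumes MongoDB-style matching, in which a scalar query value matches either an equal scalar or a list that contains it; that assumption is why detectors='ph' can select runs that used several detectors. The matches helper and the sample documents are hypothetical and are not how databroker is implemented.

```python
def matches(start_doc, **query):
    """Return True if every query key matches the document (illustrative semantics)."""
    for key, wanted in query.items():
        if key not in start_doc:
            return False
        value = start_doc[key]
        if isinstance(value, list):
            # A list field matches if it contains the wanted value.
            if wanted not in value:
                return False
        elif value != wanted:
            return False
    return True

# Made-up start documents resembling the runs taken above.
docs = [
    {'plan_name': 'count', 'detectors': ['ph'], 'purpose': 'test'},
    {'plan_name': 'count', 'detectors': ['ph', 'edge', 'slit'], 'purpose': 'test'},
    {'plan_name': 'scan', 'detectors': ['ph'], 'motors': ['motor_ph'], 'purpose': 'data'},
]

selected = [d for d in docs if matches(d, detectors='ph', purpose='test')]
print(len(selected))  # 2
```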

What about getting the data itself?

Wait for the next notebook!

So what's happening inside print(...)?

A couple of handy Python concepts you might not have encountered before...

"f-strings" (new in Python 3.6!)

In [ ]:
name = "Dan"
age = 32

print("Hello my name is {name} and I am {age}.")

Add an f before the quote and it becomes a magical "f-string"!

In [ ]:
print(f"Hello my name is {name} and I am {age}.")

You can put code inside the {}s.

In [ ]:
print(f"Hello my name is {name} and next year I will be {1 + age}.")
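
The code inside the braces is not limited to arithmetic. As a small sketch using only built-in features, method calls and function calls work too (the detectors list here is just sample data):

```python
name = "Dan"
detectors = ['ph', 'edge']

# Each {...} holds an ordinary Python expression that is evaluated on the spot.
line = f"{name.upper()} used {len(detectors)} detectors: {', '.join(detectors)}"
print(line)  # DAN used 2 detectors: ph, edge
```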

Dictionary lookup with defaults

Recall basic dictionary manipulations:

In [ ]:
md = dict(plan_name='count', detectors=['ph', 'edge'], time=now)
In [ ]:
md['detectors']  # Look up the value for the 'detectors' key in the md dictionary.
In [ ]:
md['purpose']  # The user never specified a 'purpose' here, so this raises a KeyError.
In [ ]:
md.get('purpose', '?')  # Falls back to a default instead of erroring out.

list -> comma-separated string

In [ ]:
md.get('detectors', [])
In [ ]:
', '.join(md.get('detectors', []))
In [ ]:
', '.join(md.get('motors', []))  # Remember motors isn't set, so this falls back to the default, an empty list.
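
The two ideas combine into one small pattern, sketched here with a hypothetical helper named joined. It also shows why the default must be an empty list: join accepts any iterable of strings, so a bare string is joined character by character.

```python
def joined(md, key):
    # Missing keys fall back to an empty list, which joins to an empty string.
    return ', '.join(md.get(key, []))

md = {'plan_name': 'count', 'detectors': ['ph', 'edge']}

print(repr(joined(md, 'detectors')))  # 'ph, edge'
print(repr(joined(md, 'motors')))     # ''

# Pitfall: a bare string is iterated character by character.
print(', '.join('ph'))  # p, h
```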

time-munging

In [ ]:
md['time']  # seconds since 1970, the conventional "UNIX epoch"
In [ ]:
datetime.fromtimestamp(md['time'])  # year, month, day, hour, minute, second, microsecond
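
Note that fromtimestamp with a single argument converts to the local time zone of the machine running the code. A short sketch with a fixed, made-up timestamp shows how to ask for UTC explicitly, which gives the same answer everywhere:

```python
from datetime import datetime, timezone

ts = 1500000000  # seconds since the UNIX epoch (sample value)

local = datetime.fromtimestamp(ts)               # naive, in the local time zone
utc = datetime.fromtimestamp(ts, tz=timezone.utc)  # timezone-aware, in UTC

print(utc.isoformat())        # 2017-07-14T02:40:00+00:00
print(utc.timestamp() == ts)  # True: the conversion round-trips
```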

putting it all together...

In [ ]:
def summarize_runs(headers):
    print("HH:MM  plan_name  detectors      motors         exit_status")
    for h in headers:
        md = h.start
        print(f"{datetime.fromtimestamp(md['time'])}  "
              f"{md['plan_name']}"
              f"{','.join(md.get('detectors', []))}"
              f"{','.join(md.get('motors', []))}"
              f"{h.stop['exit_status']}")
In [ ]:
summarize_runs(db(since=an_hour_ago))

finishing touch: white space

Use :N to pad a value to a width of at least N characters.

In [ ]:
f"Hello my name is {name:10} and I am {age:5}."
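
Two details of :N are worth seeing in action. The sketch below adds square brackets to make the padding visible, and uses a made-up long name to show that the width is a minimum: longer values are not truncated.

```python
name = "Dan"
age = 32

# Strings are padded on the right by default; numbers are padded on the left.
padded = f"[{name:10}][{age:5}]"
print(padded)  # [Dan       ][   32]

# A value longer than the width is kept whole, which can push later columns over.
long_name = "Bartholomew"
overflow = f"[{long_name:10}]"
print(overflow)  # [Bartholomew]
```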

Format the time.

In [ ]:
f"{datetime.fromtimestamp(md['time']):%H:%M}"
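
For datetime objects, the text after the colon is passed to strftime, so any strftime code can be used there. A brief sketch with a fixed, made-up date:

```python
from datetime import datetime

dt = datetime(2019, 5, 17, 14, 3)

hours_minutes = f"{dt:%H:%M}"
date_only = f"{dt:%Y-%m-%d}"

print(hours_minutes)                          # 14:03
print(date_only)                              # 2019-05-17
print(hours_minutes == dt.strftime("%H:%M"))  # True
```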
In [ ]:
def summarize_runs(headers):
    print("HH:MM  plan_name  detectors      motors         exit_status")
    for h in headers:
        md = h.start
        print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
              f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']:11}")
In [ ]:
summarize_runs(db(since=an_hour_ago))

Exercise

Q1. Add an 'operator' column. Hint: Remember that 'operator' was reported by the user in some of our example data above, but the name has no special significance to Bluesky and is not guaranteed to be reported. To avoid erroring when it is not reported, you will need to use md.get(...) instead of md[...].

In [ ]:
# Type your answer here. We have pasted in the latest version of summarize_runs to start from.

def summarize_runs(headers):
    print("HH:MM  plan_name  detectors      motors         exit_status")
    for h in headers:
        md = h.start
        print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
              f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']:15}")
In [ ]:
%load solutions/summarize_runs_with_operator.py

Q2. Print a table with results filtered by operator, just as we filtered results by purpose.

In [ ]:
# Fill in the blank
# summarize_runs(db(_____))
In [ ]:
%load solutions/filter_runs_by_operator.py

Q3. Add seconds to the time column.

In [ ]:
# Type your answer here. We have pasted in the latest version of summarize_runs to start from.

def summarize_runs(headers):
    print("HH:MM  plan_name  detectors      motors         exit_status")
    for h in headers:
        md = h.start
        print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
              f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']:15}")
In [ ]:
%load solutions/summarize_runs_with_seconds.py

Q4. The md['uid'] is the guaranteed unique identifier for a run. It's unwieldy to print in its entirety. Print just the first 8 characters. (For practical purposes, this is sufficiently unique.)

Hint: String truncation in Python works like this:

'supercalifragilisticexpialidocious'[:8] == 'supercal'
In [ ]:
# Type your answer here. We have pasted in the latest version of summarize_runs to start from.

def summarize_runs(headers):
    print("HH:MM  plan_name  detectors      motors         exit_status")
    for h in headers:
        md = h.start
        print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
              f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']:15}")
In [ ]:
%load solutions/summarize_runs_with_uid.py

Q5. In all our examples, the columns are left-justified. The format specification mini-language documents how to right-justify or center the text. Right-justify the exit_status column. This is a bit of a contrived example, but the feature is more useful when the columns contain numerical data.

In [ ]:
# Type your answer here. We have pasted in the latest version of summarize_runs to start from.

def summarize_runs(headers):
    print("HH:MM  plan_name  detectors      motors         exit_status")
    for h in headers:
        md = h.start
        print(f"{datetime.fromtimestamp(md['time']):%H:%M}  "
              f"{md['plan_name']:11}"
              f"{','.join(md.get('detectors', [])):15}"
              f"{','.join(md.get('motors', [])):15}"
              f"{h.stop['exit_status']:15}")
In [ ]:
%load solutions/summarize_runs_right_justify_exit_status.py