The DataSet tutorial covered the basics of using DataSet objects with time-independent counts. When your data is time-stamped, either count by count or in groups of counts, richer analysis options become available. The DataSet class can also store time-dependent data by holding a series of outcomes rather than binned numbers of counts. Such data is added with the add_raw_series_data and add_series_data methods. For add_raw_series_data, outcomes are given as at least two parallel arrays: 1) outcome labels and 2) time stamps. An optional third array of repetitions specifies how many times each outcome occurred at its time stamp. While in reality no two outcomes are recorded at exactly the same time, a DataSet allows arbitrarily coarse-grained time-dependent data in which multiple outcomes share the same time stamp. In fact, the "time-independent" case considered in the aforementioned tutorial is just the special case in which all of the data is stamped at time=0.
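To make the parallel-array picture concrete, here is a minimal pure-Python sketch that flattens a list of per-time-stamp count dictionaries into label, time-stamp and repetition lists. The helper name count_dicts_to_series and the 45/55 counts are made up for illustration; this is not how pyGSTi stores data internally.

```python
def count_dicts_to_series(count_dicts, times):
    """Flatten per-time-stamp count dictionaries into three parallel lists.

    Hypothetical helper for illustration only (not part of pyGSTi).
    """
    if len(count_dicts) != len(times):
        raise ValueError("need exactly one time stamp per count dictionary")
    labels, stamps, reps = [], [], []
    for counts, t in zip(count_dicts, times):
        for label, n in counts.items():
            if n > 0:  # outcomes that never occurred need no entry
                labels.append(label)
                stamps.append(t)
                reps.append(n)
    return labels, stamps, reps

# Two chunks of 100 counts each, taken at t=0.0 and t=1.0
chunked = count_dicts_to_series([{'0': 10, '1': 90}, {'0': 30, '1': 70}],
                                [0.0, 1.0])

# The time-independent case: a single dictionary, everything stamped at time 0
# (the 45/55 split is made-up example data)
static = count_dicts_to_series([{'0': 45, '1': 55}], [0.0])
```

In this picture a time-independent data set is simply a series whose time stamps are all zero.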
Below we demonstrate how to create and initialize a DataSet using time series data.
import pygsti
#Create an empty dataset
tdds = pygsti.objects.DataSet(outcomeLabels=['0','1'])
#Add a "single-shot" series of outcomes, where each spam label (outcome) has a separate time stamp
tdds.add_raw_series_data( ('Gx',), #gate sequence
['0','0','1','0','1','0','1','1','1','0'], #spam labels
[0.0, 0.2, 0.5, 0.6, 0.7, 0.9, 1.1, 1.3, 1.35, 1.5]) #time stamps
#When adding outcome-counts in "chunks" where the counts of each
# chunk occur at nominally the same time, use 'add_series_data' to
# add a list of count dictionaries with a timestamp given for each dict:
tdds.add_series_data( ('Gx','Gx'), #gate sequence
[{'0':10, '1':90}, {'0':30, '1':70}], #count dicts
[0.0, 1.0]) #time stamps - one per dictionary
#For even more control, you can specify the timestamp of each count
# event or group of identical outcomes that occur at the same time:
#Add 3 '0' outcomes at time 0.0, followed by 2 '1' outcomes at time 1.0
tdds.add_raw_series_data( ('Gy',), #gate sequence
['0','1'], #spam labels
[0.0, 1.0], #time stamps
[3,2]) #repeats
#The above coarse-grained addition is logically identical to:
# tdds.add_raw_series_data( ('Gy',), #gate sequence
# ['0','0','0','1','1'], #spam labels
# [0.0, 0.0, 0.0, 1.0, 1.0]) #time stamps
# (However, the DataSet will store the coarse-grained addition more efficiently.)
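The equivalence between the coarse-grained form (with repeats) and the single-shot form can be checked with a small pure-Python sketch. The function expand_series is a hypothetical helper written for this illustration; it does not use pyGSTi and is only meant to show what "logically identical" means here.

```python
def expand_series(labels, times, reps=None):
    """Expand (label, time, repetition) entries into one entry per outcome.

    Hypothetical helper for illustration only (not part of pyGSTi).
    """
    if reps is None:
        reps = [1] * len(labels)  # no repetitions given: every entry is one shot
    if not (len(labels) == len(times) == len(reps)):
        raise ValueError("labels, times and reps must have equal length")
    expanded_labels, expanded_times = [], []
    for label, t, n in zip(labels, times, reps):
        expanded_labels.extend([label] * n)
        expanded_times.extend([t] * n)
    return expanded_labels, expanded_times

# Coarse-grained form: 3 '0' outcomes at t=0.0, then 2 '1' outcomes at t=1.0
coarse = expand_series(['0', '1'], [0.0, 1.0], [3, 2])

# Single-shot form: one entry per outcome
single_shot = expand_series(['0', '0', '0', '1', '1'],
                            [0.0, 0.0, 0.0, 1.0, 1.0])
```

Both calls yield the same expanded lists, while the coarse-grained input needs only two entries instead of five.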
When you are done populating the DataSet with data, you should still call done_adding_data:
tdds.done_adding_data()
To access the underlying time series data, index the DataSet by gate sequence. This returns a DataSetRow object, just as in the time-independent case, which has various methods for retrieving its underlying data:
tdds_row = tdds[('Gx',)]
print("INFO for Gx string:\n")
print( tdds_row )
print( "Raw outcome label indices:", tdds_row.oli )
print( "Raw time stamps:", tdds_row.time )
print( "Raw repetitions:", tdds_row.reps )
print( "Number of entries in raw arrays:", len(tdds_row) )
print( "Outcome Labels:", tdds_row.outcomes )
print( "Repetition-expanded outcome labels:", tdds_row.get_expanded_ol() )
print( "Repetition-expanded outcome label indices:", tdds_row.get_expanded_oli() )
print( "Repetition-expanded time stamps:", tdds_row.get_expanded_times() )
print( "Time-independent-like counts per spam label:", tdds_row.counts )
print( "Time-independent-like total counts:", tdds_row.total )
print( "Time-independent-like spam label fraction:", tdds_row.fractions )
print("\n")
tdds_row = tdds[('Gy',)]
print("INFO for Gy string:\n")
print( tdds_row )
print( "Raw outcome label indices:", tdds_row.oli )
print( "Raw time stamps:", tdds_row.time )
print( "Raw repetitions:", tdds_row.reps )
print( "Number of entries in raw arrays:", len(tdds_row) )
print( "Spam Labels:", tdds_row.outcomes )
print( "Repetition-expanded outcome labels:", tdds_row.get_expanded_ol() )
print( "Repetition-expanded outcome label indices:", tdds_row.get_expanded_oli() )
print( "Repetition-expanded time stamps:", tdds_row.get_expanded_times() )
print( "Time-independent-like counts per spam label:", tdds_row.counts )
print( "Time-independent-like total counts:", tdds_row.total )
print( "Time-independent-like spam label fraction:", tdds_row.fractions )
INFO for Gx string:

Outcome Label Indices = [0 0 1 0 1 0 1 1 1 0]
Time stamps = [0. 0.2 0.5 0.6 0.7 0.9 1.1 1.3 1.35 1.5 ]
Repetitions = [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

Raw outcome label indices: [0 0 1 0 1 0 1 1 1 0]
Raw time stamps: [0. 0.2 0.5 0.6 0.7 0.9 1.1 1.3 1.35 1.5 ]
Raw repetitions: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
Number of entries in raw arrays: 10
Outcome Labels: [('0',), ('0',), ('1',), ('0',), ('1',), ('0',), ('1',), ('1',), ('1',), ('0',)]
Repetition-expanded outcome labels: [('0',), ('0',), ('1',), ('0',), ('1',), ('0',), ('1',), ('1',), ('1',), ('0',)]
Repetition-expanded outcome label indices: [0 0 1 0 1 0 1 1 1 0]
Repetition-expanded time stamps: [0. 0.2 0.5 0.6 0.7 0.9 1.1 1.3 1.35 1.5 ]
Time-independent-like counts per spam label: OutcomeLabelDict([(('0',), 5.0), (('1',), 5.0)])
Time-independent-like total counts: 10.0
Time-independent-like spam label fraction: OrderedDict([(('0',), 1.0)])

INFO for Gy string:

Outcome Label Indices = [0 1]
Time stamps = [0. 1.]
Repetitions = [3. 2.]

Raw outcome label indices: [0 1]
Raw time stamps: [0. 1.]
Raw repetitions: [3. 2.]
Number of entries in raw arrays: 2
Spam Labels: [('0',), ('1',)]
Repetition-expanded outcome labels: [('0',), ('0',), ('0',), ('1',), ('1',)]
Repetition-expanded outcome label indices: [0 0 0 1 1]
Repetition-expanded time stamps: [0. 0. 0. 1. 1.]
Time-independent-like counts per spam label: OutcomeLabelDict([(('0',), 3.0), (('1',), 2.0)])
Time-independent-like total counts: 5.0
Time-independent-like spam label fraction: OrderedDict([(('0',), 1.0)])
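The "time-independent-like" counts and totals printed above amount to summing repetitions per outcome label while ignoring the time stamps. The sketch below shows that idea in pure Python and adds an optional half-open time window [t_start, t_stop). The function collapse_counts and its parameters are hypothetical names chosen for this illustration, not pyGSTi API.

```python
def collapse_counts(labels, times, reps=None, t_start=None, t_stop=None):
    """Sum repetitions per outcome label, optionally within [t_start, t_stop).

    Hypothetical helper for illustration only (not part of pyGSTi).
    """
    if reps is None:
        reps = [1] * len(labels)  # single-shot data: each entry counts once
    counts = {}
    for label, t, n in zip(labels, times, reps):
        if t_start is not None and t < t_start:
            continue  # before the window
        if t_stop is not None and t >= t_stop:
            continue  # at or after the end of the window
        counts[label] = counts.get(label, 0) + n
    return counts, sum(counts.values())

# The ('Gx',) series from above: ten single-shot outcomes
gx_labels = ['0', '0', '1', '0', '1', '0', '1', '1', '1', '0']
gx_times = [0.0, 0.2, 0.5, 0.6, 0.7, 0.9, 1.1, 1.3, 1.35, 1.5]
gx_counts, gx_total = collapse_counts(gx_labels, gx_times)

# The ('Gy',) series from above: two entries with repetitions 3 and 2
gy_counts, gy_total = collapse_counts(['0', '1'], [0.0, 1.0], [3, 2])

# Only the Gx outcomes recorded before t=1.0
gx_early, gx_early_total = collapse_counts(gx_labels, gx_times, t_stop=1.0)
```

Without a window, the results agree with the counts and totals printed for the two rows (5 and 5 out of 10 for Gx, 3 and 2 out of 5 for Gy).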
Finally, it is possible to read text-formatted time-dependent data in the special case when each outcome label is a single character and the time stamps of each sequence's outcomes are the consecutive integers starting at zero.
This corresponds to the case when each sequence is performed and measured simultaneously at equally spaced intervals. We realize this is a bit fictitious, and more text-format input options will be created in the future.
tddataset_txt = \
"""## 0 = 0
## 1 = 1
{} 011001
Gx 111000111
Gy 11001100
"""
with open("../../tutorial_files/TDDataset.txt","w") as output:
    output.write(tddataset_txt)
tdds_fromfile = pygsti.io.load_tddataset("../../tutorial_files/TDDataset.txt")
print(tdds_fromfile)
print("Some tests:")
print(tdds_fromfile[()].fraction('1'))
print(tdds_fromfile[('Gy',)].fraction('1'))
print(tdds_fromfile[('Gx',)].total)
Loading ../../tutorial_files/TDDataset.txt: 100%
Dataset outcomes: OrderedDict([(('0',), 0), (('1',), 1)])
{} :
Outcome Label Indices = [0 1 1 0 0 1]
Time stamps = [0. 1. 2. 3. 4. 5.]
( no repetitions )

Gx :
Outcome Label Indices = [1 1 1 0 0 0 1 1 1]
Time stamps = [0. 1. 2. 3. 4. 5. 6. 7. 8.]
( no repetitions )

Gy :
Outcome Label Indices = [1 1 0 0 1 1 0 0]
Time stamps = [0. 1. 2. 3. 4. 5. 6. 7.]
( no repetitions )

Some tests:
0.5
0.5
9.0
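To illustrate how such a text file maps onto time series, here is a rough pure-Python sketch of a reader for the layout shown above. It assumes that lines starting with '#' can be skipped, that each remaining line is a sequence string followed by a string of single-character outcomes, and that the i-th character is stamped with integer time i. The names parse_td_text and fraction are hypothetical; this is not the parser behind pygsti.io.load_tddataset.

```python
def parse_td_text(text):
    """Parse 'sequence outcomes' lines into {sequence: (labels, times)}.

    Hypothetical helper for illustration only (not part of pyGSTi).
    """
    data = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # assumption: header/comment lines are ignored in this sketch
        seq_str, outcome_str = line.split()
        labels = list(outcome_str)              # one single-character outcome each
        times = list(range(len(outcome_str)))   # integer time stamps 0, 1, 2, ...
        data[seq_str] = (labels, times)
    return data

def fraction(labels, label):
    """Fraction of outcomes equal to `label`, ignoring time stamps."""
    return labels.count(label) / len(labels)

example_text = """## 0 = 0
## 1 = 1
{} 011001
Gx 111000111
Gy 11001100
"""

parsed = parse_td_text(example_text)
empty_fraction = fraction(parsed['{}'][0], '1')
gy_fraction = fraction(parsed['Gy'][0], '1')
gx_total = len(parsed['Gx'][0])
```

Under these assumptions the sketch gives the same fractions (0.5 and 0.5) and the same Gx total (9 outcomes) as the "Some tests" output above.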