Once we have data for GST, there are several algorithms we can run on it to produce tomographic estimates. Depending on the amount of data you have and the time available for running Gate Set Tomography, one algorithm may be preferable over the others. What is typically thought of as "standard GST" is the iterative maximum-likelihood optimization implemented by run_iterative_gst, which optimizes one or more objective functions iteratively. That is, it daisy-chains multiple optimizations, changing (typically adding to) the circuits under consideration or changing the objective function itself (e.g. from $\chi^2$ to the log-likelihood) between optimizations.
pygsti contains two main low-level algorithms that form the base of the GST protocol(s):
Linear gate set tomography (LGST): Uses short operation sequences to quickly compute a rough (low accuracy) estimate of a model by linear inversion. The model estimate produced has a very specific form: it holds the dense process matrices of each "gate" (or circuit layer) under consideration.
Long-sequence gate set tomography (LSGST): Minimizes an objective function that quantifies a badness-of-fit between a model and data set. Here "sequence" is synonymous with "circuit". The optimization is performed in stages, using successively larger sets of longer and longer circuits.
If you're curious, the implementations of these algorithms are found in the pygsti.algorithms.core module. In this tutorial, we'll show how to invoke each of these low-level algorithms directly. This is probably only useful when building your own protocol that needs to borrow pieces of GST without adopting the entire protocol.
Additionally, pygsti contains gauge-optimization algorithms. Because the outcome data (the input to the GST algorithms above) only determines a model up to some number of un-physical "gauge" degrees of freedom, it is often desirable to optimize the Model estimate obtained from a GST algorithm within the space of its gauge freedoms. This process is called "gauge-optimization", and the final part of this tutorial demonstrates how to gauge-optimize a model using various criteria.
The ingredients needed by the LGST and LSGST algorithms are:

A Model which defines the desired gates. This model is used by LGST to specify the various gate, state preparation, POVM effect, and SPAM labels, as well as to provide an initial guess for the gauge degrees of freedom.

A DataSet containing the data that GST attempts to fit using the probabilities generated by a single Model. This data set must at least contain the data for the circuits required by the chosen algorithm.

Lists of Circuit objects that specify which circuits are used during each iteration of the algorithm (the length of the top-level list defines the number of iterations).

import pygsti
import json
#We'll use the standard I, X(pi/2), Y(pi/2) model that we generated data for in the DataSet tutorial
from pygsti.modelpacks import smq1Q_XYI
target_model = smq1Q_XYI.target_model()
prep_fiducials = smq1Q_XYI.prep_fiducials()
meas_fiducials = smq1Q_XYI.meas_fiducials()
germs = smq1Q_XYI.germs()
maxLengthList = [1,2,4,8,16] #for use in iterative algorithms
ds = pygsti.io.load_dataset("../../tutorial_files/Example_Dataset.txt", cache=True)
dsLowCounts = pygsti.io.load_dataset("../../tutorial_files/Example_Dataset_LowCnts.txt", cache=True)
depol_model = target_model.depolarize(op_noise=0.1)
print("Loaded target model with operation labels: ", target_model.operations.keys())
print("Loaded fiducial lists of lengths: ", (len(prep_fiducials),len(meas_fiducials)))
print("Loaded dataset of length: ", len(ds))
Reading ../../tutorial_files/Example_Dataset_LowCnts.txt: 100%
Writing cache file (to speed future loads): ../../tutorial_files/Example_Dataset_LowCnts.txt.cache
Loaded target model with operation labels:  odict_keys([Label(()), Label(('Gxpi2', 0)), Label(('Gypi2', 0))])
Loaded fiducial lists of lengths:  (6, 6)
Loaded dataset of length:  1120
An important and distinguishing property of the LGST algorithm is that it does not require an initial-guess Model as an input. It uses linear inversion and short sequences to obtain a rough model estimate. As such, it is very common to use the LGST estimate as the initial-guess starting point for more advanced forms of GST.
#Run LGST to get an initial estimate for the gates in target_model based on the data in ds
mdl_lgst = pygsti.run_lgst(ds, prep_fiducials, meas_fiducials, target_model=target_model, verbosity=1)
#Gauge optimize the result to match the target model
mdl_lgst_after_gauge_opt = pygsti.gaugeopt_to_target(mdl_lgst, target_model, verbosity=1)
#Contract the result to CPTP, guaranteeing that the gates are CPTP
mdl_clgst = pygsti.contract(mdl_lgst_after_gauge_opt, "CPTP", verbosity=1)
--- LGST ---
Gauge optimization completed in 0.0500555s.
--- Contract to CPTP (direct) ---
The closest legal point found was distance: 0.0002851261395880569
print(mdl_lgst)
rho0 = FullState with dimension 4
  0.71 -0.02  0.03  0.75

Mdefault = UnconstrainedPOVM with effect vectors:
0: FullPOVMEffect with dimension 4
  0.73     0     0  0.65

1: FullPOVMEffect with dimension 4
  0.69     0     0 -0.65

[] = FullArbitraryOp with shape (4, 4)
  1.00     0     0     0
  0.02  0.88  0.03 -0.05
 -0.03 -0.03  0.88  0.02
     0     0     0  0.90

Gxpi2:0 = FullArbitraryOp with shape (4, 4)
  1.00     0     0     0
     0  0.90  0.03     0
 -0.02 -0.03 -0.03 -0.99
 -0.06  0.03  0.81     0

Gypi2:0 = FullArbitraryOp with shape (4, 4)
  1.00     0     0     0
     0     0 -0.05  1.00
  0.02     0  0.89     0
 -0.05 -0.80     0     0
Long-sequence GST operates by repeatedly fitting a model to a data set and using the result to seed a similar optimization over the data of more circuits. As such, a list-of-lists of circuits, one per iteration, is required. The elements of these lists are typically repetitions of short "germ" circuits such that the final circuit does not exceed some maximum length (depth). We use the create_lsgst_circuit_lists function to create such lists below.
#Get lists of circuits for successive iterations of LSGST to use
lsgstListOfLists = pygsti.circuits.create_lsgst_circuit_lists(target_model, prep_fiducials, meas_fiducials, germs, maxLengthList)
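As a quick sanity check, we can look at how these lists are structured. The loop below assumes only what the prose above states: one circuit list per maximum length in maxLengthList, with each iteration's list typically containing the previous one plus longer circuits.

#Inspect the per-iteration circuit lists (an optional check, not part of the protocol)
for i, (L, circuit_list) in enumerate(zip(maxLengthList, lsgstListOfLists)):
    print("Iteration %d: %d circuits (max germ-power length %d)" % (i + 1, len(circuit_list), L))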
A single iteration, whereby the parameters of a Model are adjusted to find the best fit to a DataSet, is performed by the run_gst_fit function. The "best fit" is defined as the model parameters which minimize some objective function. run_gst_fit takes as arguments a ModelDatasetCircuitStore, which simply bundles together a Model, DataSet, and set of Circuits, as well as an objective function builder (an object that encapsulates an objective function but lacks some needed context, and can build an objective function once given this context). For convenience, the run_gst_fit_simple function abstracts away the ModelDatasetCircuitStore construction.
For example, the cell below finds the model parameters which minimize the log-likelihood computed over the first list of circuits.
opt_result, mdl_single_optimization = \
pygsti.algorithms.run_gst_fit_simple(ds, mdl_clgst, lsgstListOfLists[0],
optimizer=None, objective_function_builder="logl",
resource_alloc=None, verbosity=1)
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- dlogl GST ---
  2*Delta(log(L)) = 48.8152 (92 data params - 60 (approx) model params = expected mean of 32; p-value = 0.0288947)
  Completed in 0.2s
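As a sanity check on the returned model, we can recompute the fit quality directly. The snippet below is a sketch: it assumes pygsti.tools.two_delta_logl accepts (model, dataset, circuits) in that order, which may vary between pyGSTi versions; its value should roughly match the 2*Delta(log(L)) reported above.

#Recompute 2*Delta(log L) of the optimized model on the same circuit list
#(two_delta_logl's argument order is assumed here; consult your version's docs)
two_dlogl = pygsti.tools.two_delta_logl(mdl_single_optimization, ds, lsgstListOfLists[0])
print("2*Delta(log L) =", two_dlogl)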
The iterative algorithm, which essentially runs run_gst_fit multiple times, is implemented by run_iterative_gst. The number of iterations is set by the number of circuit lists provided as the 3rd argument. It takes two lists of objective function builders: the first, iteration_objfn_builders, gives the objective functions to (sequentially) optimize on each iteration, while the second, final_objfn_builders, gives additional objective functions to optimize on the final iteration.
models, opt_results, cache = \
pygsti.algorithms.run_iterative_gst(ds, mdl_clgst, lsgstListOfLists,
optimizer={'tol': 1e-5}, resource_alloc=None, verbosity=2,
iteration_objfn_builders=['chi2'],
final_objfn_builders=['logl'])
mdl_lsgst = models[-1] # the model from the final iteration, i.e. the "final GST model"
--- Iterative GST: Iter 1 of 5  92 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 48.6364 (92 data params - 60 (approx) model params = expected mean of 32; p-value = 0.0300305)
  Completed in 0.1s
  Iteration 1 took 0.1s

--- Iterative GST: Iter 2 of 5  168 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 123.32 (168 data params - 60 (approx) model params = expected mean of 108; p-value = 0.14878)
  Completed in 0.2s
  Iteration 2 took 0.3s

--- Iterative GST: Iter 3 of 5  285 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 232.735 (285 data params - 60 (approx) model params = expected mean of 225; p-value = 0.347582)
  Completed in 0.2s
  Iteration 3 took 0.3s

--- Iterative GST: Iter 4 of 5  448 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 423.121 (448 data params - 60 (approx) model params = expected mean of 388; p-value = 0.105949)
  Completed in 0.4s
  Iteration 4 took 0.5s

--- Iterative GST: Iter 5 of 5  616 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 583.022 (616 data params - 60 (approx) model params = expected mean of 556; p-value = 0.20683)
  Completed in 0.5s
  Iteration 5 took 0.7s

Last iteration:
--- dlogl GST ---
  2*Delta(log(L)) = 583.536 (616 data params - 60 (approx) model params = expected mean of 556; p-value = 0.202591)
  Completed in 0.7s
  Final optimization took 0.7s

Iterative GST Total Time: 2.7s
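Because run_iterative_gst returns the model from every iteration, we can watch the estimate evolve as longer circuits are folded in. The comparison below uses the raw Frobenius distance to the target; since these intermediate models have not yet been gauge-optimized, treat the numbers only as a coarse trend.

#Compare each iteration's (not-yet-gauge-optimized) model to the target
for i, m in enumerate(models):
    print("Iteration %d: frobeniusdist(model, target) = %g" % (i + 1, m.frobeniusdist(target_model)))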
All gauge optimization algorithms perform essentially the same task: find the model which optimizes some objective function from within the set (or space) of models that are gauge-equivalent to some starting model. This is accomplished in pygsti using the following mechanism:

An ExplicitOpModel, call it mdl, is the model to be gauge-optimized.

A pygsti.models.gaugegroup.GaugeGroup instance defines a parameterized group of allowable gauge transformations. This gauge group must be compatible with mdl's parameterization, so that mdl.transform_inplace (which calls LinearOperator.transform_inplace and SPAMVec.transform_inplace) is able to process elements of the GaugeGroup (obtained via a call to GaugeGroup.element(params)). That is, the gauge transformation must map between models with the same parameterization (that given by mdl). Because of the close interplay between a model's parameterization and its allowed gauge transformations, Model objects can contain a GaugeGroup instance as their default_gauge_group member. In many circumstances, mdl.default_gauge_group is set to the correct gauge group to use for a given Model.

pygsti.gaugeopt_custom(...) takes an initial ExplicitOpModel, an objective function, and a GaugeGroup (along with other optimization parameters) and returns a gauge-optimized ExplicitOpModel. Note that if its gauge_group argument is left as None, then the model's default gauge group is used. An objective function which takes a single ExplicitOpModel argument and returns a float can be supplied, giving the user a fair amount of flexibility.

pygsti.gaugeopt_to_target(...) is a routine able to perform the most common types of gauge optimization. Instead of an objective function, gaugeopt_to_target takes a target ExplicitOpModel and additional arguments (see below) from which it constructs an objective function and then calls gaugeopt_custom. It is essentially a convenience routine for constructing common gauge-optimization objective functions. Relevant arguments which affect what objective function is used are:

target_model: the ExplicitOpModel to compare against, i.e., the one you want to gauge optimize toward. Note that this doesn't have to be a set of ideal gates: it can be any (imperfect) model that reflects your expectations about what the estimates should look like.

item_weights: a dictionary of weights allowing different gates and/or SPAM operations to be weighted differently when computing the objective function's value.

CPpenalty: a prefactor multiplying the sum of all the negative Choi-matrix eigenvalues corresponding to each of the gates.

TPpenalty: a prefactor multiplying the sum of absolute-value differences between the first row of each operation matrix and [1 0 ... 0], and the discrepancy between the first element of each state preparation vector and its expected value.

validSpamPenalty: a prefactor multiplying penalty terms enforcing the non-negativity of state-preparation eigenvalues and that POVM effect eigenvalues lie between 0 and 1.

gates_metric: how to compare corresponding gates in the gauge-optimized and target sets. "frobenius" uses the Frobenius norm (weighted before taking the final sqrt), "fidelity" uses the squared process infidelity (squared to avoid negative-infidelity issues in non-TP models), and "tracedist" uses the trace distance (weighted after computing the trace distance between corresponding gates).

spam_metric: how to compare corresponding SPAM vectors. "frobenius" (the default) should be used here, as "fidelity" and "tracedist" compare the "SPAM gates" -- the outer product of state prep and POVM effect vectors -- which isn't a meaningful metric.
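Since gaugeopt_custom isn't demonstrated elsewhere in this tutorial, here is a minimal sketch of calling it with a user-supplied objective. The objective used below (the total deviation of each gate's first row from the trace-preservation row [1, 0, ..., 0]) is purely illustrative; per the description above, any function mapping a single ExplicitOpModel to a float can be used, though keyword names may differ slightly between pyGSTi versions.

import numpy as np

def tp_violation(model):
    # Illustrative objective: sum over gates of how far each process
    # matrix's first row is from [1, 0, ..., 0] (the TP condition)
    val = 0.0
    for lbl, gate in model.operations.items():
        first_row = gate.to_dense()[0, :]
        ideal_row = np.zeros(len(first_row)); ideal_row[0] = 1.0
        val += float(np.linalg.norm(first_row - ideal_row))
    return val

#gauge_group=None means the model's default_gauge_group is used
mdl_custom_go = pygsti.gaugeopt_custom(mdl_lsgst.copy(), tp_violation, gauge_group=None, verbosity=1)

The cell below demonstrates some common usages of gaugeopt_to_target.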
mdl = mdl_lsgst.copy() #we'll use the MLGST result from above as an example
mdl_go1 = pygsti.gaugeopt_to_target(mdl, target_model) # optimization to the perfect target gates
mdl_go2 = pygsti.gaugeopt_to_target(mdl, depol_model) # optimization to a "guess" at what the estimate should be
mdl_go3 = pygsti.gaugeopt_to_target(mdl, target_model, {'gates': 1.0, 'spam': 0.01})
# weight the gates differently from the SPAM operations
mdl_go4 = pygsti.gaugeopt_to_target(mdl, target_model, {'gates': 1.0, 'spam': 0.01, ('Gxpi2', 0): 10.0, 'E0': 0.001})
# weight an individual gate/SPAM element separately (note that specific gate/SPAM labels -- which must
# match the labels used by the model, e.g. ('Gxpi2', 0) here -- always override the more general
# 'gates' and 'spam' weight values).
mdl_go5 = pygsti.gaugeopt_to_target(mdl, target_model, gates_metric="tracedist") #use trace distance instead of frobenius
from pygsti.models.gaugegroup import UnitaryGaugeGroup
print("Default gauge group = ",type(mdl.default_gauge_group)) # default is FullGaugeGroup
mdl_go6 = pygsti.gaugeopt_to_target(mdl, target_model, gauge_group=UnitaryGaugeGroup(1, 'pp'))
#gauge optimize only over unitary gauge transformations
print("\ngaugeopt_to_target output:")
mdl_go7 = pygsti.gaugeopt_to_target(mdl, target_model, verbosity=3) # show output
print("Final frobenius distance between mdl_go7 and target_model = ", mdl_go7.frobeniusdist(target_model))
Default gauge group =  <class 'pygsti.models.gaugegroup.FullGaugeGroup'>

gaugeopt_to_target output:
--- Gauge Optimization (ls method) ---
--- Outer Iter 0: norm_f = 0.0922348, mu=1, |x|=2, |J|=2.69209
--- Outer Iter 1: norm_f = 0.0922334, mu=0.00171089, |x|=2.00005, |J|=2.69205
--- Outer Iter 2: norm_f = 0.0922334, mu=0.000513266, |x|=2.00004, |J|=2.69205
Least squares message = Both actual and predicted relative reductions in the sum of squares are at most 1e-08
Gauge optimization completed in 0.0521772s.
Final frobenius distance between mdl_go7 and target_model =  0.03920744104767387
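Finally, note that all of mdl_go1 through mdl_go7 are gauge-equivalent to mdl, so any gauge-invariant quantity agrees among them; only gauge-variant comparisons, such as the Frobenius distance to a fixed target, distinguish them. A quick illustration:

#Gauge-variant distances depend on the gauge choice, e.g.:
for name, m in [("mdl_go1", mdl_go1), ("mdl_go5", mdl_go5), ("mdl_go7", mdl_go7)]:
    print(name, "frobeniusdist to target =", m.frobeniusdist(target_model))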