Once we have data for GST, there are several algorithms we can run on it to produce tomographic estimates. Depending on the amount of data you have and the time available for running Gate Set Tomography, one algorithm may be preferable over the others. What is typically thought of as "standard GST" is the iterative maximum-likelihood optimization implemented by run_iterative_gst, which optimizes one or more objective functions iteratively. That is, it daisy-chains multiple optimizations, changing (typically adding to) the circuits under consideration or changing the objective function itself (e.g. from $\chi^2$ to the log-likelihood) between optimizations.
pygsti contains two main low-level algorithms that form the base of the GST protocol(s):
Linear gate set tomography (LGST): Uses short operation sequences to quickly compute a rough (low accuracy) estimate of a model by linear inversion. The model estimate produced has a very specific form: it holds the dense process matrices of each "gate" (or circuit layer) under consideration.
Long-sequence gate set tomography (LSGST): Minimizes an objective function that quantifies a badness-of-fit between a model and data set. Here "sequence" is synonymous with "circuit". The optimization is performed in stages, using successively larger sets of longer and longer circuits.
If you're curious, the implementations of these algorithms are found in the pygsti.algorithms.core module. In this tutorial, we'll show how to invoke each of these low-level algorithms directly. This is probably only useful when building your own protocol that needs to borrow pieces of GST without adopting the entire protocol.
Additionally, pygsti contains gauge-optimization algorithms. Because the outcome data (the input to the GST algorithms above) only determines a model up to some number of un-physical "gauge" degrees of freedom, it is often desirable to optimize the Model estimate obtained from a GST algorithm within the space of its gauge freedoms. This process is called "gauge-optimization", and the final part of this tutorial demonstrates how to gauge-optimize a model using various criteria.
The ingredients needed by the LGST and LSGST algorithms are:

A Model which defines the desired gates. This model is used by LGST to specify the various gate, state preparation, POVM effect, and SPAM labels, as well as to provide an initial guess for the gauge degrees of freedom.

A DataSet containing the data that GST attempts to fit using the probabilities generated by a single Model. This data set must at least contain the data for the circuits required by the chosen algorithm.

Lists of Circuit objects that specify which circuits are used during each iteration of the algorithm (the length of the top-level list defines the number of iterations).

import pygsti
import json
#We'll use the standard I, X(pi/2), Y(pi/2) model that we generated data for in the DataSet tutorial
from pygsti.modelpacks import smq1Q_XYI
target_model = smq1Q_XYI.target_model()
prep_fiducials = smq1Q_XYI.prep_fiducials()
meas_fiducials = smq1Q_XYI.meas_fiducials()
germs = smq1Q_XYI.germs()
maxLengthList = [1,2,4,8,16] #for use in iterative algorithms
ds = pygsti.io.load_dataset("../../tutorial_files/Example_Dataset.txt", cache=True)
dsLowCounts = pygsti.io.load_dataset("../../tutorial_files/Example_Dataset_LowCnts.txt", cache=True)
depol_model = target_model.depolarize(op_noise=0.1)
print("Loaded target model with operation labels: ", target_model.operations.keys())
print("Loaded fiducial lists of lengths: ", (len(prep_fiducials),len(meas_fiducials)))
print("Loaded dataset of length: ", len(ds))
Reading ../../tutorial_files/Example_Dataset_LowCnts.txt: 100%
Writing cache file (to speed future loads): ../../tutorial_files/Example_Dataset_LowCnts.txt.cache
Loaded target model with operation labels:  odict_keys([Label(()), Label(('Gxpi2', 0)), Label(('Gypi2', 0))])
Loaded fiducial lists of lengths:  (6, 6)
Loaded dataset of length:  1120
An important and distinguishing property of the LGST algorithm is that it does not require an initial-guess Model as an input. It uses linear inversion and short sequences to obtain a rough model estimate. As such, it is very common to use the LGST estimate as the initial-guess starting point for more advanced forms of GST.
#Run LGST to get an initial estimate for the gates in target_model based on the data in ds
mdl_lgst = pygsti.run_lgst(ds, prep_fiducials, meas_fiducials, target_model=target_model, verbosity=1)
#Gauge optimize the result to match the target model
mdl_lgst_after_gauge_opt = pygsti.gaugeopt_to_target(mdl_lgst, target_model, verbosity=1)
#Contract the result to CPTP, guaranteeing that the gates are CPTP
mdl_clgst = pygsti.contract(mdl_lgst_after_gauge_opt, "CPTP", verbosity=1)
--- LGST ---
Gauge optimization completed in 0.0500555s.
--- Contract to CPTP (direct) ---
The closest legal point found was distance: 0.0002851261395880569
print(mdl_lgst)
rho0 = FullState with dimension 4
  0.71 -0.02  0.03  0.75

Mdefault = UnconstrainedPOVM with effect vectors:
0: FullPOVMEffect with dimension 4
  0.73     0     0  0.65

1: FullPOVMEffect with dimension 4
  0.69     0     0 -0.65

[] = FullArbitraryOp with shape (4, 4)
  1.00     0     0     0
  0.02  0.88  0.03 -0.05
 -0.03 -0.03  0.88  0.02
     0     0     0  0.90

Gxpi2:0 = FullArbitraryOp with shape (4, 4)
  1.00     0     0     0
     0  0.90  0.03     0
 -0.02 -0.03 -0.03 -0.99
 -0.06  0.03  0.81     0

Gypi2:0 = FullArbitraryOp with shape (4, 4)
  1.00     0     0     0
     0     0 -0.05  1.00
  0.02     0  0.89     0
 -0.05 -0.80     0     0
Long-sequence GST operates by repeatedly fitting a model to a data set and using the result to seed a similar optimization over the data of more circuits. As such, a list-of-lists of circuits, one per iteration, is required. The elements of these lists are typically repetitions of short "germ" circuits such that the final circuit does not exceed some maximum length (depth). We use the create_lsgst_circuit_lists function to create such lists below.
#Get lists of circuits for successive iterations of LSGST to use
lsgstListOfLists = pygsti.circuits.create_lsgst_circuit_lists(target_model, prep_fiducials, meas_fiducials, germs, maxLengthList)
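As a quick sanity check, we can look at how these lists are structured. The loop below assumes only what the prose above states: one circuit list per maximum length in maxLengthList, with each iteration's list typically containing the previous one plus longer circuits.

#Inspect the per-iteration circuit lists (an optional check, not part of the protocol)
for i, (L, circuit_list) in enumerate(zip(maxLengthList, lsgstListOfLists)):
    print("Iteration %d: %d circuits (max germ-power length %d)" % (i + 1, len(circuit_list), L))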
A single iteration, whereby the parameters of a Model are adjusted to find the best fit to a DataSet, is performed by the run_gst_fit function. The "best fit" is defined as the model parameters which minimize some objective function. run_gst_fit takes as arguments a ModelDatasetCircuitStore, which simply bundles together a Model, DataSet, and set of Circuits, as well as an objective function builder (an object that encapsulates an objective function but lacks some needed context, and can build an objective function once given this context). For convenience, the run_gst_fit_simple function abstracts away the ModelDatasetCircuitStore construction.
For example, the cell below finds the model parameters which minimize the log-likelihood computed over the first list of circuits.
opt_result, mdl_single_optimization = \
pygsti.algorithms.run_gst_fit_simple(ds, mdl_clgst, lsgstListOfLists[0],
optimizer=None, objective_function_builder="logl",
resource_alloc=None, verbosity=1)
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- dlogl GST ---
  2*Delta(log(L)) = 48.8152 (92 data params - 60 (approx) model params = expected mean of 32; p-value = 0.0288947)
  Completed in 0.2s
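As a sanity check on the returned model, we can recompute the fit quality directly. The snippet below is a sketch: it assumes pygsti.tools.two_delta_logl accepts (model, dataset, circuits) in that order, which may vary between pyGSTi versions; its value should roughly match the 2*Delta(log(L)) reported above.

#Recompute 2*Delta(log L) of the optimized model on the same circuit list
#(two_delta_logl's argument order is assumed here; consult your version's docs)
two_dlogl = pygsti.tools.two_delta_logl(mdl_single_optimization, ds, lsgstListOfLists[0])
print("2*Delta(log L) =", two_dlogl)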
The iterative algorithm, which essentially runs run_gst_fit multiple times, is implemented by run_iterative_gst. The number of iterations is set by the number of circuit lists provided as the 3rd argument. It takes two lists of objective function builders: the first, iteration_objfn_builders, gives the objective functions to (sequentially) optimize on each iteration, while the second, final_objfn_builders, gives additional objective functions to optimize on the final iteration.
models, opt_results, cache = \
pygsti.algorithms.run_iterative_gst(ds, mdl_clgst, lsgstListOfLists,
optimizer={'tol': 1e-5}, resource_alloc=None, verbosity=2,
iteration_objfn_builders=['chi2'],
final_objfn_builders=['logl'])
mdl_lsgst = models[-1] # the model from the final iteration, i.e. the "final GST model"
--- Iterative GST: Iter 1 of 5  92 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 48.6364 (92 data params - 60 (approx) model params = expected mean of 32; p-value = 0.0300305)
  Completed in 0.1s
  Iteration 1 took 0.1s

--- Iterative GST: Iter 2 of 5  168 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 123.32 (168 data params - 60 (approx) model params = expected mean of 108; p-value = 0.14878)
  Completed in 0.2s
  Iteration 2 took 0.3s

--- Iterative GST: Iter 3 of 5  285 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 232.735 (285 data params - 60 (approx) model params = expected mean of 225; p-value = 0.347582)
  Completed in 0.2s
  Iteration 3 took 0.3s

--- Iterative GST: Iter 4 of 5  448 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 423.121 (448 data params - 60 (approx) model params = expected mean of 388; p-value = 0.105949)
  Completed in 0.4s
  Iteration 4 took 0.5s

--- Iterative GST: Iter 5 of 5  616 circuits ---:
MatrixLayout: 1 processors divided into 1 x 1 (= 1) grid along circuit and parameter directions.
   1 atoms, parameter block size limits (None,)
*** Distributing 1 atoms to 1 atom-processing groups (1 cores) ***
  More atom-processors than hosts: each host gets ~1 atom-processors
  Atom-processors already occupy a single node, dividing atom-processor into 1 param-processors.
*** Divided 1-host atom-processor (~1 procs) into 1 param-processing groups ***
--- chi2 GST ---
  Sum of Chi^2 = 583.022 (616 data params - 60 (approx) model params = expected mean of 556; p-value = 0.20683)
  Completed in 0.5s
  Iteration 5 took 0.7s

Last iteration:
--- dlogl GST ---
  2*Delta(log(L)) = 583.536 (616 data params - 60 (approx) model params = expected mean of 556; p-value = 0.202591)
  Completed in 0.7s
  Final optimization took 0.7s

Iterative GST Total Time: 2.7s
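Because run_iterative_gst returns the model from every iteration, we can watch the estimate evolve as longer circuits are folded in. The comparison below uses the raw Frobenius distance to the target; since these intermediate models have not yet been gauge-optimized, treat the numbers only as a coarse trend.

#Compare each iteration's (not-yet-gauge-optimized) model to the target
for i, m in enumerate(models):
    print("Iteration %d: frobeniusdist(model, target) = %g" % (i + 1, m.frobeniusdist(target_model)))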
All gauge optimization algorithms perform essentially the same task: find the model which optimizes some objective function from within the set (or space) of models that are gauge-equivalent to some starting model. This is accomplished in pygsti using the following mechanism:

An ExplicitOpModel, call it mdl, is the model to be gauge-optimized.

A pygsti.models.gaugegroup.GaugeGroup instance defines a parameterized group of allowable gauge transformations. This gauge group must be compatible with mdl's parameterization, so that mdl.transform_inplace (which calls LinearOperator.transform_inplace and SPAMVec.transform_inplace) is able to process elements of the GaugeGroup (obtained via a call to GaugeGroup.element(params)). That is, the gauge transformation must map between models with the same parameterization (that given by mdl). Because of the close interplay between a model's parameterization and its allowed gauge transformations, Model objects can contain a GaugeGroup instance as their default_gauge_group member. In many circumstances, mdl.default_gauge_group is set to the correct gauge group to use for a given Model.

pygsti.gaugeopt_custom(...) takes an initial ExplicitOpModel, an objective function, and a GaugeGroup (along with other optimization parameters) and returns a gauge-optimized ExplicitOpModel. Note that if its gauge_group argument is left as None, then the model's default gauge group is used. An objective function which takes a single ExplicitOpModel argument and returns a float can be supplied, giving the user a fair amount of flexibility.

pygsti.gaugeopt_to_target(...) is a routine able to perform the most common types of gauge optimization. Instead of an objective function, gaugeopt_to_target takes a target ExplicitOpModel and additional arguments (see below) from which it constructs an objective function and then calls gaugeopt_custom. It is essentially a convenience routine for constructing common gauge-optimization objective functions. Relevant arguments which affect what objective function is used are:

target_model: the ExplicitOpModel to compare against, i.e., the one you want to gauge optimize toward. Note that this doesn't have to be a set of ideal gates: it can be any (imperfect) model that reflects your expectations about what the estimates should look like.

item_weights: a dictionary of weights allowing different gates and/or SPAM operations to be weighted differently when computing the objective function's value.

CPpenalty: a prefactor multiplying the sum of all the negative Choi-matrix eigenvalues corresponding to each of the gates.

TPpenalty: a prefactor multiplying the sum of absolute-value differences between the first row of each operation matrix and [1 0 ... 0], and the discrepancy between the first element of each state preparation vector and its expected value.

validSpamPenalty: a prefactor multiplying penalty terms enforcing the non-negativity of state-preparation eigenvalues and that POVM effect eigenvalues lie between 0 and 1.

gates_metric: how to compare corresponding gates in the gauge-optimized and target sets. "frobenius" uses the Frobenius norm (weighted before taking the final sqrt), "fidelity" uses the squared process infidelity (squared to avoid negative-infidelity issues in non-TP models), and "tracedist" uses the trace distance (weighted after computing the trace distance between corresponding gates).

spam_metric: how to compare corresponding SPAM vectors. "frobenius" (the default) should be used here, as "fidelity" and "tracedist" compare the "SPAM gates" -- the outer product of state prep and POVM effect vectors -- which isn't a meaningful metric.
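Since gaugeopt_custom isn't demonstrated elsewhere in this tutorial, here is a minimal sketch of calling it with a user-supplied objective. The objective used below (the total deviation of each gate's first row from the trace-preservation row [1, 0, ..., 0]) is purely illustrative; per the description above, any function mapping a single ExplicitOpModel to a float can be used, though keyword names may differ slightly between pyGSTi versions.

import numpy as np

def tp_violation(model):
    # Illustrative objective: sum over gates of how far each process
    # matrix's first row is from [1, 0, ..., 0] (the TP condition)
    val = 0.0
    for lbl, gate in model.operations.items():
        first_row = gate.to_dense()[0, :]
        ideal_row = np.zeros(len(first_row)); ideal_row[0] = 1.0
        val += float(np.linalg.norm(first_row - ideal_row))
    return val

#gauge_group=None means the model's default_gauge_group is used
mdl_custom_go = pygsti.gaugeopt_custom(mdl_lsgst.copy(), tp_violation, gauge_group=None, verbosity=1)

The cell below demonstrates some common usages of gaugeopt_to_target.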
mdl = mdl_lsgst.copy() #we'll use the MLGST result from above as an example
mdl_go1 = pygsti.gaugeopt_to_target(mdl, target_model) # optimization to the perfect target gates
mdl_go2 = pygsti.gaugeopt_to_target(mdl, depol_model) # optimization to a "guess" at what the estimate should be
mdl_go3 = pygsti.gaugeopt_to_target(mdl, target_model, {'gates': 1.0, 'spam': 0.01})
# weight the gates differently from the SPAM operations
mdl_go4 = pygsti.gaugeopt_to_target(mdl, target_model, {'gates': 1.0, 'spam': 0.01, ('Gxpi2', 0): 10.0, 'E0': 0.001})
# weight an individual gate/SPAM element separately (note that specific gate/SPAM labels -- which must
# match the labels used by the model, e.g. ('Gxpi2', 0) here -- always override the more general
# 'gates' and 'spam' weight values).
mdl_go5 = pygsti.gaugeopt_to_target(mdl, target_model, gates_metric="tracedist") #use trace distance instead of frobenius
from pygsti.models.gaugegroup import UnitaryGaugeGroup
print("Default gauge group = ",type(mdl.default_gauge_group)) # default is FullGaugeGroup
mdl_go6 = pygsti.gaugeopt_to_target(mdl, target_model, gauge_group=UnitaryGaugeGroup(1, 'pp'))
#gauge optimize only over unitary gauge transformations
print("\ngaugeopt_to_target output:")
mdl_go7 = pygsti.gaugeopt_to_target(mdl, target_model, verbosity=3) # show output
print("Final frobenius distance between mdl_go7 and target_model = ", mdl_go7.frobeniusdist(target_model))
Default gauge group =  <class 'pygsti.models.gaugegroup.FullGaugeGroup'>

gaugeopt_to_target output:
--- Gauge Optimization (ls method) ---
--- Outer Iter 0: norm_f = 0.0922348, mu=1, |x|=2, |J|=2.69209
--- Outer Iter 1: norm_f = 0.0922334, mu=0.00171089, |x|=2.00005, |J|=2.69205
--- Outer Iter 2: norm_f = 0.0922334, mu=0.000513266, |x|=2.00004, |J|=2.69205
Least squares message = Both actual and predicted relative reductions in the sum of squares are at most 1e-08
Gauge optimization completed in 0.0521772s.
Final frobenius distance between mdl_go7 and target_model =  0.03920744104767387
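Finally, note that all of mdl_go1 through mdl_go7 are gauge-equivalent to mdl, so any gauge-invariant quantity agrees among them; only gauge-variant comparisons, such as the Frobenius distance to a fixed target, distinguish them. A quick illustration:

#Gauge-variant distances depend on the gauge choice, e.g.:
for name, m in [("mdl_go1", mdl_go1), ("mdl_go5", mdl_go5), ("mdl_go7", mdl_go7)]:
    print(name, "frobeniusdist to target =", m.frobeniusdist(target_model))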