The following instructions were prepared using julia-1.1.1.
Before exploring the notebook you need to clone the main repository:
git clone https://github.com/kalmarek/1812.03456.git
This notebook should be located in the 1812.03456/notebooks directory.
In the main directory (1812.03456) you should run the following code in Julia's REPL console to instantiate the environment for computations:
using Pkg
Pkg.activate(".")
Pkg.instantiate()
(this needs to be done once per installation).
Instantiation should install (among others): the SCS solver, the JuMP package for mathematical programming and the IntervalArithmetic.jl package from ValidatedNumerics.jl.
The environment uses the Groups.jl, GroupRings.jl (which are built on the framework of AbstractAlgebra.jl) and PropertyT.jl packages.
The following programme certifies that $$\operatorname{Adj}_4 + \operatorname{Op}_4 - 0.82\Delta_4 =\Sigma_i \xi_i^*\xi_i \in \Sigma^2_2\mathbb{R}\operatorname{SL}(4,\mathbb{Z}).$$
With small changes (which we will indicate) it also certifies that $$\operatorname{Adj}_3 - 0.157999\Delta_3 \in \Sigma^2_2\mathbb{R}\operatorname{SL}(3,\mathbb{Z})$$ and that $$\operatorname{Adj}_5 +1.5 \mathrm{Op}_5 - 1.5\Delta_5 \in \Sigma^2_2\mathbb{R}\operatorname{SL}(5,\mathbb{Z}).$$
using Pkg
Pkg.activate("..")
using Dates
now()
using LinearAlgebra
using AbstractAlgebra
using Groups
using GroupRings
using PropertyT
So far we only made the needed packages available in the notebook.
In the next cell we define G to be the set of all $4\times 4$ matrices over $\mathbb Z$.
(For the second computation, set N=3 below; for the third, set N=5.)
N = 4
G = MatrixAlgebra(zz, N)
Now we create the elementary matrices $E_{i,j}$. The set of all such matrices and their inverses is denoted by S.
S = PropertyT.generating_set(G)
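To make the description of S concrete, here is a minimal sketch in plain Julia that builds the elementary matrices and their inverses by hand. The names elementary, n_toy and gens_toy are hypothetical and are not part of PropertyT.jl; the sketch only assumes that $E_{i,j}$ is the identity matrix with an extra 1 in position $(i,j)$, so that its inverse has $-1$ there.

```julia
using LinearAlgebra

# Hypothetical helper: identity matrix with entry ε placed at position (i, j), i ≠ j.
function elementary(n, i, j, ε=1)
    E = Matrix{Int}(I, n, n)
    E[i, j] = ε
    return E
end

n_toy = 4
# E_{i,j} (ε = 1) and their inverses (ε = -1) for all ordered pairs i ≠ j.
gens_toy = [elementary(n_toy, i, j, ε)
            for ε in (1, -1) for i in 1:n_toy for j in 1:n_toy if i != j]

@show length(gens_toy)   # 2 * n_toy * (n_toy - 1) matrices
```

Every matrix in gens_toy has determinant 1, so the set lies in $\operatorname{SL}(N,\mathbb{Z})$ and is closed under taking inverses.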
Now we will generate the ball E_R of radius $R=4$ in $\operatorname{SL}(N,\mathbb{Z})$ and use this as a (partial) basis in a group ring (denoted by RG below). Such a group ring also needs a multiplication table (pm, which is actually a division table), which is created as follows: when $x,y$ reside at positions i and j in E_R, then pm[i,j] = k, where k is the position of $x^{-1}y$ in E_R.
halfradius = 2
E_R, sizes = Groups.generate_balls(S, radius=2*halfradius);
E_rdict = GroupRings.reverse_dict(E_R)
pm = GroupRings.create_pm(E_R, E_rdict, sizes[halfradius]; twisted=true);
RG = GroupRing(G, E_R, E_rdict, pm)
@show sizes;
Δ = length(S)*one(RG) - sum(RG(s) for s in S)
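The division-table convention and the definition of the Laplacian can be tried out on a toy group. The sketch below uses the six permutations of $\{1,2,3\}$ in place of E_R and the three transpositions in place of S; all names with the _toy suffix are hypothetical and nothing here calls into GroupRings.jl. Since the toy group is finite and listed in full, every product stays inside the basis, which is not the case for a ball in $\operatorname{SL}(N,\mathbb{Z})$.

```julia
# Toy stand-in for E_R: the six permutations of 1:3, stored as image vectors,
# with the identity in first position.
elts_toy = [[1, 2, 3], [2, 1, 3], [1, 3, 2], [3, 2, 1], [2, 3, 1], [3, 1, 2]]
rdict_toy = Dict(p => i for (i, p) in enumerate(elts_toy))   # element -> position

compose(p, q) = p[q]   # (p∘q)(i) = p[q[i]]

# Division table: pm_toy[i, j] is the position of x⁻¹y for x = elts_toy[i], y = elts_toy[j].
pm_toy = [rdict_toy[compose(invperm(x), y)] for x in elts_toy, y in elts_toy]

# Coefficients of the Laplacian |S|⋅1 - Σ_{s∈S} s in the basis elts_toy,
# with S taken to be the three transpositions.
S_toy = elts_toy[2:4]
Δ_toy = zeros(Int, length(elts_toy))
Δ_toy[1] = length(S_toy)
for s in S_toy
    Δ_toy[rdict_toy[s]] -= 1
end
```

The diagonal of pm_toy points at the identity (since $x^{-1}x = 1$), and the coefficients of Δ_toy sum to zero, i.e. the Laplacian lies in the augmentation ideal.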
Now we exploit symmetry: in the next cell we split the subspace of $\mathbb{R} \operatorname{SL}(N, \mathbb{Z})$ supported on E_R into irreducible representations of the wreath product $\mathbb Z / 2 \mathbb Z \wr \operatorname{Sym}_N$. The action of the wreath product on the elements of the matrix space is by conjugation, i.e. permutation of rows and columns.
We also compute projections on the invariant subspaces to later speed up the optimisation step.
od = PropertyT.OrbitData(RG, WreathProduct(PermGroup(2), PermGroup(N)))
orbit_data = PropertyT.decimate(od);
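As a rough illustration of projecting onto an invariant subspace, the following sketch averages a matrix over the conjugation action of $\operatorname{Sym}_3$ (simultaneous permutation of rows and columns). It is a simplification under stated assumptions: the sign part $\mathbb{Z}/2\mathbb{Z}$ of the wreath product is left out, only the trivial representation is treated, and the names sym3, act and average_over_group are hypothetical rather than taken from PropertyT.jl.

```julia
# All permutations of 1:3, stored as image vectors.
sym3 = [[1, 2, 3], [2, 1, 3], [1, 3, 2], [3, 2, 1], [2, 3, 1], [3, 1, 2]]

# Conjugation action: permute rows and columns by the same permutation σ.
act(σ, M) = M[σ, σ]

# Averaging over the whole group projects M onto the subspace of invariant matrices.
average_over_group(M) = sum(act(σ, M) for σ in sym3) / length(sym3)

M = [1.0 2.0 0.0; 0.0 3.0 4.0; 5.0 0.0 6.0]
M_inv = average_over_group(M)
```

The result M_inv is fixed by every permutation and averaging it again changes nothing, which is the defining property of a projection. It has one common value on the diagonal and one common value off the diagonal, so the invariant subspace is much smaller than the full matrix space; this reduction in dimension is what speeds up the optimisation step.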
Now we define the elements $\operatorname{Adj}_N$ and $\operatorname{Op}_N$. The functions Sq, Adj, Op returning the appropriate elements are defined in the src/sqadjop.jl source file.
@time AdjN = PropertyT.Adj(RG, N)
@time OpN = PropertyT.Op(RG, N);
Finally we compute the element elt of our interest:
N=3: $\operatorname{elt} = \operatorname{Adj}_3$
N=4: $\operatorname{elt} = \operatorname{Adj}_4 + \operatorname{Op}_4$
N=5: $\operatorname{elt} = \operatorname{Adj}_5 + 1.5\operatorname{Op}_5.$
if N == 3
    k = 0
elseif N == 4
    k = 1
elseif N == 5
    k = 1.5
end
elt = AdjN + k*OpN;
elt.coeffs
We are ready to define the optimisation problem. The function
PropertyT.SOS_problem(x, Δ, orbit_data; upper_bound=UB)
defines an optimisation problem equivalent to the following one: \begin{align} \text{ maximize : } \quad & \lambda\\ \text{under constraints : }\quad & 0 \leqslant \lambda \leqslant \operatorname{UB},\\ & x - \lambda \Delta = \sum \xi_i^* \xi_i,\\ & \text{each $\xi_i$ is invariant under $\mathbb{Z}/2\mathbb{Z} \wr \operatorname{Sym}_N$}. \end{align}
# @time SDP_problem, varλ, varP = PropertyT.SOS_problem(elt, Δ, orbit_data)
if N == 3
    UB = 0.158
elseif N == 4
    UB = 0.82005
elseif N == 5
    UB = 1.5005
end
SDP_problem, varP = PropertyT.SOS_problem(elt, Δ, orbit_data; upper_bound=UB)
using JuMP
using SCS
λ = Ps = warm = nothing
Depending on the actual problem one may need to tweak the parameters given to the solver:
eps sets the requested accuracy;
max_iters sets the number of iterations to run before the solver gives up;
alpha is a parameter ($\alpha \in (0,2)$) which determines the rate of convergence at the cost of the accuracy;
acceleration_lookback: if you experience numerical instability in the scs log, this should be changed to 1 (at the cost of rate of convergence).
The parameters below should be enough to obtain a decent solution for $\operatorname{SL}(4, \mathbb{Z})$ and $\operatorname{SL}(5, \mathbb{Z})$.
For $\operatorname{SL}(3, \mathbb{Z})$ approximately 1_000_000 iterations are required; in this case, changing UB to $0.15$ (above) gives much faster convergence.
with_SCS = with_optimizer(SCS.Optimizer,
    linear_solver=SCS.Direct,
    eps=3e-13,
    max_iters=10000,
    alpha=1.5,
    acceleration_lookback=10,
    warm_start=true)
status, warm = PropertyT.solve(SDP_problem, with_SCS, warm);
λ = value(SDP_problem[:λ])
Ps = [value.(P) for P in varP]
@show(status, λ);
Now we reconstruct the solution to the original problem over $\mathbb{R} \operatorname{SL}(N,\mathbb{Z})$, which essentially boils down to averaging the obtained solution over the orbits of the wreath product action: $$Q=\frac{1}{|\Sigma|}\sum_{\sigma\in\Sigma}\sum_{\pi\in \widehat{\Sigma}} \dim{\pi}\cdot\sigma\left(U_{\pi}^T \sqrt{P_{\pi}} U_{\pi}\right).$$
Qs = real.(sqrt.(Ps));
Q = PropertyT.reconstruct(Qs, orbit_data);
As explained in the paper, the columns of the square root of the solution matrix provide the coefficients for the $\xi_i$'s in the basis E_R of the group ring. Below we compute the residual
$$ b = \left(x - \lambda\Delta\right) - \sum \xi_i^*\xi_i.$$
As we do this in floating-point arithmetic, the result is only indicative and cannot be taken as a rigorous bound.
function SOS_residual(x::GroupRingElem, Q::Matrix)
    RG = parent(x)
    @time sos = PropertyT.compute_SOS(RG, Q);
    return x - sos
end
residual = SOS_residual(elt - λ*Δ, Q)
@show norm(residual, 1);
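The role of the matrix square root and of the floating-point residual can be seen on a small example using only the LinearAlgebra standard library. The matrix P_toy below is a hypothetical positive-definite matrix standing in for a solution block; it is not produced by the solver, and the sketch does not use PropertyT.compute_SOS.

```julia
using LinearAlgebra

# Hypothetical small positive-definite matrix standing in for a solution matrix.
P_toy = Symmetric([2.0 1.0 0.0; 1.0 2.0 1.0; 0.0 1.0 2.0])

# Symmetric square root: Q_toy * Q_toy ≈ P_toy (up to rounding).
Q_toy = real(sqrt(P_toy))

# For any coefficient vector x, the quadratic form of P_toy is a sum of squares
# of the entries of Q_toy * x, one square per row (equivalently column) of Q_toy.
x = [1.0, -2.0, 0.5]
quadratic_form = dot(x, P_toy * x)
sum_of_squares = sum(abs2, Q_toy * x)

# Floating-point residual of the decomposition, measured in the entrywise ℓ1-norm.
residual_toy = Matrix(P_toy) - Q_toy' * Q_toy
@show norm(residual_toy, 1);
```

The residual is tiny but in general not exactly zero, which is why a floating-point computation alone does not certify the decomposition.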
using IntervalArithmetic
IntervalArithmetic.setrounding(Interval, :tight)
IntervalArithmetic.setformat(sigfigs=12);
Here we resort to interval arithmetic to provide certified upper and lower bounds on the norm of the residual.
The entries of Q are converted to narrow intervals.
The columns of Q are projected so that $0$ is in the sum of coefficients of each column (i.e. $\xi_i \in I \operatorname{SL}(N,\mathbb{Z})$).
The returned check_columns_augmentation is a boolean flag to detect if the projection was successful, i.e. if we can guarantee that each column of Q_aug can be represented by an element from the augmentation ideal. (If it were not successful, one may project Q = PropertyT.augIdproj(Q) in the floating point arithmetic prior to the cell below).
The resulting norm of the residual is guaranteed to be contained in the resulting interval. E.g. if each entry of Q were changed into an honest rational number and all the computations were carried out in rational arithmetic, the rational $\ell_1$-norm would be contained in the interval $\ell_1$-norm.
Q_aug, check_columns_augmentation = PropertyT.augIdproj(Interval, Q);
@assert check_columns_augmentation
elt_int = elt - @interval(λ)*Δ;
residual_int = SOS_residual(elt_int, Q_aug)
@show norm(residual_int, 1);
certified_λ = @interval(λ) - 2^2*norm(residual_int,1)
So $\operatorname{elt} - \lambda_0 \Delta \in \Sigma^2 I\operatorname{SL}(N, \mathbb{Z})$, where as $\lambda_0$ we could take the left end of the above interval:
certified_λ.lo
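The idea behind the certified bound can be sketched with a deliberately tiny, hand-rolled interval type. This is not the IntervalArithmetic.jl API: the type Iv and its operations are hypothetical, each operation simply widens the rounded result by one floating-point step on both sides so that the exact real result is enclosed. The numbers λ_toy and r_norm_toy are made-up illustrative values, not outputs of the computation above.

```julia
# Hypothetical minimal interval type (NOT IntervalArithmetic.jl).
struct Iv
    lo::Float64
    hi::Float64
end
Iv(x::Float64) = Iv(x, x)   # degenerate interval around an exactly representable number

# Outward rounding: step one float down/up after each rounded operation.
Base.:+(a::Iv, b::Iv) = Iv(prevfloat(a.lo + b.lo), nextfloat(a.hi + b.hi))
Base.:-(a::Iv, b::Iv) = Iv(prevfloat(a.lo - b.hi), nextfloat(a.hi - b.lo))
function Base.:*(a::Iv, b::Iv)
    ps = (a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi)
    return Iv(prevfloat(minimum(ps)), nextfloat(maximum(ps)))
end

λ_toy = 0.82                      # made-up solver output
r_norm_toy = Iv(1.0e-9, 1.1e-9)   # made-up enclosure of the ℓ1-norm of the residual

# Same shape as the formula above: λ - 2^2 * ‖residual‖₁, evaluated on intervals.
certified_toy = Iv(λ_toy) - Iv(4.0) * r_norm_toy
λ_lower = certified_toy.lo        # left end: a safe lower bound
```

Taking the left end of the interval guards against both the uncertainty in the residual norm and the rounding errors of the subtraction itself, at the price of a slightly smaller constant.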
using Dates
now()
versioninfo()