You are seeing the notebook output generated by Literate.jl from the Julia source file. The rendered HTML can be viewed in the docs.
In this notebook, we apply Gaussian process regression to the Mauna Loa CO₂ dataset. This showcases a rich combination of kernels, and how to handle and optimize all their parameters.
We make use of the following packages:
using CSV, DataFrames # data loading
using AbstractGPs # exact GP regression
using ParameterHandling # for nested and constrained parameters
using Optim # optimization
using Zygote # auto-diff gradient computation
using Plots # visualisation
Let's load and visualize the dataset.
Tip
The `let` block creates a new scope, so any utility variables we define in here won't leak outside. This is particularly helpful to keep notebooks tidy! The return value of the block is given by its last expression.
(xtrain, ytrain), (xtest, ytest) = let
data = CSV.read(joinpath(@__DIR__, "CO2_data.csv"), Tables.matrix; header=0)
year = data[:, 1]
co2 = data[:, 2]
# We split the data into training and testing set:
idx_train = year .< 2004
idx_test = .!idx_train
(year[idx_train], co2[idx_train]), (year[idx_test], co2[idx_test]) # block's return value
end
# The utility variables such as `idx_train` and `idx_test` are not available outside the `let` scope
function plotdata()
plot(; xlabel="year", ylabel="CO₂ [ppm]", legend=:bottomright)
scatter!(xtrain, ytrain; label="training data", ms=2, markerstrokewidth=0)
return scatter!(xtest, ytest; label="test data", ms=2, markerstrokewidth=0)
end
plotdata()
We will model this dataset using a sum of several kernels, which describe:
- a smooth long-term trend: a squared exponential kernel with a long lengthscale;
- seasonality: a periodic kernel with a fixed period of one year, multiplied by a squared exponential kernel so that the seasonal pattern can change slowly over time;
- medium-term irregularities: a rational quadratic kernel;
- noise terms: a squared exponential kernel with a short lengthscale, plus uncorrelated (white) observation noise.
For more details, see Rasmussen & Williams (2005), chapter 5.
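Schematically, writing k_long, k_per, k_decay, k_rq, and k_short for the component kernels above (these labels are just ours, not names from the packages) and σ_noise² for the white-noise variance, the combined covariance we will assemble in `build_gp_prior` below is
k(x, x′) = k_long(x, x′) + k_per(x, x′) · k_decay(x, x′) + k_rq(x, x′) + k_short(x, x′) + σ_noise² · δ(x, x′),
where δ(x, x′) is 1 if x == x′ and 0 otherwise.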
We will use ParameterHandling.jl for handling the (hyper)parameters of our model. It provides functions such as `positive` with which we can put constraints on the hyperparameters, and allows us to represent all required parameters as a nested NamedTuple:
# initial values to match http://stor-i.github.io/GaussianProcesses.jl/latest/mauna_loa/
θ_init = (;
se_long = (;
σ = positive(exp(4.0)),
ℓ = positive(exp(4.0)),
),
seasonal = (;
# product kernels only need a single overall signal variance
per = (;
ℓ = positive(exp(0.0)), # relative to period!
p = fixed(1.0), # 1 year, do not optimize over
),
se = (;
σ = positive(exp(1.0)),
ℓ = positive(exp(4.0)),
),
),
rq = (;
σ = positive(exp(0.0)),
ℓ = positive(exp(0.0)),
α = positive(exp(-1.0)),
),
se_short = (;
σ = positive(exp(-2.0)),
ℓ = positive(exp(-2.0)),
),
noise_scale = positive(exp(-2.0)),
)
(se_long = (σ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(3.9999999997270757, exp, 1.4901161193847656e-8), ℓ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(3.9999999997270757, exp, 1.4901161193847656e-8)), seasonal = (per = (ℓ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(-1.490116130486996e-8, exp, 1.4901161193847656e-8), p = ParameterHandling.Fixed{Float64}(1.0)), se = (σ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(0.9999999945181691, exp, 1.4901161193847656e-8), ℓ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(3.9999999997270757, exp, 1.4901161193847656e-8))), rq = (σ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(-1.490116130486996e-8, exp, 1.4901161193847656e-8), ℓ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(-1.490116130486996e-8, exp, 1.4901161193847656e-8), α = ParameterHandling.Positive{Float64, typeof(exp), Float64}(-1.0000000405055565, exp, 1.4901161193847656e-8)), se_short = (σ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(-2.000000110105522, exp, 1.4901161193847656e-8), ℓ = ParameterHandling.Positive{Float64, typeof(exp), Float64}(-2.000000110105522, exp, 1.4901161193847656e-8)), noise_scale = ParameterHandling.Positive{Float64, typeof(exp), Float64}(-2.000000110105522, exp, 1.4901161193847656e-8))
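As a quick aside (a minimal illustration with made-up values, not part of the original example), `ParameterHandling.value` strips these constraint wrappers and returns plain numbers; we will use it further below when building the posterior:
ParameterHandling.value((; a = positive(2.0), b = fixed(1.0)))  # returns approximately (a = 2.0, b = 1.0)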
We define a couple of helper functions to simplify the kernel construction:
SE(θ) = θ.σ^2 * with_lengthscale(SqExponentialKernel(), θ.ℓ)
Per(θ) = with_lengthscale(PeriodicKernel(; r=[θ.ℓ / 2]), θ.p) # NOTE- discrepancy with GaussianProcesses.jl
RQ(θ) = θ.σ^2 * with_lengthscale(RationalQuadraticKernel(; α=θ.α), θ.ℓ)
RQ (generic function with 1 method)
This allows us to write a function that, given the nested tuple of parameter values, constructs the GP prior:
function build_gp_prior(θ)
k_smooth_trend = SE(θ.se_long)
k_seasonality = Per(θ.seasonal.per) * SE(θ.seasonal.se)
k_medium_term_irregularities = RQ(θ.rq)
k_noise_terms = SE(θ.se_short) + θ.noise_scale^2 * WhiteKernel()
kernel = k_smooth_trend + k_seasonality + k_medium_term_irregularities + k_noise_terms
return GP(kernel) # `ZeroMean` mean function by default
end
build_gp_prior (generic function with 1 method)
To construct the posterior, we first need to build a `FiniteGP`, which represents the infinite-dimensional GP at a finite number of input features:
function build_finite_gp(θ)
f = build_gp_prior(θ)
return f(xtrain)
end
build_finite_gp (generic function with 1 method)
WhiteKernel vs. FiniteGP observation noise
In this notebook, we already included observation noise through the `WhiteKernel` as part of the GP prior covariance in `build_gp_prior`. We therefore call `f(xtrain)`, which implies zero (additional) observation noise.
Alternatively, we could have omitted the `θ.noise_scale^2 *
WhiteKernel()` term and instead passed the noise variance as a second
argument to the GP call in `build_finite_gp`, `f(xtrain,
θ.noise_scale^2)`.
These two approaches have slightly different semantics: In the first one,
the `WhiteKernel` contributes non-zero variance to the `[i, j]` element
of the covariance matrix of the `FiniteGP` if `xtrain[i] == xtrain[j]`
(based on the values of the features). In the second one, the observation
noise variance passed to `FiniteGP` only contributes to the diagonal
elements of the covariance matrix, i.e. for `i == j`.
Moreover, the variance (uncertainty) of the posterior predictions
includes the variance from the `WhiteKernel`, but does not include the
variance of the observation noise passed to the `FiniteGP`. To include
the observation noise in posterior predictions from the second approach,
call `fpost_opt(xtest, noise_scale^2)`.
Tip
For most use-cases and if in any doubt, we recommend that you pass in observation noise to the `FiniteGP`, and omit the explicit `WhiteKernel`. This is slightly faster (no need to check `xtrain[i] == xtrain[j]` for all pairs `i`, `j`), and `WhiteKernel` will not give stable gradients if you wish to compute the gradient of the log marginal likelihood `logpdf(f(x), y)` w.r.t. `x`.
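For reference, here is a minimal sketch of that recommended alternative (not used in this notebook; it assumes the `θ.noise_scale^2 * WhiteKernel()` term has been removed from `build_gp_prior`, and the function name is made up):
function build_finite_gp_with_noise(θ)
    f = build_gp_prior(θ)  # assumed here to no longer include the WhiteKernel term
    return f(xtrain, θ.noise_scale^2)  # observation noise variance added on the diagonal
end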
We obtain the posterior, conditioned on the (finite) observations, by calling `posterior`:
function build_posterior_gp(θ)
fx = build_finite_gp(θ)
return posterior(fx, ytrain)
end
build_posterior_gp (generic function with 1 method)
Now we can put it all together to obtain a `PosteriorGP`. The call to `ParameterHandling.value` is required to replace the constraints (such as `positive` in our case) with concrete numbers:
fpost_init = build_posterior_gp(ParameterHandling.value(θ_init))
AbstractGPs.PosteriorGP{AbstractGPs.GP{AbstractGPs.ZeroMean{Float64}, KernelFunctions.KernelSum{Tuple{KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.KernelProduct{Tuple{KernelFunctions.TransformedKernel{KernelFunctions.PeriodicKernel{Float64}, KernelFunctions.ScaleTransform{Float64}}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}}}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.RationalQuadraticKernel{Float64, Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.ScaledKernel{KernelFunctions.WhiteKernel, Float64}}}}, @NamedTuple{α::Vector{Float64}, C::LinearAlgebra.Cholesky{Float64, Matrix{Float64}}, x::Vector{Float64}, δ::Vector{Float64}}}(AbstractGPs.GP{AbstractGPs.ZeroMean{Float64}, KernelFunctions.KernelSum{Tuple{KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.KernelProduct{Tuple{KernelFunctions.TransformedKernel{KernelFunctions.PeriodicKernel{Float64}, KernelFunctions.ScaleTransform{Float64}}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}}}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.RationalQuadraticKernel{Float64, Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.ScaledKernel{KernelFunctions.WhiteKernel, Float64}}}}(AbstractGPs.ZeroMean{Float64}(), Sum of 5 kernels: Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 0.018315638888734182) - σ² = 2980.957987041728 Product of 2 kernels: Periodic Kernel, length(r) = 1 - Scale Transform (s = 1.0) Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 0.018315638888734182) - σ² = 7.3890560989306495 Rational Quadratic Kernel (α = 0.36787944117144233, metric = Distances.Euclidean(0.0)) - Scale Transform (s = 1.0) - σ² = 1.0 Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 7.3890560989306495) - σ² = 0.018315638888734182 White Kernel - σ² = 0.018315638888734182), (α = [-9.508930877725136, 19.389883608061695, -7.197900325224828, -4.730615302853674, -15.23471364637827, 18.62620628181198, 7.526138666092433, -0.03155846836128887, -13.788654737243691, 1.6524756606264315 … -4.04659725976334, -2.9334457010779236, 2.099011615147155, 4.200011736661871, 5.696790416685783, -11.011994620197953, 9.404617195251152, -5.495561190687362, 4.121759676625851, -1.6600889945065676], C = LinearAlgebra.Cholesky{Float64, Matrix{Float64}}([54.67525650985495 54.657791476465945 … 38.45054512284434 38.41521522921301; 2988.428769237939 1.381848520498082 … 2.1695516564590633 1.8834148423466848; … ; 2102.293417535266 2104.6198692274925 … 0.2317707151386545 
0.1990883709260124; 2100.3617465385078 2102.293417535265 … 2988.4287692372563 0.2317686017127864], 'U', 0), x = [1958.2083333333333, 1958.2916666666667, 1958.375, 1958.4583333333333, 1958.5416666666667, 1958.625, 1958.7083333333333, 1958.7916666666667, 1958.875, 1958.9583333333333 … 2003.2083333333333, 2003.2916666666667, 2003.375, 2003.4583333333333, 2003.5416666666667, 2003.625, 2003.7083333333333, 2003.7916666666667, 2003.875, 2003.9583333333333], δ = [315.71, 317.45, 317.5, 317.1, 315.86, 314.93, 313.2, 312.66, 313.33, 314.67 … 376.48, 377.74, 378.5, 378.18, 376.72, 374.31, 373.2, 373.1, 374.64, 375.93]))
Let's visualize what the GP fitted to the data looks like, for the initial choice of kernel hyperparameters.
We use the following function to plot a GP `f` on a specific range, using the AbstractGPs plotting recipes. By setting `ribbon_scale=2` we visualize the uncertainty band with ±2 (instead of the default ±1) standard deviations.
plot_gp!(f; label) = plot!(f(1920:0.2:2030); ribbon_scale=2, linewidth=1, label)
plotdata()
plot_gp!(fpost_init; label="posterior f(⋅)")
A reasonable fit to the data, but poor extrapolation away from the observations!
To improve the fit, we want to maximize the (log) marginal likelihood with respect to the hyperparameters. Since Optim.jl performs minimization, we define the loss as the negative log marginal likelihood:
function loss(θ)
fx = build_finite_gp(θ)
lml = logpdf(fx, ytrain) # this computes the log marginal likelihood
return -lml
end
loss (generic function with 1 method)
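As a sanity check (an optional extra step, not in the original notebook), the objective can be evaluated directly at the initial hyperparameters:
loss(ParameterHandling.value(θ_init))  # ≈ 228.57, matching iteration 0 of the optimization trace below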
Work-in-progress
In the future, we are planning to provide the `optimize_loss` utility function as part of JuliaGaussianProcesses -- for now, we just define it inline.
The L-BFGS parameters were chosen because they seem to work well empirically. You could also try with the defaults.
default_optimizer = LBFGS(;
alphaguess=Optim.LineSearches.InitialStatic(; scaled=true),
linesearch=Optim.LineSearches.BackTracking(),
)
function optimize_loss(loss, θ_init; optimizer=default_optimizer, maxiter=1_000)
options = Optim.Options(; iterations=maxiter, show_trace=true)
θ_flat_init, unflatten = ParameterHandling.value_flatten(θ_init)
loss_packed = loss ∘ unflatten
# https://julianlsolvers.github.io/Optim.jl/stable/#user/tipsandtricks/#avoid-repeating-computations
function fg!(F, G, x)
if F !== nothing && G !== nothing
val, grad = Zygote.withgradient(loss_packed, x)
G .= only(grad)
return val
elseif G !== nothing
grad = Zygote.gradient(loss_packed, x)
G .= only(grad)
return nothing
elseif F !== nothing
return loss_packed(x)
end
end
result = optimize(Optim.only_fg!(fg!), θ_flat_init, optimizer, options; inplace=false)
return unflatten(result.minimizer), result
end
optimize_loss (generic function with 1 method)
We now run the optimization:
θ_opt, opt_result = optimize_loss(loss, θ_init)
opt_result
Iter Function value Gradient norm 0 2.285662e+02 3.466984e+02 * time: 4.506111145019531e-5 1 1.597968e+02 5.811062e+01 * time: 0.5265400409698486 2 1.516902e+02 4.934826e+01 * time: 1.0012779235839844 3 1.311976e+02 4.436594e+01 * time: 1.2326030731201172 4 1.250485e+02 2.431021e+01 * time: 1.7128939628601074 5 1.208493e+02 7.951627e+00 * time: 1.9376680850982666 6 1.190798e+02 9.555929e+00 * time: 2.411310911178589 7 1.172102e+02 7.308331e+00 * time: 2.6527140140533447 8 1.161961e+02 5.340958e+00 * time: 3.1252670288085938 9 1.157434e+02 2.651424e+00 * time: 3.3527801036834717 10 1.156009e+02 1.236270e+00 * time: 3.5666980743408203 11 1.155254e+02 7.741628e-01 * time: 4.041176080703735 12 1.154964e+02 3.221284e-01 * time: 4.284119129180908 13 1.154897e+02 1.036930e+00 * time: 4.794978141784668 14 1.154860e+02 1.210659e+00 * time: 5.05358099937439 15 1.154829e+02 3.334882e-01 * time: 5.5147199630737305 16 1.154811e+02 2.493652e-01 * time: 5.723245143890381 17 1.154783e+02 5.218759e-01 * time: 5.967154026031494 18 1.154739e+02 6.706579e-01 * time: 6.4355480670928955 19 1.154657e+02 2.451446e+00 * time: 6.688038110733032 20 1.154480e+02 1.479048e+00 * time: 6.903762102127075 21 1.154182e+02 4.592655e-01 * time: 7.37571907043457 22 1.153886e+02 1.267364e+00 * time: 7.612437009811401 23 1.153628e+02 9.849155e-01 * time: 7.826198101043701 24 1.153570e+02 1.646326e+00 * time: 8.038887023925781 25 1.153257e+02 2.741751e+00 * time: 8.51312804222107 26 1.153176e+02 8.952727e-01 * time: 8.72532606124878 27 1.153146e+02 4.115828e-01 * time: 8.945369958877563 28 1.153093e+02 4.487567e-01 * time: 9.42802906036377 29 1.153034e+02 4.656332e-01 * time: 9.683174133300781 30 1.153010e+02 2.814319e-01 * time: 9.95576000213623 31 1.152961e+02 2.402583e-01 * time: 10.42322301864624 32 1.152938e+02 5.287938e-01 * time: 10.672600984573364 33 1.152920e+02 1.309567e-01 * time: 10.88507890701294 34 1.152914e+02 7.550968e-02 * time: 11.09777307510376 35 1.152911e+02 1.123292e-01 * time: 11.599746942520142 36 1.152905e+02 1.044834e-01 * time: 11.827867031097412 37 1.152902e+02 4.074496e-01 * time: 12.10360598564148 38 1.152899e+02 1.512259e-01 * time: 12.592885971069336 39 1.152894e+02 1.065919e-01 * time: 12.86594295501709 40 1.152890e+02 2.087131e-01 * time: 13.32337212562561 41 1.152887e+02 1.105464e-01 * time: 13.557969093322754 42 1.152886e+02 1.300625e-01 * time: 13.778734922409058 43 1.152884e+02 1.726311e-01 * time: 13.996907949447632 44 1.152883e+02 1.001641e-01 * time: 14.48740005493164 45 1.152882e+02 2.566597e-02 * time: 14.753740072250366 46 1.152881e+02 3.000688e-02 * time: 14.982991933822632 47 1.152880e+02 8.008848e-02 * time: 15.468786001205444 48 1.152879e+02 7.895790e-02 * time: 15.735589981079102 49 1.152878e+02 2.456469e-01 * time: 15.98135495185852 50 1.152876e+02 1.293619e-01 * time: 16.44363498687744 51 1.152874e+02 4.788862e-02 * time: 16.659332990646362 52 1.152874e+02 3.378284e-02 * time: 16.88279891014099 53 1.152873e+02 6.138932e-02 * time: 17.101464986801147 54 1.152872e+02 1.131192e-01 * time: 17.616990089416504 55 1.152870e+02 6.160510e-02 * time: 17.85941195487976 56 1.152868e+02 7.670725e-02 * time: 18.06210494041443 57 1.152865e+02 1.136733e-01 * time: 18.2756769657135 58 1.152861e+02 1.707358e-01 * time: 18.853105068206787 59 1.152859e+02 3.296803e-01 * time: 19.10324001312256 60 1.152857e+02 1.709746e-01 * time: 19.323065042495728 61 1.152855e+02 5.870519e-02 * time: 19.795124053955078 62 1.152855e+02 5.870449e-02 * time: 20.342180013656616 63 1.152855e+02 5.870449e-02 * time: 
21.089613914489746 64 1.152855e+02 6.378579e-02 * time: 21.617460012435913 65 1.152855e+02 6.380196e-02 * time: 22.22872805595398 66 1.152855e+02 6.380196e-02 * time: 23.16610598564148
* Status: success * Candidate solution Final objective value: 1.152855e+02 * Found with Algorithm: L-BFGS * Convergence measures |x - x'| = 0.00e+00 ≤ 0.0e+00 |x - x'|/|x'| = 0.00e+00 ≤ 0.0e+00 |f(x) - f(x')| = 0.00e+00 ≤ 0.0e+00 |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00 |g(x)| = 6.38e-02 ≰ 1.0e-08 * Work counters Seconds run: 23 (vs limit Inf) Iterations: 66 f(x) calls: 122 ∇f(x) calls: 66
The final value of the log marginal likelihood is:
-opt_result.minimum
-115.28552953175688
Warning
To avoid bad local optima, we could (and should) have carried out several random restarts with different initial values for the hyperparameters, and then picked the result with the highest marginal likelihood. We omit this for simplicity. For more details on how to fit GPs in practice, check out A Practical Guide to Gaussian Processes.
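For illustration, here is a minimal sketch of such a random-restart loop (not run here; the number of restarts and the perturbation scale of 0.1 in unconstrained space are arbitrary choices):
θ_flat, unflatten = ParameterHandling.flatten(θ_init)  # unconstrained vector and its inverse
restarts = map(1:5) do _
    θ_perturbed = unflatten(θ_flat .+ 0.1 .* randn(length(θ_flat)))
    optimize_loss(loss, θ_perturbed)  # returns (θ, result) as defined above
end
θ_best, best_result = argmin(r -> r[2].minimum, restarts)  # keep the lowest negative log marginal likelihood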
Let's construct the posterior GP with the optimized hyperparameters:
fpost_opt = build_posterior_gp(ParameterHandling.value(θ_opt))
AbstractGPs.PosteriorGP{AbstractGPs.GP{AbstractGPs.ZeroMean{Float64}, KernelFunctions.KernelSum{Tuple{KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.KernelProduct{Tuple{KernelFunctions.TransformedKernel{KernelFunctions.PeriodicKernel{Float64}, KernelFunctions.ScaleTransform{Float64}}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}}}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.RationalQuadraticKernel{Float64, Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.ScaledKernel{KernelFunctions.WhiteKernel, Float64}}}}, @NamedTuple{α::Vector{Float64}, C::LinearAlgebra.Cholesky{Float64, Matrix{Float64}}, x::Vector{Float64}, δ::Vector{Float64}}}(AbstractGPs.GP{AbstractGPs.ZeroMean{Float64}, KernelFunctions.KernelSum{Tuple{KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.KernelProduct{Tuple{KernelFunctions.TransformedKernel{KernelFunctions.PeriodicKernel{Float64}, KernelFunctions.ScaleTransform{Float64}}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}}}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.RationalQuadraticKernel{Float64, Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.ScaledKernel{KernelFunctions.TransformedKernel{KernelFunctions.SqExponentialKernel{Distances.Euclidean}, KernelFunctions.ScaleTransform{Float64}}, Float64}, KernelFunctions.ScaledKernel{KernelFunctions.WhiteKernel, Float64}}}}(AbstractGPs.ZeroMean{Float64}(), Sum of 5 kernels: Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 0.006433770319418152) - σ² = 193702.65168343997 Product of 2 kernels: Periodic Kernel, length(r) = 1 - Scale Transform (s = 1.0) Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 0.012141608741588681) - σ² = 6.021824896021731 Rational Quadratic Kernel (α = 0.000414331119017645, metric = Distances.Euclidean(0.0)) - Scale Transform (s = 0.03460188893362941) - σ² = 319.30676488252783 Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 8.754567915780282) - σ² = 0.028265792449666365 White Kernel - σ² = 0.03577915464642226), (α = [-6.405054233291955, 9.478105306102208, -3.9465447794003228, -2.2960659942451325, -7.262283464584002, 10.568438906760933, 4.764817643464915, -0.4524530569754075, -5.3032225545822405, 0.6818717130531068 … -2.41027458635354, -1.850374273887441, -0.08769478252709415, 2.9206620881820915, 3.8602291756555536, -4.849362941092445, 5.784746509096768, -2.596412769248565, 1.6291888793670009, -1.5585572426563472], C = LinearAlgebra.Cholesky{Float64, Matrix{Float64}}([440.48614543270895 440.4851485165153 … 421.8992014927274 421.83433071242655; 194027.60519039418 0.937152361691982 … 31.17812095066706 30.939700545184966; … ; 185840.75302666932 
185869.5510782053 … 0.2838762410681556 0.18244279688867943; 185812.17834670338 185840.75302666402 … 194027.6051903926 0.2838618686957846], 'U', 0), x = [1958.2083333333333, 1958.2916666666667, 1958.375, 1958.4583333333333, 1958.5416666666667, 1958.625, 1958.7083333333333, 1958.7916666666667, 1958.875, 1958.9583333333333 … 2003.2083333333333, 2003.2916666666667, 2003.375, 2003.4583333333333, 2003.5416666666667, 2003.625, 2003.7083333333333, 2003.7916666666667, 2003.875, 2003.9583333333333], δ = [315.71, 317.45, 317.5, 317.1, 315.86, 314.93, 313.2, 312.66, 313.33, 314.67 … 376.48, 377.74, 378.5, 378.18, 376.72, 374.31, 373.2, 373.1, 374.64, 375.93]))
This is the kernel with the point-estimated hyperparameters:
fpost_opt.prior.kernel
Sum of 5 kernels: Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 0.006433770319418152) - σ² = 193702.65168343997 Product of 2 kernels: Periodic Kernel, length(r) = 1 - Scale Transform (s = 1.0) Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 0.012141608741588681) - σ² = 6.021824896021731 Rational Quadratic Kernel (α = 0.000414331119017645, metric = Distances.Euclidean(0.0)) - Scale Transform (s = 0.03460188893362941) - σ² = 319.30676488252783 Squared Exponential Kernel (metric = Distances.Euclidean(0.0)) - Scale Transform (s = 8.754567915780282) - σ² = 0.028265792449666365 White Kernel - σ² = 0.03577915464642226
Let's print the optimized values of the hyperparameters in a more helpful format:
Work-in-progress
This is another utility function we would eventually like to move out of this notebook:
using Printf
show_params(nt::Union{Dict,NamedTuple}) = String(take!(show_params(IOBuffer(), nt)))
function show_params(io, nt::Union{Dict,NamedTuple}, indent::Int=0)
for (s, v) in pairs(nt)
if v isa Union{Dict,NamedTuple}
println(io, " "^indent, s, ":")
show_params(io, v, indent + 4)
else
println(io, " "^indent, s, " = ", @sprintf("%.3f", v))
end
end
return io
end
print(show_params(ParameterHandling.value(θ_opt)))
se_long:
    σ = 440.117
    ℓ = 155.430
seasonal:
    per:
        ℓ = 1.458
        p = 1.000
    se:
        σ = 2.454
        ℓ = 82.361
rq:
    σ = 17.869
    ℓ = 28.900
    α = 0.000
se_short:
    σ = 0.168
    ℓ = 0.114
noise_scale = 0.189
And, finally, we can visualize our optimized posterior GP:
plotdata()
plot_gp!(fpost_opt; label="optimized posterior f(⋅)")
Status `~/work/AbstractGPs.jl/AbstractGPs.jl/examples/1-mauna-loa/Project.toml`
  [99985d1d] AbstractGPs v0.5.23 `/home/runner/work/AbstractGPs.jl/AbstractGPs.jl#28319c5`
  [336ed68f] CSV v0.10.15
  [a93c6f00] DataFrames v1.7.0
  [98b081ad] Literate v2.20.1
  [429524aa] Optim v1.11.0
  [2412ca09] ParameterHandling v0.5.0
  [91a5bcdd] Plots v1.40.10
⌅ [e88e6eb3] Zygote v0.6.75
Info: Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why, use `status --outdated`.
To reproduce this notebook's package environment, you can download the full Manifest.toml.
Julia Version 1.11.4
Commit 8561cc3d68d (2025-03-10 11:36 UTC)
Build Info: Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × AMD EPYC 7763 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
Environment:
  JULIA_DEBUG = Documenter
  JULIA_LOAD_PATH = :/home/runner/.julia/packages/JuliaGPsDocs/7M86H/src
This notebook was generated using Literate.jl.