The goal of this notebook is to replicate the results from chapter 5, p.118, of Gaussian Processes for Machine Learning.
I did not find a way on how to make GPflow perform this additive decomposition and therefore I decided to implement it by hand. In this set-up GPflow is used to find the correct parameters and the hand writte code is used to decompose the addtitive parts.
There is currently work under-way to add this functionality to GPflow. For more details have a look at Predicting component contributions #234.
import warnings
from IPython.display import display_html
%load_ext watermark
%watermark -a 'Christian Schuhegger' -u -d -v -p numpy,pandas,matplotlib,seaborn,gpflow,GPy,bokeh,statsmodels
/home/user/cs/local/install/Anaconda3-5.1.0-Linux-x86_64/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. from ._conv import register_converters as _register_converters
Christian Schuhegger last updated: 2018-05-05 CPython 3.6.4 IPython 6.2.1 numpy 1.14.2 pandas 0.22.0 matplotlib 2.2.2 seaborn 0.8.1 gpflow 1.1.0 GPy 1.9.2 bokeh 0.12.15 statsmodels 0.8.0
%matplotlib inline
import numpy as np, pandas as pd, matplotlib.pyplot as plt, seaborn as sns, gpflow
Let's use the bokeh library to make the chart interactive and so that you can zoom in. The hints on how to use bokeh on these examples came from Looking at the Keeling Curve with GPs in PyMC3.
import bokeh.plotting, bokeh.models, bokeh.io, bokeh.palettes
bokeh.io.output_notebook()
We take the raw data from the statsmodels package:
# The data used to be available at http://cdiac.esd.ornl.gov/ftp/trends/co2/maunaloa.co2,
# but now it seems that more up-to-date data can be found at
# ftp://ftp.cmdl.noaa.gov/ccg/co2/trends/co2_mm_mlo.txt
import statsmodels.api as sm
mauna_loa_atmospheric_CO2_concentration_data = sm.datasets.get_rdataset("CO2")
print(mauna_loa_atmospheric_CO2_concentration_data.__doc__)
data1 = mauna_loa_atmospheric_CO2_concentration_data.data
data1.head()
/home/user/cs/local/install/Anaconda3-5.1.0-Linux-x86_64/lib/python3.6/site-packages/statsmodels/compat/pandas.py:56: FutureWarning:The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.
+-----+-----------------+ | co2 | R Documentation | +-----+-----------------+ Mauna Loa Atmospheric CO2 Concentration --------------------------------------- Description ~~~~~~~~~~~ Atmospheric concentrations of CO\ *2* are expressed in parts per million (ppm) and reported in the preliminary 1997 SIO manometric mole fraction scale. Usage ~~~~~ :: co2 Format ~~~~~~ A time series of 468 observations; monthly from 1959 to 1997. Details ~~~~~~~ The values for February, March and April of 1964 were missing and have been obtained by interpolating linearly between the values for January and May of 1964. Source ~~~~~~ Keeling, C. D. and Whorf, T. P., Scripps Institution of Oceanography (SIO), University of California, La Jolla, California USA 92093-0220. ftp://cdiac.esd.ornl.gov/pub/maunaloa-co2/maunaloa.co2. References ~~~~~~~~~~ Cleveland, W. S. (1993) *Visualizing Data*. New Jersey: Summit Press. Examples ~~~~~~~~ :: require(graphics) plot(co2, ylab = expression("Atmospheric concentration of CO"[2]), las = 1) title(main = "co2 data set")
time | value | |
---|---|---|
0 | 1959.000000 | 315.42 |
1 | 1959.083333 | 316.31 |
2 | 1959.166667 | 316.50 |
3 | 1959.250000 | 317.56 |
4 | 1959.333333 | 318.13 |
We will make the data available in two formats, as row vectors as lower case x and y and as column vectors as upper case X and Y.
x = np.array(data1.time)
y = np.array(data1.value)
X = x.reshape(-1, 1)
Y = y.reshape(-1, 1)
p = bokeh.plotting.figure(title="Fit to the Mauna Loa Data", #x_axis_type='datetime',
plot_width=900, plot_height=600)
p.yaxis.axis_label = 'CO2 [ppm]'
p.xaxis.axis_label = 'Date'
# true value
p.circle(x, y, color="black", legend="Observed data")
p.legend.location = "top_left"
bokeh.plotting.show(p)
Please use the GUI elements on the right top of the graph for moving and zooming into the graph.
As GPflow performs some optimization procedure to find the best fit parameters we will set-up the different aspects of the kernel functions with good starting points taken from chapter 5 of Gaussian Processes for Machine Learning. In principle you could use arbitrary starting point values and it is the job of GPflow to find the best fit parameters.
As in chapter 5 of Gaussian Processes for Machine Learning we create the total fit as an additive composition of different aspects.
To be honest, I am not sure why k4 includes the white noise at all, because in principle the gaussian process provides this white noise by default via the σn parameter (see Gaussian Process Parameter Effects for details), which is called m.likelihood.variance
in GPflow.
k1_ = gpflow.kernels.RBF(1, variance=(66.0 ** 2), lengthscales=67.0)
k2_exp_sine_squred_gamma= 2.0 / 1.3 ** 2.0
k2_exp_sine_squred_period = 1.0
k2_ = gpflow.kernels.RBF(1, variance=(2.4 ** 2.0), lengthscales=90.0) * gpflow.kernels.Periodic(1, period=k2_exp_sine_squred_period, variance=1.0, lengthscales=1.0/k2_exp_sine_squred_gamma)
# k3 = how to do a rational quadratic term in GPflow?
k4_ = gpflow.kernels.RBF(1, variance=(0.18 ** 2), lengthscales=1.6) + gpflow.kernels.White(1, variance=0.19)
k1 = k1_
k2 = k2_
k4 = k4_
kernel = k1 + k2 + k4
k_ = kernel
m = gpflow.models.GPR(X, Y, kern=kernel)
m.likelihood.variance = 0.01
m.as_pandas_table()
/home/user/cs/local/install/Anaconda3-5.1.0-Linux-x86_64/lib/python3.6/site-packages/gpflow/densities.py:89: UserWarning:Shape of x must be 2D at computation.
class | prior | transform | trainable | shape | fixed_shape | value | |
---|---|---|---|---|---|---|---|
GPR/kern/rbf_1/variance | Parameter | None | +ve | True | () | True | 4356.0 |
GPR/kern/rbf_1/lengthscales | Parameter | None | +ve | True | () | True | 67.0 |
GPR/kern/product/rbf/variance | Parameter | None | +ve | True | () | True | 5.76 |
GPR/kern/product/rbf/lengthscales | Parameter | None | +ve | True | () | True | 90.0 |
GPR/kern/product/periodic/variance | Parameter | None | +ve | True | () | True | 1.0 |
GPR/kern/product/periodic/lengthscales | Parameter | None | +ve | True | () | True | 0.8450000000000001 |
GPR/kern/product/periodic/period | Parameter | None | +ve | True | () | True | 1.0 |
GPR/kern/rbf_2/variance | Parameter | None | +ve | True | () | True | 0.0324 |
GPR/kern/rbf_2/lengthscales | Parameter | None | +ve | True | () | True | 1.6 |
GPR/kern/white/variance | Parameter | None | +ve | True | () | True | 0.19 |
GPR/likelihood/variance | Parameter | None | +ve | True | () | True | 0.01 |
%%time
opt = gpflow.train.ScipyOptimizer()
opt.minimize(m)
INFO:tensorflow:Optimization terminated with: Message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH' Objective function value: 129.032561 Number of iterations: 91 Number of functions evaluations: 133 CPU times: user 3.69 s, sys: 447 ms, total: 4.14 s Wall time: 3.08 s
m.as_pandas_table()
class | prior | transform | trainable | shape | fixed_shape | value | |
---|---|---|---|---|---|---|---|
GPR/kern/rbf_1/variance | Parameter | None | +ve | True | () | True | 4356.219071780713 |
GPR/kern/rbf_1/lengthscales | Parameter | None | +ve | True | () | True | 48.565715481551585 |
GPR/kern/product/rbf/variance | Parameter | None | +ve | True | () | True | 5.64144286809577 |
GPR/kern/product/rbf/lengthscales | Parameter | None | +ve | True | () | True | 91.88714901784704 |
GPR/kern/product/periodic/variance | Parameter | None | +ve | True | () | True | 1.026393528093981 |
GPR/kern/product/periodic/lengthscales | Parameter | None | +ve | True | () | True | 0.6709642212683404 |
GPR/kern/product/periodic/period | Parameter | None | +ve | True | () | True | 0.9997023123140168 |
GPR/kern/rbf_2/variance | Parameter | None | +ve | True | () | True | 0.16043370929892875 |
GPR/kern/rbf_2/lengthscales | Parameter | None | +ve | True | () | True | 0.32753755974365617 |
GPR/kern/white/variance | Parameter | None | +ve | True | () | True | 0.03193821948646309 |
GPR/likelihood/variance | Parameter | None | +ve | True | () | True | 0.010237414042966209 |
Let's create a visualization of the fitted overall model. I will only plot the fit between 1990 and 2000, to on the one hand show how nicely the regression curve fits the existing data points and to show how gaussian processes can be used to predict the near term future behaviour together with some confidence interval.
xx = np.linspace(1990, 2000, 1000).reshape(-1,1)
def plot(m, with_data=True):
if not isinstance(m, list):
m = [m]
predictions = [model.predict_y(xx) for model in m]
p = bokeh.plotting.figure(title="Fit to the Mauna Loa Data", #x_axis_type='datetime',
plot_width=900, plot_height=600)
p.yaxis.axis_label = 'CO2 [ppm]'
p.xaxis.axis_label = 'Date'
if with_data:
# true value
p.circle(x, y, color="black", legend="Observed data")
for mean, var in predictions:
xpts = xx[:,0]
p.line(xpts, mean[:,0], line_width=1)
lowerband = mean[:,0] - 2*np.sqrt(var[:,0])
upperband = mean[:,0] + 2*np.sqrt(var[:,0])
band_x = np.append(xpts, xpts[::-1])
band_y = np.append(lowerband, upperband[::-1])
p.patch(band_x, band_y, color='#7570B3', fill_alpha=0.2)
p.legend.location = "top_left"
bokeh.plotting.show(p)
As you can see in the below figure the complete model matches nicely.
plot(m)
Now let's split the model into components and create model m1 (blue) with only the long term trend and model m2 (green) with only the product term of p.120: k2(x,x′)=θ23⋅exp(−(x−x′)22θ24−2sin2(π(x−x′)/1.0)θ25)=(θ23⋅exp(−(x−x′)22θ24))⋅(1.0⋅exp(−2sin2(π(x−x′)/1.0)θ25))
Where θ3 is the variance
term in the RBF
kernel and therefore the variance
term in the PeriodicKernel
kernel will be 1.0.
It turns out that the simplistic idea on how to decompose the additive components of the model by simply creating partial models is not correct. This would have looked like this:
k1 = GPflow.kernels.RBF(1, variance=m.kern.rbf_1.variance.value, lengthscales=m.kern.rbf_1.lengthscales.value)
m1 = GPflow.gpr.GPR(X, Y, kern=k1)
m1.likelihood.variance = m.likelihood.variance.value
m1.fixed = True
k2 = GPflow.kernels.RBF(1, variance=m.kern.prod.rbf.variance.value, lengthscales=m.kern.prod.rbf.lengthscales.value) \
* \
GPflow.kernels.PeriodicKernel(1, period=m.kern.prod.periodickernel.period.value,
variance=m.kern.prod.periodickernel.variance.value,
lengthscales=m.kern.prod.periodickernel.lengthscales.value)
m2 = GPflow.gpr.GPR(X, Y, kern=k2)
m2.likelihood.variance = m.likelihood.variance.value
m2.fixed = True
k1 = GPflow.kernels.RBF(1, variance=m.kern.rbf_1.variance.value, lengthscales=m.kern.rbf_1.lengthscales.value)
k2 = GPflow.kernels.RBF(1, variance=m.kern.prod.rbf.variance.value, lengthscales=m.kern.prod.rbf.lengthscales.value) \
* \
GPflow.kernels.PeriodicKernel(1, period=m.kern.prod.periodickernel.period.value,
variance=m.kern.prod.periodickernel.variance.value,
lengthscales=m.kern.prod.periodickernel.lengthscales.value)
m3 = GPflow.gpr.GPR(X, Y, kern=(k1+k2))
m3.likelihood.variance = m.likelihood.variance.value
m3.fixed = True
Have a look at The Kernel Cookbook and scroll down to "Additive decomposition" or look at equation 1.17 at the kernels chapter of David Duvenaud's Phd thesis.
There it says (I leave out the μ variables as we set the mean function to be 0): f1(X∗)|f1(X)+f2(X)∼N(K∗1T(K1+K2)−1[f1(X)+f2(X)],...)
I did not find a way on how to make GPflow perform this additive decomposition and therefore I decided to implement it by hand. In this set-up GPflow is used to find the correct parameters and the hand writte code is used to decompose the addtitive parts.
import functools
# http://www.cs.toronto.edu/~duvenaud/cookbook/index.html
class GPCovarianceFunctionWhiteNoise:
def __init__(self, sigma):
self.sigma = float(sigma)
def covariance(self, x1, x2):
lr = np.zeros(shape=(len(x1), len(x2)))
np.fill_diagonal(lr, self.sigma ** 2)
return lr
class GPCovarianceFunctionSquaredExponential:
def __init__(self, l, sigma):
self.l = float(l)
self.sigma = float(sigma)
def covariance(self, x1, x2):
return (self.sigma**2) * np.exp( -0.5 * (np.subtract.outer(x1, x2)**2) / (self.l ** 2))
class GPCovarianceFunctionExpSine2Kernel:
def __init__(self, l, period, sigma):
self.l = float(l)
self.period = float(period)
self.sigma = float(sigma)
def covariance(self, x1, x2):
return (self.sigma**2) * np.exp( -2.0 * np.power(np.sin((np.pi / self.period) * np.abs(np.subtract.outer(x1, x2))),
2) / (self.l ** 2))
class GPCovarianceFunctionSum:
def __init__(self, k1, k2):
self.k1 = k1
self.k2 = k2
def covariance(self, x1, x2):
s1 = self.k1.covariance(x1, x2)
s2 = self.k2.covariance(x1, x2)
s = s1 + s2
return s
class GPCovarianceFunctionProduct:
def __init__(self, k1, k2):
self.k1 = k1
self.k2 = k2
def covariance(self, x1, x2):
p1 = self.k1.covariance(x1, x2)
p2 = self.k2.covariance(x1, x2)
p = p1 * p2
return p
def conditional1(x_new, x, y, cov, sigma_n=0):
if not isinstance(cov, list):
cov = [cov]
if len(cov) < 2:
total_covariance_function = cov[0]
else:
total_covariance_function = functools.reduce(lambda a, x: GPCovarianceFunctionSum(a, x), cov)
A = total_covariance_function.covariance(x_new, x_new)
C = total_covariance_function.covariance(x_new, x)
B = total_covariance_function.covariance(x, x) + np.power(sigma_n,2)*np.diag(np.ones(len(x)))
mu = [np.linalg.inv(B).dot(C.T).T.dot(y).squeeze()]
sigma = [(A - C.dot(np.linalg.inv(B).dot(C.T))).squeeze()]
for i in range(0, len(cov)):
partial_covariance_function = cov[i]
C_ = partial_covariance_function.covariance(x_new, x)
mu_ = np.linalg.inv(B).dot(C_.T).T.dot(y).squeeze()
mu.append(mu_)
return (mu, sigma)
def predict1(x_new, x, y, cov, sigma_n=0):
l_y_pred, l_sigmas = conditional1(x_new, x, y, cov=cov, sigma_n=sigma_n)
if len(l_sigmas[0].shape) > 1:
return l_y_pred, [np.diagonal(ls) for ls in l_sigmas]
else:
return l_y_pred, l_sigmas
m.kern.product.rbf.lengthscales.value
array(91.88714902)
xx = np.linspace(1990, 2000, 1000)
k1 = GPCovarianceFunctionSquaredExponential(l=m.kern.rbf_1.lengthscales.value,
sigma=np.sqrt(m.kern.rbf_1.variance.value))
k2 = GPCovarianceFunctionProduct(
GPCovarianceFunctionSquaredExponential(l=m.kern.product.rbf.lengthscales.value,
sigma=np.sqrt(m.kern.product.rbf.variance.value)),
GPCovarianceFunctionExpSine2Kernel(l=m.kern.product.periodic.lengthscales.value,
period=m.kern.product.periodic.period.value,
sigma=np.sqrt(m.kern.product.periodic.variance.value)))
k4 = GPCovarianceFunctionSum(GPCovarianceFunctionSquaredExponential(l=m.kern.rbf_2.lengthscales.value,
sigma=np.sqrt(m.kern.rbf_2.variance.value)),
GPCovarianceFunctionWhiteNoise(sigma=np.sqrt(m.kern.white.variance.value)))
k = GPCovarianceFunctionSum(k1, GPCovarianceFunctionSum(k2, k4))
y_pred, sigmas = predict1(xx, x, y, cov=[k1, k2, k4], sigma_n=np.sqrt(m.likelihood.variance.value))
And if we then plot m1 (blue) and m1+m2 (green) and the combined/full model m (red) in the same graph we get the picture below.
p = bokeh.plotting.figure(title="Fit to the Mauna Loa Data", #x_axis_type='datetime',
plot_width=900, plot_height=600)
p.yaxis.axis_label = 'CO2 [ppm]'
p.xaxis.axis_label = 'Date'
# total fit
p.line(xx, y_pred[0], line_width=1, line_color="firebrick", legend="Total fit")
# trend
p.line(xx, y_pred[1], line_width=1, line_color="blue", legend="Long term trend")
# trend + seasonal variation
p.line(xx, y_pred[1] + y_pred[2], line_width=1, line_color="green", legend="Long term trend + seasonal variation")
# true value
p.circle(x, y, color="black", legend="Observed data")
p.legend.location = "top_left"
bokeh.plotting.show(p)
Please pay attention to the very wiggly red line, which is the full model fit. If you scroll up a bit and have a look at the fit of the full model produced by GPflow itself (in the plot function of this notebook via the GPflow predict_y()
method) you will not see this wiggly pattern. Let's plot the m1+m2 (green), the m from the hand created predictions (red) and the m as generated by GPflow (blue) in one graph to see the differences.
gpflow_mean, _ = m.predict_y(xx.reshape(-1,1))
p = bokeh.plotting.figure(title="Fit to the Mauna Loa Data", #x_axis_type='datetime',
plot_width=900, plot_height=600)
p.yaxis.axis_label = 'CO2 [ppm]'
p.xaxis.axis_label = 'Date'
# total fit
p.line(xx, y_pred[0], line_width=1, line_color="firebrick", legend="Total fit by hand")
# total fit as produced by GPflow
p.line(xx, gpflow_mean[:,0], line_width=1, line_color="blue", legend="Total fit by GPflow")
# trend + seasonal variation
p.line(xx, y_pred[1] + y_pred[2], line_width=1, line_color="green", legend="Long term trend + seasonal variation")
p.legend.location = "top_left"
bokeh.plotting.show(p)
As you can see the difference is not big, but I am not entirely sure where this is coming from. Perhaps GPflow ignores white noise in its predictions (which actually makes sense)!
If we then want to look at the seasonal variation (k2) and the short term variation (k4) we can do that as follows:
p = bokeh.plotting.figure(title="Fit to the Mauna Loa Data", #x_axis_type='datetime',
plot_width=900, plot_height=600)
p.yaxis.axis_label = 'CO2 [ppm]'
p.xaxis.axis_label = 'Date'
# total fit
p.line(xx, y_pred[2], line_width=1, line_color="firebrick", legend="Seasonal variation")
# trend
p.line(xx, y_pred[3], line_width=1, line_color="blue", legend="Short term variation")
# true value
p.legend.location = "top_left"
bokeh.plotting.show(p)
If we want to see how the seasonal variation changed over the years we can do that as follows:
# plot several years
def dates_to_idx(timelist):
reference_time = pd.to_datetime('1959-01-01')
t = (timelist - reference_time) / pd.Timedelta(1, "Y")
return np.asarray(t)+1959
p = bokeh.plotting.figure(title="Several years of the seasonal component",
plot_width=900, plot_height=600)
p.yaxis.axis_label = 'Δ CO2 [ppm]'
p.xaxis.axis_label = 'Month'
colors = bokeh.palettes.brewer['Paired'][5]
years = ["1960", "1970", "1980", "1990", "2000"]
# xx = np.linspace(1990, 2000, 1000)
#
for i, year in enumerate(years):
dates = pd.date_range(start="1/1/"+year, end="12/31/"+year, freq="10D")
tnew = dates_to_idx(dates)
print("Predicting year", year)
y_pred_, sigmas_ = predict1(tnew, x, y, cov=[k1, k2, k4], sigma_n=np.sqrt(m.likelihood.variance.value))
# plot mean
t = np.asarray((dates - dates[0])/pd.Timedelta(1, "M")) + 1
p.line(t, y_pred_[2], line_width=1, line_color=colors[i], legend=year)
p.legend.location = "bottom_left"
bokeh.plotting.show(p)
Predicting year 1960 Predicting year 1970 Predicting year 1980 Predicting year 1990 Predicting year 2000
Here is an alternative implementation of the decomposition based on a comment in Example that shows how to separate additive effects, e.g. time series decomposition of birthdays data
XX = xx.reshape(-1, 1)
_x_new = XX
_x = X
_y = Y
_A = k_.compute_K_symm(_x_new)
_C = k_.compute_K(_x_new, _x)
_B = k_.compute_K_symm(_x) + m.likelihood.variance.value*np.diag(np.ones(len(_x)))
_mu1 = np.linalg.inv(_B).dot(_C.T).T.dot(_y).squeeze()
_tmp1 = np.linalg.inv(_B).dot(_C.T)
_tmp2 = np.linalg.solve(_B, _C.T)
'{:.10f}'.format(np.abs(_tmp1 - _tmp2).max())
'0.0000003427'
_mu2 = np.linalg.solve(_B,_C.T).T.dot(_y).squeeze()
'{:.10f}'.format(np.abs(_mu1 - _mu2).max())
'0.0000001639'
_mu3 = _C.dot(np.linalg.inv(_B)).dot(_y).squeeze()
'{:.10f}'.format(np.abs(_mu1 - _mu3).max())
'0.0000001528'
_mu4 = _C.dot(np.linalg.solve(_B, _y)).squeeze()
'{:.10f}'.format(np.abs(_mu1 - _mu4).max())
'0.0000001638'
def conditional2(x_new, x, y, cov, sigma_n=0):
if not isinstance(cov, list):
cov = [cov]
if len(cov) < 2:
total_covariance_function = cov[0]
else:
# total_covariance_function = None
total_covariance_function = functools.reduce(lambda a, x: a + x, cov)
A = total_covariance_function.compute_K_symm(x_new)
C = total_covariance_function.compute_K(x_new, x)
B = total_covariance_function.compute_K_symm(x) + np.power(sigma_n,2)*np.diag(np.ones(len(x)))
mu = [np.linalg.inv(B).dot(C.T).T.dot(y).squeeze()]
sigma = [(A - C.dot(np.linalg.inv(B).dot(C.T))).squeeze()]
for i in range(0, len(cov)):
partial_covariance_function = cov[i]
C_ = partial_covariance_function.compute_K(x_new, x)
mu_ = np.linalg.inv(B).dot(C_.T).T.dot(y).squeeze()
mu.append(mu_)
return (mu, sigma)
def predict2(x_new, x, y, cov, sigma_n=0):
l_y_pred, l_sigmas = conditional2(x_new, x, y, cov=cov, sigma_n=sigma_n)
if len(l_sigmas[0].shape) > 1:
return l_y_pred, [np.diagonal(ls) for ls in l_sigmas]
else:
return l_y_pred, l_sigmas
y_pred2, sigmas2 = predict2(XX, X, Y, cov=[k1_, k2_, k4_], sigma_n=np.sqrt(m.likelihood.variance.value))
len(y_pred[0])
1000
np.allclose(y_pred[0], y_pred2[0])
False
df = pd.DataFrame(np.array([y_pred[0],y_pred2[0]]).T, columns=['np','tf'])
df['delta'] = np.abs(df['np'] - df['tf'])
df['%'] = df['delta'] / df['np'] * 100.0
with pd.option_context('display.max_rows', None, 'display.max_columns', 3):
display_html(df.head().to_html(),raw=True)
np | tf | delta | % | |
---|---|---|---|---|
0 | 353.433528 | 353.491490 | 0.057962 | 0.016400 |
1 | 353.790253 | 353.593367 | 0.196886 | 0.055650 |
2 | 353.594593 | 353.691103 | 0.096510 | 0.027294 |
3 | 353.719705 | 353.784908 | 0.065203 | 0.018434 |
4 | 353.803838 | 353.875188 | 0.071350 | 0.020166 |
df[['delta','%']].max()
delta 0.413584 % 0.115987 dtype: float64
import GPy
k1__ = GPy.kern.RBF (input_dim=1, variance=float(m.kern.rbf_1.variance.value) , lengthscale=float(m.kern.rbf_1.lengthscales.value))
k2__1 = GPy.kern.RBF (input_dim=1, variance=float(m.kern.product.rbf.variance.value) , lengthscale=float(m.kern.product.rbf.lengthscales.value))
k2__2 = GPy.kern.StdPeriodic(input_dim=1, variance=float(m.kern.product.periodic.variance.value), lengthscale=float(m.kern.product.periodic.lengthscales.value),period=float(m.kern.product.periodic.period.value))
k2__ = k2__1 * k2__2
k4__1 = GPy.kern.RBF (input_dim=1, variance=float(m.kern.rbf_2.variance.value) , lengthscale=float(m.kern.rbf_2.lengthscales.value))
k4__2 = GPy.kern.White (input_dim=1, variance=float(m.kern.white.variance.value))
k4__ = k4__1 + k4__2
k__ = k1__ + k2__ + k4__
m__ = GPy.models.GPRegression(X,Y,k__)
m__.Gaussian_noise.variance[0] = float(m.likelihood.variance.value)
m__
/home/user/cs/local/install/Anaconda3-5.1.0-Linux-x86_64/lib/python3.6/site-packages/paramz/transformations.py:111: RuntimeWarning:overflow encountered in expm1
Model: GP regression
Objective: 129.03256335476846
Number of Parameters: 11
Number of Optimization Parameters: 11
Updates: True
GP_regression. | value | constraints | priors |
---|---|---|---|
sum.rbf.variance | 4356.219071780713 | +ve | |
sum.rbf.lengthscale | 48.565715481551585 | +ve | |
sum.mul.rbf.variance | 5.64144286809577 | +ve | |
sum.mul.rbf.lengthscale | 91.88714901784704 | +ve | |
sum.mul.std_periodic.variance | 1.026393528093981 | +ve | |
sum.mul.std_periodic.period | 0.9997023123140168 | +ve | |
sum.mul.std_periodic.lengthscale | 0.6709642212683404 | +ve | |
sum.rbf_1.variance | 0.16043370929892875 | +ve | |
sum.rbf_1.lengthscale | 0.32753755974365617 | +ve | |
sum.white.variance | 0.03193821948646309 | +ve | |
Gaussian_noise.variance | 0.010237414042966209 | +ve |
def conditional3(x_new, x, y, cov, sigma_n=0):
if not isinstance(cov, list):
cov = [cov]
if len(cov) < 2:
total_covariance_function = cov[0]
else:
# total_covariance_function = None
total_covariance_function = functools.reduce(lambda a, x: a + x, cov)
A = total_covariance_function.K(x_new) # total_covariance_function.K(x_new,x_new)
C = total_covariance_function.K(x_new, x)
B = total_covariance_function.K(x) + np.power(sigma_n,2)*np.diag(np.ones(len(x))) # total_covariance_function.K(x,x)
mu = [np.linalg.inv(B).dot(C.T).T.dot(y).squeeze()]
sigma = [(A - C.dot(np.linalg.inv(B).dot(C.T))).squeeze()]
for i in range(0, len(cov)):
partial_covariance_function = cov[i]
C_ = partial_covariance_function.K(x_new, x)
mu_ = np.linalg.inv(B).dot(C_.T).T.dot(y).squeeze()
mu.append(mu_)
# kxX = C_
# K_1 = np.linalg.inv(B)
# mean = C_.dot(np.linalg.inv(B)).dot(Y)
return (mu, sigma)
def predict3(x_new, x, y, cov, sigma_n=0):
l_y_pred, l_sigmas = conditional3(x_new, x, y, cov=cov, sigma_n=sigma_n)
if len(l_sigmas[0].shape) > 1:
return l_y_pred, [np.diagonal(ls) for ls in l_sigmas]
else:
return l_y_pred, l_sigmas
y_pred3, sigmas3 = predict3(XX, X, Y, cov=[k1__, k2__, k4__], sigma_n=np.sqrt(m__.Gaussian_noise.variance[0]))
/home/user/cs/local/install/Anaconda3-5.1.0-Linux-x86_64/lib/python3.6/site-packages/paramz/transformations.py:111: RuntimeWarning:overflow encountered in expm1
np.allclose(y_pred[0], y_pred3[0])
False
df = pd.DataFrame(np.array([y_pred[0],y_pred2[0],y_pred3[0]]).T, columns=['np','gpflow','gpy'])
df['delta_gpflow'] = np.abs(df['np'] - df['gpflow'])
df['%_gpflow'] = df['delta_gpflow'] / df['np'] * 100.0
df['delta_gpy'] = np.abs(df['np'] - df['gpy'])
df['%_gpy'] = df['delta_gpy'] / df['np'] * 100.0
df['delta_gpflow_gpy'] = np.abs(df['gpflow'] - df['gpy'])
df['%_gpflow_gpy'] = df['delta_gpflow_gpy'] / df['gpflow'] * 100.0
df.head()
np | gpflow | gpy | delta_gpflow | %_gpflow | delta_gpy | %_gpy | delta_gpflow_gpy | %_gpflow_gpy | |
---|---|---|---|---|---|---|---|---|---|
0 | 353.433528 | 353.491490 | 353.491490 | 0.057962 | 0.016400 | 0.057962 | 0.016400 | 7.407044e-08 | 2.095395e-08 |
1 | 353.790253 | 353.593367 | 353.593367 | 0.196886 | 0.055650 | 0.196886 | 0.055650 | 2.133197e-08 | 6.032910e-09 |
2 | 353.594593 | 353.691103 | 353.691102 | 0.096510 | 0.027294 | 0.096510 | 0.027294 | 1.523574e-07 | 4.307639e-08 |
3 | 353.719705 | 353.784908 | 353.784908 | 0.065203 | 0.018434 | 0.065203 | 0.018434 | 3.971718e-08 | 1.122636e-08 |
4 | 353.803838 | 353.875188 | 353.875188 | 0.071350 | 0.020166 | 0.071350 | 0.020166 | 4.792372e-08 | 1.354255e-08 |
df[['delta_gpflow','%_gpflow','delta_gpy','%_gpy','delta_gpflow_gpy','%_gpflow_gpy']].max()
delta_gpflow 4.135843e-01 %_gpflow 1.159868e-01 delta_gpy 4.135842e-01 %_gpy 1.159867e-01 delta_gpflow_gpy 2.924092e-07 %_gpflow_gpy 8.092542e-08 dtype: float64
k1__ = GPy.kern.RBF (input_dim=1, variance=float(m.kern.rbf_1.variance.value) , lengthscale=float(m.kern.rbf_1.lengthscales.value))
k2__1 = GPy.kern.RBF (input_dim=1, variance=float(m.kern.product.rbf.variance.value) , lengthscale=float(m.kern.product.rbf.lengthscales.value))
k2__2 = GPy.kern.StdPeriodic(input_dim=1, variance=float(m.kern.product.periodic.variance.value), lengthscale=float(m.kern.product.periodic.lengthscales.value),period=float(m.kern.product.periodic.period.value))
k2__ = k2__1 * k2__2
k4__1 = GPy.kern.RBF (input_dim=1, variance=float(m.kern.rbf_2.variance.value) , lengthscale=float(m.kern.rbf_2.lengthscales.value))
k4__2 = GPy.kern.White (input_dim=1, variance=float(m.kern.white.variance.value))
k4__ = k4__1 + k4__2
k__ = k1__ + k2__ + k4__
m__ = GPy.models.GPRegression(X,Y,k__)
m__.Gaussian_noise.variance[0] = float(m.likelihood.variance.value)
m__.optimize()
m__
/home/user/cs/local/install/Anaconda3-5.1.0-Linux-x86_64/lib/python3.6/site-packages/paramz/transformations.py:111: RuntimeWarning:overflow encountered in expm1
Model: GP regression
Objective: 129.03256333921234
Number of Parameters: 11
Number of Optimization Parameters: 11
Updates: True
GP_regression. | value | constraints | priors |
---|---|---|---|
sum.rbf.variance | 4356.219071782721 | +ve | |
sum.rbf.lengthscale | 48.5657154836552 | +ve | |
sum.mul.rbf.variance | 5.641442868106639 | +ve | |
sum.mul.rbf.lengthscale | 91.88714903815146 | +ve | |
sum.mul.std_periodic.variance | 1.0263935281187573 | +ve | |
sum.mul.std_periodic.period | 0.9997023902939737 | +ve | |
sum.mul.std_periodic.lengthscale | 0.6709642212192272 | +ve | |
sum.rbf_1.variance | 0.16043370928666736 | +ve | |
sum.rbf_1.lengthscale | 0.32753755952069186 | +ve | |
sum.white.variance | 0.03193821950509535 | +ve | |
Gaussian_noise.variance | 0.010237414044922417 | +ve |
k1__ = GPy.kern.RBF (input_dim=1, variance=float(m.kern.rbf_1.variance.value) , lengthscale=float(m.kern.rbf_1.lengthscales.value))
k2__1 = GPy.kern.RBF (input_dim=1, variance=float(m.kern.product.rbf.variance.value) , lengthscale=float(m.kern.product.rbf.lengthscales.value))
k2__2 = GPy.kern.StdPeriodic(input_dim=1, variance=float(m.kern.product.periodic.variance.value), lengthscale=float(m.kern.product.periodic.lengthscales.value),period=float(m.kern.product.periodic.period.value))
k2__ = k2__1 * k2__2
k4__1 = GPy.kern.RBF (input_dim=1, variance=float(m.kern.rbf_2.variance.value) , lengthscale=float(m.kern.rbf_2.lengthscales.value))
k4__2 = GPy.kern.White (input_dim=1, variance=float(m.kern.white.variance.value))
k4__ = k4__1 + k4__2
k__ = k1__ + k2__ + k4__
m__ = GPy.models.GPRegression(X,Y,k__)
m__.Gaussian_noise.variance[0] = float(m.likelihood.variance.value)
m__.optimize(optimizer='scg')
m__
/home/user/cs/local/install/Anaconda3-5.1.0-Linux-x86_64/lib/python3.6/site-packages/paramz/transformations.py:111: RuntimeWarning:overflow encountered in expm1
Model: GP regression
Objective: 129.03256333607948
Number of Parameters: 11
Number of Optimization Parameters: 11
Updates: True
GP_regression. | value | constraints | priors |
---|---|---|---|
sum.rbf.variance | 4356.219071782721 | +ve | |
sum.rbf.lengthscale | 48.56571548365474 | +ve | |
sum.mul.rbf.variance | 5.641442868106637 | +ve | |
sum.mul.rbf.lengthscale | 91.88714903814706 | +ve | |
sum.mul.std_periodic.variance | 1.026393528118752 | +ve | |
sum.mul.std_periodic.period | 0.9997023902770954 | +ve | |
sum.mul.std_periodic.lengthscale | 0.6709642212192378 | +ve | |
sum.rbf_1.variance | 0.16043370928667006 | +ve | |
sum.rbf_1.lengthscale | 0.32753755952074015 | +ve | |
sum.white.variance | 0.03193821950509133 | +ve | |
Gaussian_noise.variance | 0.01023741404492199 | +ve |
def conditional(X_new, X, Y, cov, sigma_n=0, framework='gpy'):
if not isinstance(cov, list):
cov = [cov]
if len(cov) < 2:
total_covariance_function = cov[0]
else:
# total_covariance_function = None
total_covariance_function = functools.reduce(lambda a, x: a + x, cov)
if framework == 'gpflow':
A = total_covariance_function.compute_K_symm(X_new)
C = total_covariance_function.compute_K(X_new, X)
B = total_covariance_function.compute_K_symm(X) + np.power(sigma_n,2)*np.diag(np.ones(len(X)))
elif framework == 'gpy':
A = total_covariance_function.K(X_new) # total_covariance_function.K(x_new,x_new)
C = total_covariance_function.K(X_new, X)
B = total_covariance_function.K(X) + np.power(sigma_n,2)*np.diag(np.ones(len(X))) # total_covariance_function.K(x,x)
else:
raise ValueError('Unknown framework parameter: {}'.format(framework))
# mu = [np.linalg.inv(B).dot(C.T).T.dot(y).squeeze()]
BY = np.linalg.solve(B, Y)
mu = [C.dot(BY).squeeze()]
sigma = [(A - C.dot(np.linalg.inv(B).dot(C.T))).squeeze()]
for i in range(0, len(cov)):
partial_covariance_function = cov[i]
C_ = None
if framework == 'gpflow':
C_ = partial_covariance_function.compute_K(X_new, X)
elif framework == 'gpy':
C_ = partial_covariance_function.K(X_new, X)
else:
raise ValueError('Unknown framework parameter: {}'.format(framework))
# mu_ = np.linalg.inv(B).dot(C_.T).T.dot(y).squeeze()
mu_ = C_.dot(BY).squeeze()
mu.append(mu_)
return (mu, sigma)
def predict(X_new, X, Y, cov, sigma_n=0, framework='gpy'):
l_y_pred, l_sigmas = conditional(X_new, X, Y, cov=cov, sigma_n=sigma_n, framework=framework)
if len(l_sigmas[0].shape) > 1:
return l_y_pred, [np.diagonal(ls) for ls in l_sigmas]
else:
return l_y_pred, l_sigmas
with warnings.catch_warnings():
warnings.filterwarnings("ignore")
y_pred2, sigmas2 = predict(XX, X, Y, cov=[k1_ , k2_ , k4_] , sigma_n=np.sqrt(m.likelihood.variance.value) , framework='gpflow')
y_pred3, sigmas3 = predict(XX, X, Y, cov=[k1__, k2__, k4__], sigma_n=np.sqrt(m__.Gaussian_noise.variance[0]), framework='gpy')
'{:.10f}'.format(np.abs(y_pred2[0] - y_pred3[0]).max())
'0.0000001568'