$\newcommand{\Reals}{\mathbb{R}} \newcommand{\Nats}{\mathbb{N}} \newcommand{\PDK}{{k}} \newcommand{\IS}{\mathcal{X}} \newcommand{\FM}{\Phi} \newcommand{\Gram}{G} \newcommand{\RKHS}{\mathcal{H}} \newcommand{\prodDot}[2]{\left\langle#1,#2\right\rangle} \DeclareMathOperator*{\argmin}{arg\,min} \DeclareMathOperator*{\argmax}{arg\,max}$
display(Image(filename="GP_uq.png", width=630)) #source: http://scikit-learn.org/0.17/modules/gaussian_process.html
display(Image(filename="SVM.png", width=700)) #source: https://en.wikipedia.org/wiki/Support_vector_machine
display(Image(filename="monomials_small.jpg", width=800)) #source: Berhard Schölkopf
and label $l_i = 0$ if $x_i$ generated by $p_0$, $l_i = 1$ otherwise
f = pl.figure(**figkw);
for (idx, c, marker) in [(0,'r', (0,3,0)), (1, "b", "x")]:
plt.scatter(*data[distr_idx==idx,:].T, c=c, marker=marker, alpha = 0.4)
pl.show()
(remember in $\Reals^2$ canonically: $\prodDot{a}{b} = a_1 b_1+a_2 b_2 $)
pl.figure(**figkw)
for (idx, c, marker) in [(0,'r', (0,3,0)), (1, "b", "x")]:
pl.scatter(*data[distr_idx==idx,:].T, c=c, marker=marker, alpha=0.2)
pl.arrow(0, 0, *data[distr_idx==idx,:].mean(0), head_width=0.3, width=0.05, head_length=0.3, fc=c, ec=c)
pl.title(r"Mean embeddings for $\Phi(x)=x$");
pl.figure(**figkw)
for (idx, c, marker) in [(0,'r', (0,3,0)), (1, "b", "x")]:
pl.scatter(*data[distr_idx==idx,:].T, c=c, marker=marker, alpha=0.2)
pl.arrow(0, 0, *data[distr_idx==idx,:].mean(0), head_width=0.3, width=0.05, head_length=0.3, fc=c, ec=c)
pl.title(r"Mean embeddings for $\Phi(x)=x$");
pl.scatter(np.ones(1), np.ones(1), c='k', marker='D', alpha=0.8);
est_dens_1 = dist.mixt(2, [dist.mvnorm(x, np.eye(2)*0.1) for x in data[:4]], [1./4]*4)
plot_with_contour(data, distr_idx,
lambda x: exp(est_dens_1.logpdf(x)),
colormesh_cmap=None, contour_classif=False)
est_dens_1 = dist.mixt(2, [dist.mvnorm(x, np.eye(2)*0.1,10) for x in data[:samps_per_distr]], [1./samps_per_distr]*samps_per_distr)
plot_with_contour(data, distr_idx,
lambda x: exp(est_dens_1.logpdf(x)),
colormesh_cmap=None, contour_classif=False)
are symmetric positive semidefinite, then $\PDK$ is a psd kernel
* can measure angles in the new space $$\prodDot{g}{f}_\RKHS = \cos(\angle[g,f])~\|g\|_\RKHS ~\|f\|_\RKHS$$
Plan
pick $\FM(x) = \PDK(\cdot, x)$
let linear combinations of features also be in $\RKHS$ $$f(\cdot)=\sum_{i=1}^m a_i \PDK(\cdot, x_i) \in \RKHS$$ for $a_i \in \Reals$
$\RKHS$ a vector space over $\Reals$ : if $f(\cdot)$ and $g(\cdot)$ functions from $\IS$ to $\Reals$, so are $a~f(\cdot)$ for $a \in \Reals, f(\cdot)+g(\cdot)$
$$\widehat{p}_0 = \frac{1}{N_0} \sum_{l_i = 0} \mathcal{N}(\cdot; x_i,\Sigma) ~~~~~~~~ \mu_{0} = \frac{1}{N_0} \sum_{l_i = 0} \PDK(\cdot, x_i)$$
if $\PDK$ is Gaussian density with covariance $\Sigma$
Lets look at example classification output
data, distr_idx = sklearn.datasets.make_circles(n_samples=400, factor=.3, noise=.05)
kc = KMEclassification(data[distr_idx==0,:], data[distr_idx==1,:], ro.LinearKernel())
plot_with_contour(data, distr_idx, kc.classification_score, 'Inner Product classif. '+"Linear", pl = plt, contour_classif = True, colormesh_cmap = pl.cm.bwr)
kc = KMEclassification(data[distr_idx==0,:], data[distr_idx==1,:], ro.GaussianKernel(0.3))
plot_with_contour(data, distr_idx, kc.classification_score, 'Inner Product classif. '+"Gaussian", pl = plt, contour_classif = True, colormesh_cmap = pl.cm.bwr)
called the maximum mean discrepancy (MMD)
An example
cme = ro.ConditionMeanEmbedding(inp_samps, out_samps, ro.GaussianKernel(0.3), ro.GaussianKernel(0.3), 5)
plot_mean_embedding(cme, inp_samps, out_samps, 0.3, 2.,)
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/MpzaCCbX-z4?rel=0&showinfo=0&start=148" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>')
display(Image(filename="Pendulum_eigenfunctions.png", width=700))
display(Image(filename="KeywordClustering.png", width=700))
Get slides at ingmarschuster.com