# Manifold Learning with Isomap


This tour explores the Isomap algorithm for manifold learning.

The [Isomap](http://waldron.stanford.edu/~isomap/) algorithm was introduced in:

A Global Geometric Framework for Nonlinear Dimensionality Reduction, J. B. Tenenbaum, V. de Silva and J. C. Langford, Science 290 (5500): 2319-2323, 22 December 2000.

In [1]:
using PyPlot
using NtToolBox


## Graph Approximation of Manifolds

Manifold learning consists of approximating the parameterization of a manifold represented as a point cloud.

First we load a simple 3D point cloud, the famous Swiss Roll.

Number of points.

In [2]:
n = 500; # using 1000 points makes the computation very slow


Random positions on the parametric domain.

In [3]:
x = rand(2,n);


Mapping on the manifold.

In [4]:
v = 3*pi/2*(0.1 .+ 2*x[1,:])  # angle along the roll
X = zeros(3,n)
X[2,:] = 20*x[2,:]            # position along the roll axis
X[1,:] = -cos.(v).*v
X[3,:] = sin.(v).*v;
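
Written out, the cell above evaluates the following map, where $(x_1,x_2)$ denotes the two rows of `x` (this is only a restatement of the code):

$v = \frac{3\pi}{2}\,(0.1 + 2x_1), \qquad (x_1,x_2)\in[0,1]^2 \;\longmapsto\; X = \big(-v\cos v,\; 20\,x_2,\; v\sin v\big).$

The angle $v$ ranges over $[0.15\pi,\,3.15\pi]$, that is one and a half turns, and $\sqrt{X_1^2+X_3^2}=v$ is the distance to the roll axis, which is the quantity used to color the points in the plots.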


Parameters for display.

In [5]:
ms = 200
elev = 20; azim = -110;


Display the point cloud.

In [6]:
fig = figure(figsize=(15,11))
ax = gca(projection="3d")

#swiss roll
scatter3D(X[1,:], X[2,:], X[3,:], c=get_cmap("jet")((X[1,:].^2+X[3,:].^2)/100), s=ms, lw=0, alpha=1)

#params
xlim(minimum(X[1,:]),maximum(X[1,:]))
ylim(minimum(X[2,:]),maximum(X[2,:]))
zlim(minimum(X[3,:]),maximum(X[3,:]))
axis("off")
ax[:view_init](elev, azim)


Compute the pairwise Euclidean distance matrix.

In [7]:
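# squared norms of the points, replicated over n rows
D1 = repeat(sum(X.^2, 1), outer=(n,1))
# squared distances: |xi|^2 + |xj|^2 - 2<xi,xj>
D1 = D1 + D1' - 2*X'*X
# round-off can give tiny negative values; clamp them to 0 before the square root
D1[D1.<0] = 0
D1 = sqrt.(D1);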
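
The computation above relies on expanding the squared norm of a difference, with $X_i$ denoting the $i$-th column of `X`:

$\|X_i - X_j\|^2 = \|X_i\|^2 + \|X_j\|^2 - 2\langle X_i, X_j\rangle.$

With $s \in \mathbb{R}^n$ the vector of squared norms $s_i=\|X_i\|^2$ and $\mathbb{1}$ the all-ones vector, the matrix of squared distances is therefore $s\mathbb{1}^\top + \mathbb{1}s^\top - 2X^\top X$. In floating point arithmetic the three terms may not cancel exactly, so some entries (typically on the diagonal) can come out slightly negative, which is why the code tests for negative entries before taking the square root.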


Number of nearest neighbors (NN) for the graph.

In [8]:
k = 6;


Compute the k-NN connectivity.

In [9]:
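DNN, NN = zeros(size(D1)), zeros(size(D1))
for i in 1:size(D1,1)
    # sorted distances from point i and the corresponding point indices
    DNN[i,:], NN[i,:] = sort(D1[i,:]), sortperm(D1[i,:])
end
# drop the first column (the point itself) and keep the k nearest neighbors
NN = Int.(NN[:,2:k+1])
DNN = DNN[:,2:k+1];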
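
To make the sort-and-truncate step concrete, here is a minimal sketch on four points placed on a line. The helper `knn_from_distances` and the variables `pos`, `Dtoy`, `idx` and `dist` are hypothetical names introduced for this illustration only; they are not part of NtToolBox. The sketch assumes a distance matrix with a zero diagonal and no duplicate points.

```julia
# Hypothetical helper (illustration only): indices and distances of the
# k nearest neighbors of every point, from a full distance matrix.
function knn_from_distances(Dm, k)
    n = size(Dm, 1)
    idx = zeros(Int, n, k)   # neighbor indices
    dist = zeros(n, k)       # neighbor distances
    for i in 1:n
        row = vec(Dm[i, :])
        p = sortperm(row)    # ascending order; p[1] is i itself (distance 0)
        for j in 1:k
            idx[i, j] = p[j + 1]
            dist[i, j] = row[p[j + 1]]
        end
    end
    return idx, dist
end

# Four points on a line, at positions 0, 1, 3 and 7.
pos = [0.0, 1.0, 3.0, 7.0]
Dtoy = [abs(a - b) for a in pos, b in pos]
idx, dist = knn_from_distances(Dtoy, 2)
```

Row `i` of `idx` lists the two nearest neighbors of point `i`. The relation is not symmetric: point 4 lists point 2 as a neighbor, but point 2 does not list point 4, which is why the graph has to be symmetrized before computing geodesic distances.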


Build the (directed) adjacency matrix of the k-NN graph.

In [10]:
B = repeat((1:n)', outer=(k,1))
A = sparse(vec(B), vec(NN'), ones(k*n));


Weighted adjacency (the metric on the graph).

In [11]:
W = sparse(vec(B),vec(NN'), vec(DNN'));


Display the graph.

In [12]:
fig = figure(figsize=(15,11))
ax = gca(projection="3d")

#swiss roll
scatter3D(X[1,:], X[2,:], X[3,:], c=get_cmap("jet")((X[1,:].^2+X[3,:].^2)/100), s=ms, lw=0, alpha=1)

#graph
(I,J) = findn(A)
#(I,J) = (vec(B), vec(NN))
xx = hcat(X[1,I],X[1,J])
yy = hcat(X[2,I],X[2,J])
zz = hcat(X[3,I],X[3,J])

for i in 1:length(I)
    plot3D(xx[i,:], yy[i,:], zz[i,:], color="black")
end

#params
xlim(minimum(X[1,:]),maximum(X[1,:]))
ylim(minimum(X[2,:]),maximum(X[2,:]))
zlim(minimum(X[3,:]),maximum(X[3,:]))
axis("off")
ax[:view_init](elev, azim)


## Floyd Algorithm to Compute Pairwise Geodesic Distances

A simple algorithm to compute the geodesic distances between all pairs of points on a graph is Floyd's iterative algorithm. Its complexity is $\mathcal O(n^3)$, where $n$ is the number of points. It is thus quite slow for sparse graphs, for which running Dijkstra's algorithm from every point costs $\mathcal O(n^2\log(n))$.

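Floyd's algorithm iterates the following update rule for $k=1,\dots,n$, applying it to every pair $(i,j)$ at each step (here $k$ is a loop index, unrelated to the number of neighbors used above):

$D(i,j) \leftarrow \min(D(i,j), D(i,k)+D(k,j))$,

with the initialization $D(i,j)=W(i,j)$ if $W(i,j)>0$, and $D(i,j)=+\infty$ if $W(i,j)=0$.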
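
As an illustration of the update rule, the following sketch runs it on a toy graph with four vertices. It is not the reference solution of the exercise below; `floyd_sketch`, `G` and `Dgeo` are hypothetical names, and the input is assumed to already contain `Inf` for missing edges and zeros on the diagonal.

```julia
# Hypothetical helper (illustration only): all-pairs shortest path lengths.
# G[i,j] is the length of edge (i,j), Inf if there is no edge, 0 if i == j.
function floyd_sketch(G)
    n = size(G, 1)
    Dg = copy(G)
    # k is the outermost loop: after step k, Dg[i,j] is the length of the
    # shortest path from i to j whose intermediate vertices lie in 1..k
    for k in 1:n, i in 1:n, j in 1:n
        Dg[i, j] = min(Dg[i, j], Dg[i, k] + Dg[k, j])
    end
    return Dg
end

# Chain 1-2-3-4 with edge lengths 1, 2, 4, plus a long direct edge 1-4.
G = [0   1   Inf 10;
     1   0   2   Inf;
     Inf 2   0   4;
     10  Inf 4   0]
Dgeo = floyd_sketch(G)
```

On this graph the direct edge between vertices 1 and 4 has length 10, but the path through vertices 2 and 3 has length 1 + 2 + 4 = 7, so `Dgeo[1,4]` is 7.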

Make the graph symmetric.

In [13]:
#W = readdlm("W.txt")
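D = full(W)
# an edge exists if either endpoint lists the other among its k-NN;
# taking the maximum keeps the full edge length in that case
D = max.(D, D');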


Initialize the matrix.

In [14]:
D[D .== 0] = Inf;


Add a connection of length zero between each point and itself.

In [15]:
# the diagonal was set to Inf by the previous cell; reset it to 0
for i in 1:n
    D[i,i] = 0
end


Exercise 1

Implement Floyd's algorithm to compute the full distance matrix $D$, where $D(i,j)$ is the geodesic distance between points $i$ and $j$ on the graph.

In [16]:
include("NtSolutions/shapes_7_isomap/exo1.jl")

In [17]:
## Insert your code here.


Find the indices of the vertices that are not connected to the main manifold.

In [18]:
Iremove = find(D[:,1].==Inf);


Set the remaining Inf values (disconnected components) to zero.

In [19]:
D[D .== Inf] = 0;


## Isomap with Classical Multidimensional Scaling

Isomap performs the dimensionality reduction by applying multidimensional scaling (MDS) to the matrix of geodesic distances.

Please refer to the tours on bending invariants for details on classical MDS (strain minimization).
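
For reference, the usual formulation of classical MDS can be summarized as follows. This is the textbook construction, stated here under the assumption that the exercise follows it; the symbols $J$, $K$, $U$, $\Lambda$ and $Y$ are notation introduced for this summary. Starting from the $n \times n$ matrix $D$ of geodesic distances, form the centering matrix and the centered kernel

$J = \mathrm{Id}_n - \frac{1}{n}\mathbb{1}\mathbb{1}^\top, \qquad K = -\frac{1}{2}\, J\, D^{\circ 2} J,$

where $D^{\circ 2}$ is the entrywise square of $D$. Then compute the eigendecomposition $K = U \Lambda U^\top$, keep the two largest eigenvalues $\lambda_1 \geq \lambda_2 > 0$ with eigenvectors $u_1, u_2$, and define the 2D coordinates as

$Y = \begin{pmatrix} \sqrt{\lambda_1}\, u_1^\top \\ \sqrt{\lambda_2}\, u_2^\top \end{pmatrix} \in \mathbb{R}^{2 \times n}.$

If $D$ were exactly the Euclidean distance matrix of a planar point set, $Y$ would recover that point set up to a rigid motion; for geodesic distances on the graph it gives an approximate flattening.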

Exercise 2

Perform classical MDS to compute the 2D flattening.

In [20]:
include("NtSolutions/shapes_7_isomap/exo2.jl");