Title: Linear Threshold Units
Author: Thomas M. Breuel
Institution: UniKL
from pylab import *
from scipy import linalg
from IPython.core.display import Image

def fig(x):
    # display a single figure from the Figures directory
    return Image(filename="Figures/"+x+".png")

def figs(*args):
    # display several figures side by side
    for i, f in enumerate(args):
        subplot(1, len(args), i+1)
        axis("off")
        imshow(imread("Figures/"+f+".png"))
Before going into learning algorithms in more detail, let's take a general look at neurons as classifiers or decision makers.
(McCulloch-Pitts Neurons)
Recall that we arrived at the McCulloch-Pitts model as a simplified model of the computation performed by neurons.
Component-wise notation:

$$ y_k = \phi\left(\sum_{j=1}^m w_{kj} x_j + w_{k0}\right) $$

Matrix notation:
$$ y = \phi(W x + w_0) $$
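As a minimal sketch (the weights, bias, and input below are made up for illustration), the matrix form maps directly onto a few lines of numpy-style code, using the Heaviside step as the nonlinearity $\phi$:

def mp_layer(W, b, x):
    # a layer of McCulloch-Pitts neurons: weighted sum plus bias,
    # passed through a hard threshold nonlinearity
    return 1*(dot(W, x) + b > 0)

# hypothetical example: two neurons, three inputs
W = array([[0.5, -0.2,  0.1],
           [0.3,  0.8, -0.5]])
b = array([-0.1, 0.2])
x = array([1.0, 0.0, 1.0])
print(mp_layer(W, b, x))   # -> [1 0]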
(Linear Decisions)
For many kinds of problems, linear decision rules are common, and they map well onto McCulloch-Pitts neurons.
That is, rules of the form: accept if $x_1 + 0.7 x_2 \leq 1.1$.
The reason is that many decision rules are really rules based on costs, and costs are of the form $\hbox{quantity} \times \hbox{unit cost}$; summing such terms over several items and comparing the total against a budget yields exactly a thresholded linear function.
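For instance (a toy example, with made-up prices and a made-up budget), a rule like "buy if the total cost stays within budget" is just a thresholded dot product:

unit_costs = array([2.5, 0.7, 1.2])   # hypothetical unit costs
quantities = array([1.0, 3.0, 0.5])   # hypothetical quantities
budget = 6.0
# the decision is a linear function of the quantities, compared to a threshold
accept = dot(unit_costs, quantities) <= budget
print(accept)   # 2.5 + 2.1 + 0.6 = 5.2 <= 6.0 -> True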
(Perceptrons)
Here is an example of this.
The decision rule is
$$ D(x) = \left[\, {1 \choose 0.7} \cdot {x_1 \choose x_2} \leq 1.1 \,\right] $$

where $[\cdot]$ denotes the indicator (Iverson) bracket: 1 if the condition holds and 0 otherwise. This is the same form as a McCulloch-Pitts neuron with a Heaviside function as the nonlinearity.
These kinds of linear decision units are called perceptrons.
# sample random points in the unit square
xs = rand(100, 2)
# label each point with the linear rule x_1 + 0.7 x_2 > 1.1
ys = 1*(xs[:,0] + 0.7*xs[:,1] > 1.1)
# plot the two classes
plot(xs[ys==0,0], xs[ys==0,1], 'b.')
plot(xs[ys==1,0], xs[ys==1,1], 'r.')
# draw the decision boundary x_1 + 0.7 x_2 = 1.1,
# which crosses the axes at (1.1, 0) and (0, 1.1/0.7)
plot([1.1, 0], [0, 1.1/0.7], color='green')
(linear threshold units as measures of similarity)
Assume both inputs and weights are normalized:

$$ \|w\| = 1, \qquad \|x\| = 1 $$

The dot product then becomes the cosine of the angle between $w$ and $x$, i.e., the cosine similarity:

$$ y = w \cdot x = \cos \angle(w, x) $$

Alternatively, we also get a relation to the Euclidean distance:

$$ \|w - x\|^2 = \|w\|^2 - 2\, w \cdot x + \|x\|^2 = 2 - 2\, w \cdot x $$
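These identities are easy to check numerically; here is a small sketch using arbitrary random vectors:

from numpy.linalg import norm

# two random vectors, scaled to unit length
w = rand(5); w /= norm(w)
x = rand(5); x /= norm(x)

# for unit vectors the dot product is the cosine similarity ...
print(dot(w, x))
# ... and the squared Euclidean distance is 2 - 2 w.x
print(norm(w - x)**2, 2 - 2*dot(w, x))   # the two values agree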
(pattern recognition)
Perceptrons therefore detect patterns by similarity: a unit's output is largest when the input $x$ is close (in angle, or equivalently in Euclidean distance for normalized vectors) to its weight vector $w$, so each unit acts as a detector for the pattern stored in its weights.
This is implicitly the basis of much of machine learning, and it explains why perceptrons and "neural networks" with McCulloch-Pitts neurons are so successful.
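To make this concrete, here is a small sketch (with invented binary templates) of perceptron units used as pattern detectors: each weight vector stores one normalized template, and the unit with the strongest response identifies the best-matching pattern.

from numpy.linalg import norm

# hypothetical stored patterns, one per row, normalized to unit length
W = array([[1.0, 0.0, 1.0, 0.0],
           [0.0, 1.0, 0.0, 1.0]])
W = W / norm(W, axis=1)[:, newaxis]

# a noisy version of the first pattern, also normalized
x = array([0.9, 0.1, 1.1, 0.0])
x = x / norm(x)

# the responses are cosine similarities; the largest response wins
responses = dot(W, x)
print(responses, argmax(responses))   # unit 0 responds most strongly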