In this notebook, we'll use Autograd to do some simple curve fitting. First, set up plotting:
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
Next, import autograd as follows (note: do not import numpy directly):
import autograd.numpy as np
from autograd import grad
Some constants:
samples = 10
degree = 3
learning_rate = 0.1
iterations = 100
Create some data:
xs = np.cumsum(np.random.uniform(0.1, 0.2, (samples,)))
ys = np.random.uniform(0., 1., (samples,))
print(xs)
print(ys)
[ 0.13371878  0.27553397  0.39132129  0.50835769  0.65905676  0.81731953
  1.00509184  1.20073021  1.31855136  1.50497547]
[ 0.15309926  0.62671086  0.85887213  0.69791785  0.65602031  0.56616744
  0.97026765  0.88805564  0.70586052  0.01912681]
plt.plot(xs, ys, 'o')
We want to fit a polynomial $y = a_n x^n + \cdots + a_1 x + a_0$ to this data. We define the polynomial as a function:
def poly(a, x):
    y = 0
    for i in range(len(a)):
        y += a[i] * x**i
    return y
Start with a random polynomial:
a = np.random.uniform(-5., 5., (degree+1,))
a
array([ 4.47146714, -4.94676203, -1.6694301 , 0.88959485])
pxs = np.linspace(xs.min(), xs.max())
plt.plot(xs, ys, 'o', pxs, poly(a, pxs), '-')
Now we define the error function that we want to minimize. Note that the first argument to the function is the one that we're going to learn.
def mse(a, xs, ys):
    err = 0
    for x, y in zip(xs, ys):
        err += (poly(a, x) - y) ** 2
    return err / len(xs)
mse(a, xs, ys)
6.5037768171627786
Using Autograd is usually as simple as:
g_mse = grad(mse)
This gives a new function which is the gradient of the error function with respect to its first argument.
g_mse(a, xs, ys)
array([-1.01988619, -2.98285414, -4.29046931, -5.8087499 ])
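As a quick sanity check, we can compare this against a central finite-difference approximation of the gradient (a minimal sketch; numeric_grad and eps are just illustrative names, not part of Autograd):
def numeric_grad(a, xs, ys, eps=1e-6):
    # perturb one coefficient at a time and take central differences
    g = np.zeros(len(a))
    for i in range(len(a)):
        da = np.zeros(len(a))
        da[i] = eps
        g[i] = (mse(a + da, xs, ys) - mse(a - da, xs, ys)) / (2 * eps)
    return g

numeric_grad(a, xs, ys)  # should closely match g_mse(a, xs, ys)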
Note that the return value of g_mse has the same shape as a. Now we do gradient descent:
for i in range(iterations):
    a -= learning_rate * g_mse(a, xs, ys)
print(mse(a, xs, ys))
plt.plot(xs, ys, 'o', pxs, poly(a, pxs), '-')
0.523648965712
There are a number of things you can't do with Autograd, which are outlined in the official tutorial. Here are some particular things to watch out for.
Use autograd.numpy, not numpy. When you differentiate a function, Autograd calls your function with the variables that you are differentiating with respect to (henceforth, the parameters) replaced with Node objects (usually FloatNode or ArrayNode). If something in your function doesn't know how to handle a Node, you'll get a (usually weird) error.
All Autograd functions know how to deal with Node objects; other functions don't. So, don't use functions from numpy or scipy on parameters; use their equivalents provided by Autograd (autograd.numpy and autograd.scipy). Likewise, don't use math; autograd.numpy usually has an equivalent function. Otherwise, you'll get an error. The exact error depends on the function used.
import numpy
import traceback  # for handling exception without crashing notebook

def f(x):
    return numpy.sum(x*x)

g = grad(f)
try:
    g(numpy.array([1., 2., 3.]))
except Exception as e:
    traceback.print_exc()
Traceback (most recent call last):
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py", line 31, in forward_pass
    try: end_node = fun(*args, **kwargs)
  File "<ipython-input-13-39373a1ee8dd>", line 5, in f
    return numpy.sum(x*x)
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/numpy/core/fromnumeric.py", line 1837, in sum
    return sum(axis=axis, dtype=dtype, out=out)
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py", line 138, in __call__
    gradfun = self.gradmaker(argnum, result, args, kwargs)
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py", line 99, in gradmaker
    return self.grads[argnum](ans, *args, **kwargs)
TypeError: make_grad_np_sum() got an unexpected keyword argument 'out'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<ipython-input-13-39373a1ee8dd>", line 9, in <module>
    g(numpy.array([1., 2., 3.]))
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py", line 22, in gradfun
    return backward_pass(*forward_pass(fun,args,kwargs,argnum))
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py", line 32, in forward_pass
    except Exception as e: add_extra_error_message(e)
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py", line 409, in add_extra_error_message
    extra_message = check_common_errors(type(e), str(e))
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py", line 405, in check_common_errors
    return vals[matches.index(True)]
AttributeError: 'map' object has no attribute 'index'
Instead, import autograd.numpy as numpy, which quietly replaces all NumPy functions with their Autograd equivalents:
import autograd.numpy as numpy
def f(x):
    return numpy.sum(x*x)

g = grad(f)
g(numpy.array([1., 2., 3.]))
array([ 2., 4., 6.])
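The same caution applies to math. A minimal sketch, assuming a scalar parameter, where autograd.numpy's sin differentiates cleanly while math.sin would choke on a Node:
def f(x):
    return numpy.sin(x) ** 2  # numpy is autograd.numpy here, so sin handles Nodes

g = grad(f)
g(1.0)  # 2*sin(1)*cos(1) = sin(2), about 0.909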
Use arrays, not lists. Most NumPy functions that expect arrays will quietly convert lists (or any other kind of sequence) into arrays. But (some of) Autograd's functions don't do this; it would be too expensive for them to go digging through lists to look for Node objects.
def f(x, indices):
    xi = [x[i] for i in indices]  # xi is a list
    try:
        return numpy.log(numpy.sum(numpy.exp(xi)))
    except:
        traceback.print_exc()

g = grad(f)
g(numpy.array([1., 2., 3.]), [0, 2])
Traceback (most recent call last):
  File "<ipython-input-15-8e171c8c11f7>", line 4, in f
    return numpy.log(numpy.sum(numpy.exp(xi)))
  File "/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py", line 133, in __call__
    result = self.fun(*argvals, **kwargs)
AttributeError: 'FloatNode' object has no attribute 'exp'
/Users/david/Documents/Teaching/NLP/python/lib/python3.5/site-packages/autograd-1.1.6-py3.5.egg/autograd/core.py:37: UserWarning: Output seems independent of input. Returning zero gradient.
  warnings.warn("Output seems independent of input. Returning zero gradient.")
array([ 0., 0., 0.])
Instead, you must convert lists to arrays:
def f(x, indices):
    xi = [x[i] for i in indices]  # xi is a list
    xi = numpy.array(xi)          # now xi is an array
    try:
        return numpy.log(numpy.sum(numpy.exp(xi)))
    except:
        traceback.print_exc()

g = grad(f)
g(numpy.array([1., 2., 3.]), [0, 2])
array([ 0.11920292, 0. , 0.88079708])
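Alternatively, you can avoid building a Python list altogether by indexing the array with an array of indices. This is a sketch of the same computation; it assumes Autograd can differentiate through NumPy-style fancy indexing on the parameter:
def f(x, indices):
    xi = x[numpy.array(indices)]  # fancy indexing keeps xi an array
    return numpy.log(numpy.sum(numpy.exp(xi)))

g = grad(f)
g(numpy.array([1., 2., 3.]), [0, 2])  # same gradient as above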
Arrays are faster than scalars. If your code performs a lot of scalar operations inside loops, it will work fine, but the gradient will be very slow. This is because Autograd creates a Node for every single operation.
import timeit

def f(x, y):
    z = 0
    for i in range(x.shape[0]):
        z += x[i]*y[i]
    return z

g = grad(f)

def gxy():
    x = numpy.arange(100).astype(float)
    y = numpy.arange(100).astype(float)
    return g(x, y)

print(timeit.timeit(gxy, number=10), 'seconds')
0.05185318301664665 seconds
Instead, rewrite loops using array operations:
def f(x, y):
    return numpy.dot(x, y)

g = grad(f)

def gxy():
    x = numpy.arange(100).astype(float)
    y = numpy.arange(100).astype(float)
    return g(x, y)

print(timeit.timeit(gxy, number=10), 'seconds')
0.0020726250077132136 seconds
Note that this ban on loops only applies to computations involving parameters. Loops that don't involve parameters don't affect Autograd's speed.
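For instance, the poly function from earlier could be restructured so the Python loop touches only the data x, not the parameter a (a sketch; poly2 is just an illustrative name):
def poly2(a, x):
    # the loop builds powers of x, which is data, so it creates no Nodes;
    # the only operation involving the parameter a is a single dot product
    powers = numpy.array([x**i for i in range(len(a))])
    return numpy.dot(a, powers)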