An intro to Julia with Autodiff in 25 minutes

Largely inspired from Prof. Alan Edelman's talk Automatic Differentiation in 10 minutes with Julia

1- Dual numbers

In linear algebra, the dual numbers extend the real numbers by adjoining one new element $\varepsilon$ (epsilon) with the property $\varepsilon^2 = 0$. Thus the multiplication of dual numbers is given by: $$(a+b\varepsilon )(c+d\varepsilon )=ac+(ad+bc)\varepsilon,$$ and addition is done componentwise: $$(a+b\varepsilon )+(c+d\varepsilon )=(a+c)+(b+d)\varepsilon.$$

Using Julia, we can create a data type for dual numbers as follows:

In [ ]:
struct Dual <: Number
    value::Float64
    epsilon::Float64
end

We now define addition, substraction and multiplication for dual numbers. To do this, we define a new method to the Julia Base operators +,-,*.

In [ ]:
import Base: +, -, *
+(z::Dual, w::Dual) = Dual(z.value+w.value, z.epsilon+w.epsilon)
-(z::Dual, w::Dual) = Dual(z.value-w.value, z.epsilon-w.epsilon)
*(z::Dual, w::Dual) = Dual(z.value*w.value, z.value*w.epsilon+z.epsilon*w.value)

For convenience, we also define a new display function for dual numbers.

In [ ]:
Base.show(io::IO,x::Dual) = print(io,x.value," + ",x.epsilon," ε")
In [ ]:
a=Dual(2,1)

We are now ready to start playing with dual numbers.

In [ ]:
a*a
In [ ]:
a^2
In [ ]:
a^7

This last two commands are rather surprising as we never define the power of a dual numbe! But it turns out that in Julia power is defined for any x supporting * as can be seen here or by following the link given by the following command:

In [ ]:
@which a^7
In [ ]:
2*a

To correct this problem, we need to:

  • convert the real 2 into the Dual(2,0)
  • tell Julia to make this conversion each time there is an expression implying dual numbers and reals

This is what is called conversion and promotion and here it can be done as follows:

In [ ]:
import Base: convert, promote_rule
convert(::Type{Dual}, x::Real) = Dual(x,zero(x))
promote_rule(::Type{Dual}, ::Type{<:Number}) = Dual
In [ ]:
2*a
In [ ]:
a-1
In [ ]:
b=Dual(1,1)
3*(a+b)^2

2- Automatic differentiation for polynomials

We can now get derivatives for polynomials. Consider $P(x) = p_0+p_1x+p_2x^2+\dots +p_n x^n$, note that since $\varepsilon^2 =0$ ($\varepsilon$ is nilpotent), we have $(x+\varepsilon y)^k = x^k + kx^{k-1}\varepsilon y$. Hence, we have $$ P(x+\varepsilon y) = P(x) + \left(p_1 + 2p_2 x+ \dots np_n x^{n-1}\right)y\varepsilon = P(x) + P'(x)y\varepsilon $$

More info on Wikipedia

In [ ]:
a
In [ ]:
3*(1+a)^2
In [ ]:
value(z::Dual) = z.value
epsilon(z::Dual) = z.epsilon
In [ ]:
epsilon(3*(1+a)^2)

We can now define a simple function to compute the derivative of a polynomials at a given point as follows:

In [ ]:
function derivative(f,x::Real)
    epsilon(f(Dual(x,1)))
end
In [ ]:
derivative(x->1+x+3x^2,1)

3- Going further and dealing with $\sqrt{}$

In [ ]:
a^(1/2)

To deal with this problem, we follow the Babylonian as described in this nice tutorial

Repeat $ t \leftarrow \frac{1}{2}\left(t+\frac{x}{t}\right)$ until $t$ converges to $\sqrt{x}$.

In [ ]:
function Babylonian(x; N = 10) 
    t = (1+x)/2
    for i = 2:N; t=(t+x/t)/2  end    
    t
end
In [ ]:
Babylonian(2), 2 # Type \sqrt+<tab> to get the symbol

The Babylonian algorithm uses only addition and division, hence we need to define a division for dual numbers: $$\frac{a+b\varepsilon}{c+d\varepsilon}= \frac{a}{c}\left(1+\frac{b}{a}\varepsilon\right)\left(1-\frac{d}{c}\varepsilon\right)= \frac{a}{c} +\frac{bc-ad}{c^2}\varepsilon.$$

In [ ]:
import Base:/
/(x::Dual, y::Dual) = Dual(x.value/y.value, (y.value*x.epsilon - x.value*y.epsilon)/y.value^2)
In [ ]:
a
In [ ]:
1/(1+a)
In [ ]:
Babylonian(a), 2, 0.5/√2
In [ ]:
derivative(x->Babylonian(x),2)
In [ ]:
function dBabylonian(x; N = 10) 
    t = (1+x)/2
    dt = 1/2
    for i = 1:N;  
        t = (t+x/t)/2; 
        dt = (dt+(t-x*dt)/t^2)/2; 
    end    
    dt
end
In [ ]:
x = 2; dBabylonian(x), .5/√x
In [ ]:
derivative(x->Babylonian(x)^4,4)
In [ ]:
derivative(x->(1+3*Babylonian(x))^3/Babylonian(x),2)
In [ ]:
using Pkg; Pkg.add("ForwardDiff")
using ForwardDiff
In [ ]:
ForwardDiff.derivative(sqrt, 2)
In [ ]:
ForwardDiff.derivative(Babylonian, 2)
In [ ]:
ForwardDiff.derivative(x->(1+3*sqrt(x))^3/sqrt(x),2)

For more on dual numbers, have a look at the (not active anymore) Julia package DualNumbers.jl

For more ressources dataflowr

In [ ]: