using Plots, ComplexPhasePortrait, ApproxFun, SingularIntegralEquations, DifferentialEquations
gr();
Dr. Sheehan Olver
We now introduce orthogonal polynomials (OPs). These are ___fundamental___ for computational mathematics, with applications in function approximation, quadrature, and solving differential equations.
We will investigate the properties of general OPs, and also those of the classical OPs.
A good reference is the Digital Library of Mathematical Functions, Chapter 18.
In this lecture we do the following: construct classical OPs (Hermite and Jacobi) in Julia and verify their orthogonality; build a new OP family via Gram–Schmidt; and use OP expansions for polynomial and function approximation.
Let $p_0(x), p_1(x), p_2(x), \ldots$ be a sequence of polynomials such that $p_n(x)$ is exactly degree $n$, that is,
$$p_n(x) = k_n x^n + O(x^{n-1})$$
where $k_n \neq 0$.
Let $w(x)$ be a continuous weight function on a (possibly infinite) interval $(a,b)$: that is, $w(x) \geq 0$ for all $a < x < b$. This induces an inner product
$$\langle f, g \rangle := \int_a^b f(x) g(x) w(x) \,{\rm d}x.$$
We say that $\{p_n\}$ are orthogonal w.r.t. the weight $w$ if $\langle p_n, p_m \rangle = 0$ for $n \neq m$.
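For example, we can compute an inner product in Julia (a minimal sketch with the constant weight $w(x) = 1$ on $(-1,1)$; recall that `sum` integrates a `Fun` over its domain):
x = Fun()              # identity function on -1..1
w = Fun(1, -1 .. 1)    # constant weight w(x) = 1
sum(x*x^2*w)           # ⟨x,x²⟩ = ∫₋₁¹ x³ dx = 0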
Orthogonal polynomials are not unique: we can multiply each $p_n$ by a different nonzero constant, giving $\tilde p_n(x) = c_n p_n(x)$, and $\tilde p_n$ will still be orthogonal w.r.t. $w$. However, if we specify $k_n$, this is sufficient to uniquely define them:
Proposition (Uniqueness of OPs I) Given a non-zero $k_n$, there is a unique polynomial $p_n(x) = k_n x^n + O(x^{n-1})$ orthogonal w.r.t. $w$ to all lower degree polynomials.
Proof Suppose $r_n(x) = k_n x^n + O(x^{n-1})$ is another OP w.r.t. $w$. We want to show $p_n - r_n$ is zero. But this is a polynomial of degree $< n$, hence
$$p_n(x) - r_n(x) = \sum_{k=0}^{n-1} c_k p_k(x)$$
for some constants $c_k$. Taking the inner product with $p_j$ for $j < n$ gives $c_j \|p_j\|^2 = \langle p_n - r_n, p_j \rangle = 0$, since both $p_n$ and $r_n$ are orthogonal to $p_j$. Hence every $c_j$ is zero and $p_n = r_n$.

⬛️
Corollary (Uniqueness of OPs II) If $q_n$ and $p_n$ are both orthogonal w.r.t. $w$ to all lower degree polynomials, then $q_n(x) = C p_n(x)$ for some constant $C$.
If $k_n = 1$, that is,
$$p_n(x) = x^n + O(x^{n-1}),$$
then we refer to the orthogonal polynomials as monic.
Monic OPs are unique as we have specified $k_n$.
If $\|p_n\| = 1$, then we refer to the orthogonal polynomials as orthonormal w.r.t. $w$. We will usually use $q_n$ when they are orthonormal. Note they are still not quite unique: we can multiply each $q_n$ by $\pm 1$ without changing the norm.
Remark The classical OPs are not monic or orthonormal (apart from one case). Many people make the mistake of using orthonormal polynomials for computations. But there is a good reason to use classical OPs: their properties result in rational formulae, whereas orthonormal polynomials require square roots. This makes a big performance difference.
Classical orthogonal polynomials are orthogonal with respect to the following three weights:
Name | Interval $(a,b)$ | Weight function $w(x)$ | Standard polynomial | Highest order coefficient $k_n$ |
---|---|---|---|---|
Hermite | $(-\infty,\infty)$ | $e^{-x^2}$ | $H_n(x)$ | $2^n$ |
Laguerre | $(0,\infty)$ | $x^\alpha e^{-x}$ | $L_n^{(\alpha)}(x)$ | See Table 18.3.1 |
Jacobi | $(-1,1)$ | $(1-x)^\alpha (1+x)^\beta$ | $P_n^{(\alpha,\beta)}(x)$ | See Table 18.3.1 |
Note that, by convention, the parameters for Jacobi polynomials appear in the "wrong" order: in ApproxFun one writes `Jacobi(β,α)`.
We can actually construct these polynomials in Julia: first consider Hermite:
H₀ = Fun(Hermite(), [1])
H₁ = Fun(Hermite(), [0,1])
H₂ = Fun(Hermite(), [0,0,1])
H₃ = Fun(Hermite(), [0,0,0,1])
H₄ = Fun(Hermite(), [0,0,0,0,1])
H₅ = Fun(Hermite(), [0,0,0,0,0,1])
xx = -4:0.01:4
plot(xx, H₀.(xx); label="H_0", ylims=(-400,400))
plot!(xx, H₁.(xx); label="H_1", ylims=(-400,400))
plot!(xx, H₂.(xx); label="H_2", ylims=(-400,400))
plot!(xx, H₃.(xx); label="H_3", ylims=(-400,400))
plot!(xx, H₄.(xx); label="H_4", ylims=(-400,400))
plot!(xx, H₅.(xx); label="H_5")
We verify their orthogonality:
w = Fun(GaussWeight(), [1.0])   # Gaussian weight w(x) = exp(-x^2)
@show sum(H₂*H₅*w) # means integrate
@show sum(H₅*H₅*w);
sum(H₂ * H₅ * w) = 0.0
sum(H₅ * H₅ * w) = 6806.222787477181
Now Jacobi:
α,β = 0.1,0.2
P₀ = Fun(Jacobi(β,α), [1])
P₁ = Fun(Jacobi(β,α), [0,1])
P₂ = Fun(Jacobi(β,α), [0,0,1])
P₃ = Fun(Jacobi(β,α), [0,0,0,1])
P₄ = Fun(Jacobi(β,α), [0,0,0,0,1])
P₅ = Fun(Jacobi(β,α), [0,0,0,0,0,1])
xx = -1:0.01:1
plot( xx, P₀.(xx); label="P_0^($α,$β)", ylims=(-2,2))
plot!(xx, P₁.(xx); label="P_1^($α,$β)")
plot!(xx, P₂.(xx); label="P_2^($α,$β)")
plot!(xx, P₃.(xx); label="P_3^($α,$β)")
plot!(xx, P₄.(xx); label="P_4^($α,$β)")
plot!(xx, P₅.(xx); label="P_5^($α,$β)")
w = Fun(JacobiWeight(β,α), [1.0])   # Jacobi weight w(x) = (1-x)^α*(1+x)^β
@show sum(P₂*P₅*w) # means integrate
@show sum(P₅*P₅*w);
sum(P₂ * P₅ * w) = 2.1250362580715887e-17
sum(P₅ * P₅ * w) = 0.2171335824839316
There are special families of Jacobi weights with their own names.
Name | Jacobi parameters | Weight function $w(x)$ | Standard polynomial | Highest order coefficient $k_n$ |
---|---|---|---|---|
Jacobi | $\alpha,\beta$ | $(1-x)^\alpha(1+x)^\beta$ | $P_n^{(\alpha,\beta)}(x)$ | See Table 18.3.1 |
Legendre | $0,0$ | $1$ | $P_n(x)$ | $2^n (1/2)_n / n!$ |
Chebyshev (first kind) | $-\frac{1}{2},-\frac{1}{2}$ | $\frac{1}{\sqrt{1-x^2}}$ | $T_n(x)$ | $1\ (n=0),\ 2^{n-1}\ (n\neq 0)$ |
Chebyshev (second kind) | $\frac{1}{2},\frac{1}{2}$ | $\sqrt{1-x^2}$ | $U_n(x)$ | $2^n$ |
Ultraspherical | $\lambda-\frac{1}{2},\lambda-\frac{1}{2}$ | $(1-x^2)^{\lambda-1/2}$, $\lambda\neq 0$ | $C_n^{(\lambda)}(x)$ | $2^n(\lambda)_n/n!$ |
Note that, other than Legendre, these polynomials have a different normalization than $P_n^{(\alpha,\beta)}$:
T₂ = Fun(Chebyshev(), [0.0,0,1])
P₂ = Fun(Jacobi(-1/2,-1/2), [0.0,0,1])
plot(T₂; label="T_2", title="T_2 is C*P_2 for some C")
plot!(P₂; label="P_2")
But because they are orthogonal w.r.t. the same weight, they must be a constant multiple of each other.
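As a quick numerical check (a sketch reusing `T₂` and `P₂` from above; estimating the constant by evaluating at $x = 1$ is an arbitrary choice), we can confirm the two polynomials agree up to scaling:
C = T₂(1)/P₂(1)                          # estimate the constant relating the normalizations
xx = -1:0.01:1
maximum(abs.(T₂.(xx) .- C .* P₂.(xx)))   # ≈ 0 up to rounding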
Chebyshev polynomials are pretty much the only OPs with simple closed-form expressions.
Proposition (Chebyshev first kind formula)
$$T_n(x) = \cos(n \arccos x)$$
Proof We first show that they are orthogonal w.r.t. $1/\sqrt{1-x^2}$. Too easy: substitute $x = \cos\theta$, ${\rm d}x = -\sin\theta\,{\rm d}\theta$ to get (for $n \neq m$)
$$\int_{-1}^1 \cos(n \arccos x)\cos(m \arccos x) \frac{{\rm d}x}{\sqrt{1-x^2}} = \int_0^\pi \cos n\theta \cos m\theta \,{\rm d}\theta = \int_0^\pi \frac{e^{{\rm i}(-n-m)\theta} + e^{{\rm i}(n-m)\theta} + e^{{\rm i}(m-n)\theta} + e^{{\rm i}(n+m)\theta}}{4} \,{\rm d}\theta = 0$$
We then need to show it has the right highest order term $k_n$. Note that $k_0 = k_1 = 1$. Using $z = e^{{\rm i}\theta}$ we see that $\cos n\theta$ has a simple recurrence for $n = 2,3,\ldots$:
$$\cos n\theta = \frac{z^n + z^{-n}}{2} = 2\frac{z + z^{-1}}{2}\frac{z^{n-1} + z^{1-n}}{2} - \frac{z^{n-2} + z^{2-n}}{2} = 2\cos\theta\cos(n-1)\theta - \cos(n-2)\theta.$$
In other words, $T_n(x) = 2x T_{n-1}(x) - T_{n-2}(x)$, hence $k_n = 2k_{n-1} = 2^{n-1}$ for $n \geq 1$.
⬛️
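We can sanity-check both the closed-form expression and the recurrence numerically (a minimal sketch using the `Chebyshev()` basis; the evaluation point $x = 0.3$ is an arbitrary choice):
T₄ = Fun(Chebyshev(), [0,0,0,0,1])
T₅ = Fun(Chebyshev(), [0,0,0,0,0,1])
T₆ = Fun(Chebyshev(), [0,0,0,0,0,0,1])
x = 0.3
@show T₅(x) ≈ cos(5acos(x))        # closed-form expression
@show T₆(x) ≈ 2x*T₅(x) - T₄(x);    # three-term recurrence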
Proposition (Chebyshev second kind formula)
$$U_n(x) = \frac{\sin((n+1)\arccos x)}{\sin(\arccos x)}$$
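A similar numerical check applies (a sketch assuming ApproxFun's `Ultraspherical(1)` space represents the second kind polynomials $U_n$; $x = 0.3$ is again arbitrary):
U₃ = Fun(Ultraspherical(1), [0,0,0,1])
x = 0.3
U₃(x) ≈ sin(4acos(x))/sin(acos(x))   # second kind formula with n = 3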
In general we don't have nice formulae. But we can always construct them via Gram–Schmidt:
Proposition (Gram–Schmidt) Define
$$p_0(x) = 1, \qquad q_0(x) = \frac{p_0(x)}{\|p_0\|},$$
$$p_{n+1}(x) = x q_n(x) - \sum_{k=0}^n \langle x q_n, q_k \rangle q_k(x), \qquad q_{n+1}(x) = \frac{p_{n+1}(x)}{\|p_{n+1}\|}.$$
Then $q_0(x), q_1(x), \ldots$ are orthonormal w.r.t. $w$.
Proof By linearity we have, for $j \leq n$,
$$\langle p_{n+1}, q_j \rangle = \left\langle x q_n - \sum_{k=0}^n \langle x q_n, q_k \rangle q_k, q_j \right\rangle = \langle x q_n, q_j \rangle - \langle x q_n, q_j \rangle \langle q_j, q_j \rangle = 0,$$
since $\langle q_k, q_j \rangle = 0$ for $k \neq j$ and $\langle q_j, q_j \rangle = 1$.
⬛️
Let's make our own family:
x = Fun()                      # identity function on the default domain -1..1
w = exp(x)                     # weight w(x) = exp(x)
ip = (f,g) -> sum(f*g*w)       # inner product ⟨f,g⟩ (sum integrates over the domain)
nrm = f -> sqrt(ip(f,f))       # induced norm ‖f‖
n = 10
q = Array{Fun}(undef,n)
p = Array{Fun}(undef,n)
p[1] = Fun(1, -1 .. 1 )
q[1] = p[1]/nrm(p[1])
for k=1:n-1
p[k+1] = x*q[k]
for j=1:k
p[k+1] -= ip(p[k+1],q[j])*q[j]
end
q[k+1] = p[k+1]/nrm(p[k+1])
end
sum(q[2]*q[4]*w)
2.338624086051233e-16
p = plot(; legend=false)
for k=1:10
plot!(q[k])
end
p
A basic usage of orthogonal polynomials is for polynomial approximation. Suppose $f(x)$ is a degree $n-1$ polynomial. Since $\{p_0(x), \ldots, p_{n-1}(x)\}$ span all degree $n-1$ polynomials, we know that
$$f(x) = \sum_{k=0}^{n-1} f_k p_k(x),$$
where, by orthogonality, the coefficients are given by $f_k = \frac{\langle f, p_k \rangle}{\langle p_k, p_k \rangle}$.
Here, we demonstrate this with Chebyshev polynomials:
f = Fun(x -> 1 + x + x^2 + x^3, Chebyshev())
f₀, f₁, f₂, f₃ = f.coefficients
@show f₀, f₁, f₂, f₃
x = 0.1
@show f₀*1 + f₁*x + f₂*cos(2acos(x)) + f₃*cos(3acos(x))
@show 1 + x + x^2 + x^3;
(f₀, f₁, f₂, f₃) = (1.5, 1.7499999999999998, 0.5, 0.25000000000000006)
f₀ * 1 + f₁ * x + f₂ * cos(2 * acos(x)) + f₃ * cos(3 * acos(x)) = 1.111
1 + x + x ^ 2 + x ^ 3 = 1.111
plot(Fun(exp))
plot!(Fun(t-> exp(cos(t)), -pi .. pi))
x = Fun()
ip = (f,g) -> sum(f*g/sqrt(1-x^2))   # inner product with the Chebyshev weight
T₂ = cos.(2 .* acos.(x))             # T₂ constructed from the closed-form expression
ip(T₂,f)/ip(T₂,T₂) # gives back f₂
0.5
Of course, if the $p_k$ are orthonormal then we don't need the denominator.
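As a minimal sketch (reusing `ip`, `T₂`, and `f` from above), normalizing $T_2$ lets us read off its coefficient with a single inner product:
q₂ = T₂/sqrt(ip(T₂,T₂))   # orthonormalized so that ip(q₂,q₂) ≈ 1
ip(q₂,f)                  # coefficient of q₂ in the expansion of f, no denominator needed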
This can be extended to function approximation. Provided the sum converges absolutely and uniformly in $x$, we can write
$$f(x) = \sum_{k=0}^\infty f_k p_k(x).$$
Here we see that $e^x$ can be approximated by a Chebyshev expansion using 14 coefficients and is accurate to 16 digits:
f = Fun(x -> exp(x), Chebyshev())
@show f.coefficients
@show ncoefficients(f)
@show f(0.1) # equivalent to f.coefficients'*[cos(k*acos(0.1)) for k=0:ncoefficients(f)-1]
@show exp(0.1);
f.coefficients = [1.26607, 1.13032, 0.271495, 0.0443368, 0.00547424, 0.000542926, 4.49773e-5, 3.19844e-6, 1.99212e-7, 1.10368e-8, 5.5059e-10, 2.49796e-11, 1.03911e-12, 3.98969e-14]
ncoefficients(f) = 14
f(0.1) = 1.1051709180756477
exp(0.1) = 1.1051709180756477
The accuracy of this approximation is typically dictated by the smoothness of $f$: the more times we can differentiate, the faster it converges. For analytic functions, the rate is dictated by the domain of analyticity, just as for Laurent/Fourier series. In the case above, $e^x$ is entire, hence we get faster-than-exponential convergence.
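To see the contrast (an illustrative sketch; the fixed expansion length 1000 is an arbitrary choice), the Chebyshev coefficients of the merely Lipschitz function $|x|$ decay only algebraically:
g = Fun(x -> abs(x), Chebyshev(), 1000)   # fixed-length Chebyshev expansion of |x|
plot(abs.(g.coefficients) .+ 1E-40; yscale=:log10, label="|x| coefficients")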
Chebyshev expansions work even when Taylor series do not. For example, the following function has poles at $\pm{\rm i}/5$, which means the radius of convergence of its Taylor series is $|x| < 1/5$, but Chebyshev polynomials continue to work on $[-1,1]$:
f = Fun( x -> 1/(25x^2 + 1), Chebyshev())
@show ncoefficients(f)
plot(f)
ncoefficients(f) = 189
This can be explained for Chebyshev expansions by noting that a Chebyshev expansion is a cosine expansion, that is, the Fourier expansion of an even function:
$$f(x) = \sum_{k=0}^\infty f_k T_k(x) \quad\Leftrightarrow\quad f(\cos\theta) = \sum_{k=0}^\infty f_k \cos k\theta$$
f = x -> 1/(25x^2 + 1)
portrait(-3..3, -3..3, z -> f((z+1/z)/2))
Under the Joukowski map $x = (z + z^{-1})/2$, the poles at $x = \pm{\rm i}/5$ correspond to $|z| = (1+\sqrt{26})/5 \approx 1.2198$. Hence we predict a rate of decay of about $1.2198^{-k}$:
f = Fun( x -> 1/(25x^2 + 1), Chebyshev())
plot(abs.(f.coefficients) .+ 1E-40; yscale=:log10, label="Chebyshev coefficients")
plot!( 1.2198.^(-(0:ncoefficients(f))); label="R^(-k)")
Sometimes, we want to incorporate the weight into the approximation. This typically takes one of two forms, depending on the application:
$$f(x) = w(x) \sum_{k=0}^\infty f_k p_k(x) \qquad\hbox{or}\qquad f(x) = \sqrt{w(x)} \sum_{k=0}^\infty f_k p_k(x).$$
This is often the case with Hermite polynomials: on the real line, polynomial approximation is unnatural unless the function being approximated is itself a polynomial, as otherwise the behaviour at $\infty$ is inconsistent; what we really want is weighted approximation. Thus we can use either
$$f(x) = e^{-x^2} \sum_{k=0}^\infty f_k H_k(x) \qquad\hbox{or}\qquad f(x) = e^{-x^2/2} \sum_{k=0}^\infty f_k H_k(x).$$
f = Fun(x -> 1+x +x^2, Hermite())
f(0.10)
1.109999999999997
# nonsense: trying to approximate sech(x) by a degree-50 polynomial
f = Fun(x -> sech(x), Hermite(), 51)
xx = -8:0.01:8
plot(xx, sech.(xx); ylims=(-10,10), label="sech x")
plot!(xx, f.(xx); label="f")
# weighting by sqrt(w(x)) = exp(-x^2/2) works
f = Fun(x -> sech(x), GaussWeight(Hermite(),1/2),101)
plot(xx, sech.(xx); ylims=(-10,10), label="sech x")
plot!(xx, f.(xx); label="f")
# weighted by w(x) = exp(-x^2) breaks again
f = Fun(x -> sech(x), GaussWeight(Hermite()),101)
plot(xx, sech.(xx); ylims=(-10,10), label="sech x")
plot!(xx, f.(xx); label="f")
Note that the correctly weighted Hermite polynomials, that is, weighted by $\sqrt{w(x)} = e^{-x^2/2}$, look "nice":
p = plot()
for k=0:6
H_k = Fun(GaussWeight(Hermite(),1/2),[zeros(k);1])
plot!(xx, H_k.(xx); label="H_$k")
end
p
Compare this to weighting by $w(x) = e^{-x^2}$:
p = plot()
for k=0:6
H_k = Fun(GaussWeight(Hermite()),[zeros(k);1])
plot!(xx, H_k.(xx); label="H_$k")
end
p