.1. Without using a computer: suppose A is square upper triangular and you perform Gram-Schmidt on the columns of A. What is the best you can say about the square matrix Q whose columns are the Gram-Schmidt vectors? (If you really need the computer, you can generate a triangular matrix with triu(randn(5,5)), for example.)
.2. A square matrix H is said to be upper Hessenberg if its nonzeros are confined to the upper triangle and the diagonal just below the main diagonal. Without using a computer, what in general can you say about the zero/nonzero pattern of the square matrix Q whose columns are the Gram-Schmidt vectors of the columns of H? (If you really need the computer, you can generate such an H with triu(randn(5,5),-1), for example.)
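Problems .1 and .2 invite a computer check; here is a minimal sketch in Python/NumPy (the size 5, the seed, and the tolerance are arbitrary choices, and `np.linalg.qr` produces the same orthonormal vectors as Gram-Schmidt up to signs):

```python
import numpy as np

np.random.seed(0)
A = np.triu(np.random.randn(5, 5))        # upper triangular, like triu(randn(5,5))
H = np.triu(np.random.randn(5, 5), -1)    # upper Hessenberg, like triu(randn(5,5),-1)

Qa, _ = np.linalg.qr(A)   # orthonormalized columns of A
Qh, _ = np.linalg.qr(H)   # orthonormalized columns of H

# look at which entries are numerically nonzero
print(np.abs(Qa) > 1e-10)
print(np.abs(Qh) > 1e-10)
```

The printed boolean patterns suggest what to prove by hand.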
.3. If A is square upper triangular, does the determinant of A depend on the entry in the top right? If so, how?
.4. If A is upper Hessenberg, does the determinant of A depend on the entry in the top right? If so, how?
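For .3 and .4, a quick numerical experiment (a sketch in Python/NumPy; the seed and the perturbation of size 1 are arbitrary) is to change only the top-right entry and watch the determinant:

```python
import numpy as np

np.random.seed(2)
A = np.triu(np.random.randn(5, 5))        # upper triangular
H = np.triu(np.random.randn(5, 5), -1)    # upper Hessenberg

A2, H2 = A.copy(), H.copy()
A2[0, -1] += 1.0                          # change only the top-right entry
H2[0, -1] += 1.0

print(np.linalg.det(A) - np.linalg.det(A2))   # did det(A) change?
print(np.linalg.det(H) - np.linalg.det(H2))   # did det(H) change?
```

A cofactor expansion along the last column explains what the experiment shows.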
.5. In machine learning, a very famous binary classifier is the perceptron. Training data consists of m pairs of vectors xᵢ in ℜⁿ and scalars yᵢ. Very often the yᵢ are ±1, but that is not required. (For example, using the MNIST data set of last week, the xᵢ could be the flattened images of a zero or a one, and the yᵢ could be the labels 0 or 1.)
Let X be the m × (n+1) matrix whose first column is the ones vector and whose remaining n columns contain, in each row, the corresponding xᵢ. Set up, but do not solve, the least squares problem for w in ℜⁿ and b in ℜ whose solution would give the best answer to w⋅xᵢ + b ≈ yᵢ for i=1:m. Simply write down the normal equations without solving for the vector w or the scalar b.
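As a reminder (this is the general recipe, not the answer for this specific X): for any least squares problem A v ≈ y, with A tall and v the unknown vector, the normal equations read

    AᵀA v̂ = Aᵀ y.

The exercise is to identify A, v, and y in the perceptron setup above.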
.6. If the rows of A are independent, is AᵀA invertible? Give a counterexample or argue that there cannot be one.
.7. Give an example of each of the following
(a) A matrix Q that has orthonormal columns but QQᵀ ≠ I.
(b) Two orthogonal vectors that are not linearly independent.
(c) An orthonormal basis for ℜ³ that includes the vector q₁ = (1,1,1)/√3.
.8. Do Gram-Schmidt by hand on a=(1,-1,0,0), b=(0,1,-1,0), c=(0,0,1,-1).
What vector is perpendicular to a, b, and c? What vector is perpendicular to q₁,q₂,q₃? Explain.
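A sketch in Python/NumPy for checking the hand computation of .8 afterwards (note that `np.linalg.qr` may flip the signs of your qᵢ):

```python
import numpy as np

# the vectors a, b, c from problem 8, as columns of a 4×3 matrix
A = np.array([[ 1,  0,  0],
              [-1,  1,  0],
              [ 0, -1,  1],
              [ 0,  0, -1]], dtype=float)

Q, R = np.linalg.qr(A)      # columns of Q are the Gram-Schmidt vectors (up to sign)
print(np.round(Q, 4))

# sanity check: the columns really are orthonormal
print(np.allclose(Q.T @ Q, np.eye(3)))
```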
.9. We have an n×3 matrix A. We perform QR, obtaining a Q that is n×3 and an R that is 3×3. We also perform the SVD and obtain a U that is n×3. Are Q and U the same matrix? Are QQᵀ and UUᵀ the same matrix? Explain your answers. It is okay to use the computer to check or investigate, but this problem is best done without one.
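Problem .9 explicitly allows a computer check; a sketch in Python/NumPy with an arbitrary full-rank example (n = 7 and the seed are assumptions):

```python
import numpy as np

np.random.seed(1)
A = np.random.randn(7, 3)                          # a generic full-rank 7×3 matrix

Q, R = np.linalg.qr(A)                             # thin QR: Q is 7×3
U, s, Vt = np.linalg.svd(A, full_matrices=False)   # thin SVD: U is 7×3

print(np.allclose(Q, U))                 # are Q and U the same?
print(np.allclose(Q @ Q.T, U @ U.T))     # are QQᵀ and UUᵀ the same?
```

The experiment only suggests the answer; the real task is to explain it.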
.10. A student wanted to draw the normalized Legendre polynomials using QR. Remembering that the normalized Legendre polynomials are what the Gram-Schmidt process computes from the monomials 1, x, x², x³, …, the student ran QR on the matrix M defined in the next cell. (Warning: the standard definition of the Legendre polynomials does not orthonormalize but rather sets P(1)=1.)
h=.01
x = -1:h:1 # range of numbers from -1 to 1 with stepsize h
M = x.^(0:5)' # matrix of 0th power, 1st power, ... , 5th power of x
201×6 Array{Float64,2}: rows run through (1, x, x², x³, x⁴, x⁵) for x = -1.0, -0.99, …, 1.0
using PyPlot
plot(x,M)
Q,R = qr(M)
plot(x,Q/√h)
Explain why the sqrt of h is necessary in the cell above to get a discrete approximation to the continuous orthonormal Legendre polynomials.
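A hint (this is the Riemann-sum relation between the discrete and continuous inner products, not the full argument): for functions sampled at the grid points xᵢ with spacing h,

    ∫₋₁¹ f(x) g(x) dx ≈ h Σᵢ f(xᵢ) g(xᵢ),

while the columns of Q are orthonormal in the unweighted sense Σᵢ qᵢ² = 1.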
.11. Show that if λ>0, then XᵀX + λI is invertible, regardless of whether X has independent columns. (This matrix arises in regularization, which is a fancy word for fixing a numerical difficulty with a kind of hack.)
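A quick numerical illustration of .11 (a sketch in Python/NumPy; the rank-deficient X and λ = 0.1 are arbitrary choices, and a computation is of course not a proof):

```python
import numpy as np

# X deliberately has dependent columns (second = 2 × first)
X = np.array([[1., 2.],
              [2., 4.],
              [3., 6.]])

G = X.T @ X
lam = 0.1

print(np.linalg.matrix_rank(G))              # 1, so XᵀX itself is singular
print(np.linalg.det(G + lam * np.eye(2)))    # nonzero, so XᵀX + λI is invertible
```

For the proof, consider vᵀ(XᵀX + λI)v for a nonzero vector v.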