# Problem Sheet 1


## Part A

### Problem 1.1

Find examples of

(a) A function $f\in C^2(\R)$ with a strict minimizer $x^*$ such that $f''(x^*)=0$ (that is, the second derivative is not positive at the minimizer).

(b) A function $f\colon \R\to \R$ with a strict minimizer $x^*$ that is not an isolated local minimizer. Hint: Consider a rapidly oscillating function that has minima that are arbitrarily close together, but not equal.

### Problem 1.2

For this problem you might want to recall some linear algebra.

(a) Let $\mtx{A}\in \R^{n\times n}$ be a symmetric matrix, $\vct{b}\in \R^n$ and $c\in \R$. Show that the quadratic function

\begin{equation*} f(\vct{x}) = \frac{1}{2}\vct{x}^{\trans}\mtx{A}\vct{x}+\vct{b}^{\trans}\vct{x}+c \end{equation*}

is convex if and only if $\mtx{A}$ is positive semidefinite.

(b) Now let $\mtx{A}\in \R^{m\times n}$ be an arbitrary matrix. Show that the function

\begin{equation*} f(\vct{x}) = \norm{\mtx{A}\vct{x}-\vct{b}}_2^2 \end{equation*}

is convex (the $2$-norm is defined by $\norm{\vct{x}}_2^2=\vct{x}^{\trans}\vct{x}$). Moreover, show that if $m\geq n$ and the matrix $\mtx{A}$ has rank $n$, then $f$ is strictly convex and the unique minimizer is

\begin{equation*} \vct{x}^* = (\mtx{A}^{\trans}\mtx{A})^{-1}\mtx{A}^{\trans}\vct{b}. \end{equation*}
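Once you have proved the formula, a quick numerical sanity check (not a proof) is to compare it against a library least-squares solver on a random full-column-rank instance; the dimensions chosen here are arbitrary:

```python
import numpy as np

# For a random A with full column rank, the closed-form minimizer
# x* = (A^T A)^{-1} A^T b should agree with the least-squares solution
# computed by numpy.
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))  # full column rank with probability 1
b = rng.standard_normal(m)

x_formula = np.linalg.solve(A.T @ A, A.T @ b)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_formula, x_lstsq))  # True
```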

### Problem 1.3

A set $S\subseteq \R^n$ is called convex if for any $\vct{x},\vct{y}\in S$ and $\lambda\in [0,1]$,

\begin{equation*} \lambda \vct{x}+(1-\lambda)\vct{y}\in S. \end{equation*}

In words, for any two points in $S$, the line segment joining them is also in $S$. Which of the following sets are convex?

(a) $S = \{\vct{x}\in \R^3 \mid \norm{\vct{x}}_2=1\}$ (the unit sphere);

(b) $S = \{\vct{x}\in \R^2 \mid 1\leq x_1-x_2< 2\}$;

(c) $S = \{\vct{x}\in \R^n \mid |x_1|+\cdots+|x_n|\leq 1\}$;

(d) $S = \mathcal{S}^n_{+}\subset \R^{n\times n}$, the set of symmetric, positive semidefinite matrices.

### Problem 1.4

For this problem we generalize the notion of convexity to functions not necessarily defined on all of $\R^n$. Denote by $\dom f$ the domain of $f$, i.e., the set of $\vct{x}$ on which $f(\vct{x})$ attains a finite value. A function $f$ is called convex if $\dom f$ is a convex set and for all $\vct{x},\vct{y}\in \dom f$ and $\lambda\in [0,1]$,

\begin{equation*} f(\lambda \vct{x}+(1-\lambda)\vct{y})\leq \lambda f(\vct{x})+(1-\lambda)f(\vct{y}). \end{equation*}

Which of the following functions are convex?

(a) $f(x) = \log(x)$ on $\R_{++}$ (the positive real numbers);

(b) $f(x) = x^4$ on $\R$;

(c) $f(\vct{x}) = x_1x_2$ on $\R^2_{++}$;

(d) $f(\vct{x}) = x_1/x_2$ on $\R^2_{++}$;

(e) $f(x)=e^x-1$ on $\R$;

(f) $f(\vct{x}) = \max_i x_i$ on $\R^n$.
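A numerical check of the defining inequality at random points can guide your intuition before you attempt a proof: a single violation disproves convexity, while finding no violations merely suggests (and does not prove) it. This sketch uses an arbitrary tolerance and sampling scheme:

```python
import numpy as np

# Test the convexity inequality
#   f(l*x + (1-l)*y) <= l*f(x) + (1-l)*f(y)
# at randomly sampled pairs of points from the domain.
def seems_convex(f, sample, trials=1000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x, y = sample(rng), sample(rng)
        lam = rng.uniform()
        if f(lam * x + (1 - lam) * y) > lam * f(x) + (1 - lam) * f(y) + 1e-9:
            return False  # a counterexample was found
    return True

sample_Rn = lambda rng: rng.uniform(-5, 5, size=3)       # points in R^3
sample_R2pp = lambda rng: rng.uniform(0.1, 5.0, size=2)  # points in R^2_{++}

print(seems_convex(np.max, sample_Rn))                 # (f): no violation found
print(seems_convex(lambda v: v[0] * v[1], sample_R2pp))  # (c): violation found
```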

## Part B

### Problem 1.5

In engineering applications (for example, in the synthesis of linear time-invariant dynamical systems) one sometimes encounters a problem of the form

\begin{equation*} \minimize \norm{\mtx{A}\vct{x}-\vct{b}}_\infty, \end{equation*}

with $\mtx{A}\in \R^{m\times n}$ and $\vct{b}\in \R^m$, where $\norm{\vct{x}}_\infty = \max_{1\leq i\leq n} |x_i|$ denotes the $\infty$-norm.

(a) Draw the unit ball $\{\vct{x}\in \R^2 \mid \norm{\vct{x}}_\infty \leq 1\}$.

(b) Formulate a linear programming problem $\mathcal{P}$ with decision variables $(\vct{x},t)$, such that if $(\vct{x}^*,t^*)$ is the unique minimizer of $\mathcal{P}$, then $\vct{x}^*$ is the unique minimizer of the above problem.

Even though our problem is not a linear programming problem (the objective is not linear), it is equivalent to one, in the sense that a minimizer can be read off the solution of a linear programming problem.
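Once you have derived a reformulation for (b), you can check it numerically. The sketch below feeds one standard epigraph reformulation to `scipy.optimize.linprog` (assuming scipy is available), reusing the data from Problem 1.6 as a concrete instance; treat it as a check for your own derivation, not as the intended proof:

```python
import numpy as np
from scipy.optimize import linprog

# Epigraph reformulation: minimize t subject to
#   -t <= (A x - b)_i <= t  for all i,
# written with decision variable z = (x, t).
A = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, 0.0]])
b = np.array([10.0, -1.0, 0.0])
m, n = A.shape

c = np.zeros(n + 1)
c[-1] = 1.0  # objective: minimize t

ones = np.ones((m, 1))
A_ub = np.block([[A, -ones], [-A, -ones]])  # A x - t 1 <= b and -A x - t 1 <= -b
b_ub = np.concatenate([b, -b])

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * n + [(0, None)])  # x free, t >= 0
x_star, t_star = res.x[:n], res.x[-1]
print(t_star, np.linalg.norm(A @ x_star - b, np.inf))  # the two values agree
```

Note that `linprog` bounds all variables to be nonnegative by default, so the `bounds` argument must explicitly leave $\vct{x}$ unconstrained.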

### Problem 1.6

Using Python or another computing system, compute and plot the sequence of points $\vct{x}_k$, starting with $\vct{x}_0=(0,0)^{\trans}$, for the gradient descent algorithm for the problem

\begin{equation*} \minimize \norm{\mtx{A}\vct{x}-\vct{b}}_2^2 \end{equation*}

with data

\begin{equation*} \mtx{A} = \begin{pmatrix} 1 & 2\\ 2 & 1\\ -1 & 0 \end{pmatrix}, \quad \vct{b} = \begin{pmatrix} 10\\-1\\0 \end{pmatrix}. \end{equation*}
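A minimal gradient-descent sketch for this problem: the gradient of $f(\vct{x})=\norm{\mtx{A}\vct{x}-\vct{b}}_2^2$ is $2\mtx{A}^{\trans}(\mtx{A}\vct{x}-\vct{b})$, and the constant step size $0.05$ is an assumption (any step below $1/\lambda_{\max}(\mtx{A}^{\trans}\mtx{A})$ converges here); tune it and the iteration count as needed:

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, 0.0]])
b = np.array([10.0, -1.0, 0.0])

x = np.zeros(2)    # x_0 = (0, 0)^T
step = 0.05        # assumed constant step size
path = [x.copy()]  # record the iterates x_k for plotting
for _ in range(200):
    x = x - step * 2 * A.T @ (A @ x - b)  # gradient step
    path.append(x.copy())

print(path[-1])  # approaches the least-squares solution (-18/7, 41/7)
```

The recorded `path` can then be plotted, e.g. with `matplotlib.pyplot.plot` on the two coordinates of the iterates.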

### Problem 1.7

Consider the Rosenbrock function in $\R^2$,

\begin{equation*} f(\vct{x}) = 100(x_2-x_1^2)^2 + (1-x_1)^2. \end{equation*}

Compute the gradient $\nabla f$ and the Hessian $\nabla^2 f$. Show that $\vct{x}^*=(1,1)^{\trans}$ is the only local minimizer of this function, and that the Hessian at this point is positive definite.

Using Python or another computing system, draw a contour plot of the Rosenbrock function.
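A sketch for the computational part, with the gradient and Hessian entered by hand so you can check your own derivation numerically; the plotting range and log-spaced contour levels are plotting choices, not part of the problem:

```python
import numpy as np
import matplotlib.pyplot as plt

def grad(x1, x2):
    # gradient of f(x) = 100*(x2 - x1^2)^2 + (1 - x1)^2
    return np.array([-400 * x1 * (x2 - x1**2) - 2 * (1 - x1),
                     200 * (x2 - x1**2)])

def hessian(x1, x2):
    return np.array([[1200 * x1**2 - 400 * x2 + 2, -400 * x1],
                     [-400 * x1, 200]])

print(grad(1, 1))                         # [0. 0.]: (1, 1) is a stationary point
print(np.linalg.eigvalsh(hessian(1, 1)))  # both eigenvalues positive

# Contour plot of the Rosenbrock function with the minimizer marked.
x1, x2 = np.meshgrid(np.linspace(-2, 2, 400), np.linspace(-1, 3, 400))
f = 100 * (x2 - x1**2)**2 + (1 - x1)**2
plt.contour(x1, x2, f, levels=np.logspace(-1, 3, 20))
plt.plot(1, 1, "r*")
plt.savefig("rosenbrock_contours.png")
```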