#!/usr/bin/env python # coding: utf-8 # # *** # # Lecture 2 - Minima of Convex Functions # *** # $\newcommand{\vct}[1]{\mathbf{#1}}$ # $\newcommand{\mtx}[1]{\mathbf{#1}}$ # $\newcommand{\e}{\varepsilon}$ # $\newcommand{\norm}[1]{\|#1\|}$ # $\newcommand{\minimize}{\text{minimize}\quad}$ # $\newcommand{\maximize}{\text{maximize}\quad}$ # $\newcommand{\subjto}{\quad\text{subject to}\quad}$ # $\newcommand{\R}{\mathbb{R}}$ # $\newcommand{\trans}{T}$ # $\newcommand{\ip}[2]{\langle {#1}, {#2} \rangle}$ # $\newcommand{\zerovct}{\vct{0}}$ # In this lecture we will study the unconstrained problem # # [1] # \begin{equation*} # \minimize\ f(\vct{x}), # \end{equation*} # # where $\vct{x}\in \R^n$. **Optimality conditions** aim to identify properties that potential minimizers need to satisfy in relation to $f(\vct{x})$. We will review the well known local optimality conditions for differentiable functions from calculus. We then introduce convex functions and discuss some of their properties. # --- # ## Unconstrained optimization # --- # Solutions to [1] come in different flavours, as in the following definition. # # **Definition.** A point $\vct{x}^*\in \R^n$ is a # # * **global minimizer** of [1] if for all $\vct{x}\in U$, $f(\vct{x}^*)\leq f(\vct{x})$; # * a **local minimizer**, if there is an open neighbourhood $U$ of $\vct{x}$ such that $f(\vct{x}^*)\leq f(\vct{x})$ for all $\vct{x}\in U$; # * a **strict local minimizer**, if there is an open neighbourhood $U$ of $\vct{x}$ such that $f(\vct{x}^*)0$ guarantee that we have a local minimizer. These conditions generalise to higher dimension, but first we need to know what $f''(x)>0$ means when we have more than one variable. # # Recall that a matrix $\mtx{A}$ is **positive semidefinite**, written $\mtx{A}\succeq \zerovct$, if for every $\vct{x}\in \R^n$, $\vct{x}^{\top}\mtx{A}\vct{x}\geq 0$, and **positive definite**, written $\mtx{A}\succ \zerovct$, if $\vct{x}^{\top}\mtx{A}\vct{x}>0$. The property that the Hessian matrix is positive semidefinite is a multivariate generalization of the property that the second derivative is nonnegative. The known conditions for a minimizer involving the second derivative generalise accordingly. # # **Theorem.** Let $f\in C^2(U)$ for some open set $U$ and $\vct{x}^*\in U$. # If $\vct{x}^*$ is a local minimizer, then $\nabla f(\vct{x}^*)=0$ and $\nabla^2f(\vct{x}^*)$ is positive semidefinite. Conversely, if $\nabla f(\vct{x}^*)=\zerovct$ and $\nabla^2f(\vct{x}^*)$ is positive definite, then $\vct{x}^*$ is a strict local minimizer. # # Unfortunately, the above criteria are not able to identify global minimizers, as differentiability is a local property. If, however, the function is **convex**, then we can say a lot more! # --- # ## Convex sets and functions # --- # We now come to the central notion of this course. # # **Definition.** A set $C\subseteq \R^n$ is **convex** if for all $\vct{x},\vct{y}\in C$ and $\lambda \in [0,1]$, the line $\lambda \vct{x}+(1-\lambda)\vct{y}\in C$. A **convex body** is a convex set that is closed and bounded. # # **Definition.** # Let $S\subseteq \R^n$. A function $f\colon S\to \R$ is called **convex** if $S$ is convex and for all $\vct{x},\vct{y}\in \R^n$ and $\lambda\in [0,1]$, # # \begin{equation*} # f(\lambda \vct{x}+(1-\lambda)\vct{y})\leq \lambda f(\vct{x})+(1-\lambda)f(\vct{y}). # \end{equation*} # # The function $f$ is called **strictly convex** if # # \begin{equation*} # f(\lambda \vct{x}+(1-\lambda)\vct{y})< \lambda f(\vct{x})+(1-\lambda)f(\vct{y}). # \end{equation*} # # A function $f$ is called **concave**, if $-f$ is convex. # # The next figure illustrates how a convex set in $\R^2$ and a function of one variable looks like. The graph of the function lies below any line connecting two points on it. # # ![Convex function](images/convset.png) # # Convex function have pleasant properties, while at the same time covering many of the functions that arise in applications. Perhaps the most important property is that local minima are global minima. # # **Theorem.** Let $f\colon \R^n\to \R$ be a convex function. Then any local minimizer of $f$ is a global minimizer. # # **Proof.** Let $\vct{x}^*$ be a local minimizer and assume that it is not a global minimizer. Then there exists a vector $\vct{y}\in \R^n$ such that $f(\vct{y})