This notebook arose in response to a question on StackOverflow about how to optimize a function with probability simplex constraints in Python (see http://stackoverflow.com/questions/32252853/optimization-with-python-scipy-optimize). This is a topic I've thought about a lot for our paper on optimal immune repertoires, so I was interested to see what other people had to say about it.
For a given $\boldsymbol y$ and $\gamma$, find the $\boldsymbol x^\star$ that maximizes the following expression over the probability simplex:
$$\max_{x_i \geq 0, \, \sum_i x_i = 1} \left[\sum_i \left(\frac{x_i}{y_i}\right)^\gamma\right]^{1/\gamma}$$

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.optimize import minimize
def objective_function(x, y, gamma=0.2):
    # scipy.optimize.minimize minimizes, so return the negative of the objective
    return -((x/y)**gamma).sum()**(1.0/gamma)
# equality constraint enforcing that the components of x sum to one
cons = ({'type': 'eq', 'fun': lambda x: np.array([sum(x) - 1])})
y = np.array([0.5, 0.3, 0.2])
initial_x = np.array([0.2, 0.3, 0.5])
opt = minimize(objective_function, initial_x, args=(y,), method='SLSQP',
constraints=cons, bounds=[(0, 1)] * len(initial_x))
opt
/usr/local/lib/python2.7/dist-packages/ipykernel/__main__.py:2: RuntimeWarning: invalid value encountered in power
  status: 0
 success: True
    njev: 7
    nfev: 38
     fun: -265.27701765861417
       x: array([ 0.29466743,  0.33480631,  0.37052625])
 message: 'Optimization terminated successfully.'
     jac: array([-265.27723694, -265.27757263, -265.27632904,    0.        ])
     nit: 7
Works on my machine (the poster on StackOverflow reported issues with this) and actually requires a surprisingly small number of function evaluations.
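The RuntimeWarning presumably comes from SLSQP evaluating the objective at slightly negative components of $\boldsymbol x$, where the fractional power is undefined. A minimal guard (my own addition, not part of the original code) would be to clip before evaluating:

def objective_function_safe(x, y, gamma=0.2):
    # clip to the feasible region so that fractional powers of slightly
    # negative intermediate iterates do not produce NaNs
    x = np.clip(x, 0.0, None)
    return -((x/y)**gamma).sum()**(1.0/gamma)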
def trans_x(x):
    # map each unconstrained variable into (0, 1)
    x1 = x**2/(1.0+x**2)
    # append the remainder so that the result sums to one
    z = np.hstack((x1, 1-sum(x1)))
    return z
def F(x, y, gamma=0.2):
    # evaluate the objective on the transformed variables
    z = trans_x(x)
    return -(((z/y)**gamma).sum())**(1./gamma)
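The transformation maps two unconstrained variables into a 3-vector that sums to one by construction; a quick illustration (my own example values, not from the original run):

trans_x(np.array([1.0, 1.0]))  # -> array([ 0.5,  0.5,  0. ])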
opt = minimize(F, np.array([1., 1.]), args=(np.array(y),),
method='Nelder-Mead')
trans_x(opt.x), opt
/usr/local/lib/python2.7/dist-packages/ipykernel/__main__.py:8: RuntimeWarning: invalid value encountered in power
(array([ 0.29467183,  0.33481782,  0.37051034]),
   status: 0
     nfev: 73
  success: True
      fun: -265.27701755536299
        x: array([ 0.64635885,  0.70946991])
  message: 'Optimization terminated successfully.'
      nit: 39)
Works, but needs a slightly higher number of function evaluations to converge.
opt = minimize(F, np.array([0., 1.]), args=(np.array([0.2, 0.1, 0.8]), 2.0),
method='Nelder-Mead')
trans_x(opt.x), opt
(array([ 1.,  1., -1.]),
   status: 0
     nfev: 299
  success: True
      fun: -11.249999999999998
        x: array([ -3.10743912e+07,   3.96029532e+10])
  message: 'Optimization terminated successfully.'
      nit: 103)
In general, though, this method can fail: the transformation enforces non-negativity of the first two components only, so nothing keeps the third component, defined as one minus their sum, from going negative. Indeed, the "optimum" found here, $11.25$ at $\boldsymbol x = (1, 1, -1)$, is infeasible and exceeds the true constrained maximum of $10$, attained at $\boldsymbol x = (0, 1, 0)$.
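A reparametrization that would enforce all the constraints is a softmax; the following is a sketch of my own, not something from the original post:

def softmax_x(u):
    # softmax of (u1, u2, 0): strictly positive components summing to one;
    # the appended 0 removes the softmax's invariance to constant shifts
    v = np.hstack((u, 0.0))
    e = np.exp(v - v.max())  # subtract the max for numerical stability
    return e / e.sum()

Note, however, that for $\gamma > 1$ the objective is convex, so its maximum over the simplex sits at a vertex ($\boldsymbol x = (0, 1, 0)$ in the example above), which an interior parametrization like this can only approach asymptotically.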
It turns out the problem is solvable analytically. Since $t \mapsto t^{1/\gamma}$ is monotonically increasing for $\gamma > 0$, it suffices to maximize the inner sum. One can then write down the Lagrangian of the (equality constrained) optimization problem, ignoring the non-negativity constraints, which turn out to be inactive at the optimum:

$$L = \sum_i (x_i/y_i)^\gamma - \lambda \left(\sum_i x_i - 1\right)$$

The optimal solution is found by setting the first derivative of this Lagrangian to zero:

$$0 = \partial L / \partial x_i = \gamma x_i^{\gamma-1} y_i^{-\gamma} - \lambda \quad \Rightarrow \quad x_i \propto y_i^{\gamma/(\gamma - 1)}$$

Using this insight, the optimization problem can be solved simply and efficiently:
def analytical(y, gamma=0.2):
    # unnormalized solution x_i proportional to y_i**(gamma/(gamma-1))
    x = y**(gamma/(gamma-1.0))
    # normalize onto the simplex
    x /= np.sum(x)
    return x
xanalytical = analytical(np.array(y))
xanalytical, objective_function(xanalytical, np.array(y))
(array([ 0.29466774, 0.33480719, 0.37052507]), -265.27701765929692)
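As a quick consistency check (the hard-coded vector below is simply the SLSQP solution printed further above):

x_slsqp = np.array([0.29466743, 0.33480631, 0.37052625])
np.allclose(xanalytical, x_slsqp, atol=1e-4)  # -> True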
This problem can also be solved using a projected gradient algorithm, but this will be for another time.
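For the curious, the key ingredient of such an algorithm is the Euclidean projection onto the probability simplex; here is a minimal sketch of my own, assuming the standard sort-based algorithm (see e.g. Duchi et al., ICML 2008):

def project_simplex(v):
    # Euclidean projection of v onto {x : x_i >= 0, sum_i x_i = 1}
    u = np.sort(v)[::-1]                  # sort in decreasing order
    css = np.cumsum(u)
    # largest index rho with u[rho] * (rho + 1) > css[rho] - 1
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

A projected gradient step is then simply x = project_simplex(x - stepsize * gradient).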