A student wants to buy a snack of brownies and cheesecakes. He/she wants 1) to minimize the total cost, and 2) to satisfy certain dietary requirements.
 | Chocolate | Sugar | Cheese | Cost |
---|---|---|---|---|
Brownie | 3 | 2 | 2 | 50 |
Cheesecake | 0 | 4 | 5 | 80 |
Requirement | 6 | 10 | 8 |  |
Two situations are possible: fractional amounts may be allowed (a linear program), or only whole items may be bought (an integer linear program). Since the variables below are declared integer, we solve the latter.
What should be the optimal buying plan?
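Written out, this is a small integer linear program, with $b$ and $c$ the numbers of brownies and cheesecakes to buy:

$$ \begin{aligned} \min_{b,c \in \mathbb{Z}_{\ge 0}} \;\; & 50b + 80c \\ \text{s.t.} \;\; & 3b \ge 6 && \text{(chocolate)} \\ & 2b + 4c \ge 10 && \text{(sugar)} \\ & 2b + 5c \ge 8 && \text{(cheese)} \end{aligned} $$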
# Add required packages
Pkg.update()
Pkg.add("Optim")
Pkg.add("JuMP")
Pkg.add("Cbc") # for LP and MILP (mixed-integer linear program)
Pkg.add("Ipopt") # for QP and other NLPs
Pkg.add("Clp") # for LP
Pkg.add("SCS") # for conic programs
Pkg.add("Gurobi") # for conic and integer programs (commercial license required)
# Pkg.add("CPLEX") # for conic and integer programs (commercial)
using JuMP
#using Cbc
using Gurobi # for LP and MILP
#snack = Model(solver=CbcSolver())
snack = Model(solver=GurobiSolver())
@variable(snack, b >= 0, Int) # number of brownies
@variable(snack, c >= 0, Int) # number of cheesecakes
@objective(snack, Min, 50b + 80c)
@constraints(snack, begin
3b >= 6
2b + 4c >= 10
2b + 5c >= 8
end)
status = solve(snack)
b = getvalue(b)
c = getvalue(c)
cost = getobjectivevalue(snack)
println("Solver status = $(status)")
println("Brownie $(b), Cheesecake $(c): Total cost $(cost) cents")
Optimize a model with 3 rows, 2 columns and 5 nonzeros
Found heuristic solution: objective 260
Presolved: 1 rows, 2 columns, 2 nonzeros
Root relaxation: objective 2.300000e+02, 1 iterations, 0.00 seconds
Explored 0 nodes (1 simplex iterations) in 0.04 seconds
Optimal solution found (tolerance 1.00e-04)
Best objective 2.300000000000e+02, best bound 2.300000000000e+02, gap 0.0%
Solver status = Optimal
Brownie 3.0, Cheesecake 1.0: Total cost 230.0 cents
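We can check the reported optimum $b = 3$, $c = 1$ by hand: chocolate $3\cdot 3 = 9 \ge 6$, sugar $2\cdot 3 + 4\cdot 1 = 10 \ge 10$, cheese $2\cdot 3 + 5\cdot 1 = 11 \ge 8$, and the total cost is $50\cdot 3 + 80\cdot 1 = 230$ cents, matching the solver output.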
$ \newcommand{\R}{\mathbb{R}} $
Given a data set $\{(x_i, y_i)\}_{i=1}^m$ with features $x_i \in \R^n$ and labels $y_i \in \{-1,+1\}$:
In SVM, we want to find a separating hyperplane of the form $f_{w,b}(x) = \langle x, w \rangle + b = 0$:
- Separating hyperplane: $f_{w,b}(x) = 0$
- Supporting hyperplanes: $f_{w,b}(x) = \pm 1$
Due to the construction of the sep. hyperplane, a point $(x_i,y_i)$ is correctly classified if $$ y_i \cdot f_{w,b}(x_i) \ge 0. $$ We want to find parameters $(w,b)$ satisfying this condition for all $i=1,\dots,m$.
In practice, we try to find a more robust classifier: $$ y_i \cdot f_{w,b}(x_i) \ge 1, \;\; i=1,\dots,m. $$
That is, $$ 1 - y_i f_{w,b}(x_i) \begin{cases} \le 0 & \text{if correctly classified with margin,} \\ > 0 & \text{otherwise.} \end{cases} $$
So we can define a classification error function, $$ \ell(x_i,y_i;w,b) = (1- y_i f_{w,b}(x_i))_+ = \max\{0, 1-y_i f_{w,b}(x_i)\} . $$
This is called the hinge loss function.
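As a quick illustration (toy margin values, not taken from any data set), the hinge loss can be evaluated directly:

```julia
# Hinge loss as a function of the margin value s = y * f_{w,b}(x)
hinge(s) = max(0, 1 - s)

hinge(2.0)    # 0.0: correctly classified with margin
hinge(0.5)    # 0.5: correct side, but inside the margin
hinge(-1.0)   # 2.0: misclassified
```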
Minimizing the hinge loss of all data points and maximizing the margin can be described as:
\begin{equation} \begin{aligned} \min_{w,b} \;\; \frac{1}{m} \sum_{i=1}^m \max\{0, 1 - y_i (\langle w, x_i\rangle + b)\} + \frac{\lambda}{2} \|w\|^2 . \end{aligned} \end{equation}with a balancing parameter $\lambda>0$ to be specified.
- This formulation can be difficult to handle since the first term is not differentiable.
We can rewrite the above formulation equivalently to
\begin{equation} \begin{aligned} \min_{w \in \R^n,b\in \R,\xi \in \R^m} &\;\; \frac{1}{m} \sum_{i=1}^m \xi_i + \frac{\lambda}{2} \|w\|^2 \\ \text{s.t.} &\;\; \xi_i \ge 1 - y_i(\langle w, x_i\rangle + b), \;\; i=1,\dots, m,\\ &\;\; \xi_i \ge 0, \;\; i=1,\dots,m. \end{aligned} \end{equation}
The two formulations agree because at any optimum each slack variable tightens to the hinge loss, $\xi_i = \max\{0, 1 - y_i(\langle w, x_i\rangle + b)\}$: any larger $\xi_i$ could be decreased while staying feasible, reducing the objective.
Pkg.add("RDatasets")
using RDatasets
iris = dataset("datasets", "iris")
 | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
---|---|---|---|---|---|
1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
2 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
5 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
6 | 5.4 | 3.9 | 1.7 | 0.4 | setosa |
7 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
8 | 5.0 | 3.4 | 1.5 | 0.2 | setosa |
9 | 4.4 | 2.9 | 1.4 | 0.2 | setosa |
10 | 4.9 | 3.1 | 1.5 | 0.1 | setosa |
11 | 5.4 | 3.7 | 1.5 | 0.2 | setosa |
12 | 4.8 | 3.4 | 1.6 | 0.2 | setosa |
13 | 4.8 | 3.0 | 1.4 | 0.1 | setosa |
14 | 4.3 | 3.0 | 1.1 | 0.1 | setosa |
15 | 5.8 | 4.0 | 1.2 | 0.2 | setosa |
16 | 5.7 | 4.4 | 1.5 | 0.4 | setosa |
17 | 5.4 | 3.9 | 1.3 | 0.4 | setosa |
18 | 5.1 | 3.5 | 1.4 | 0.3 | setosa |
19 | 5.7 | 3.8 | 1.7 | 0.3 | setosa |
20 | 5.1 | 3.8 | 1.5 | 0.3 | setosa |
21 | 5.4 | 3.4 | 1.7 | 0.2 | setosa |
22 | 5.1 | 3.7 | 1.5 | 0.4 | setosa |
23 | 4.6 | 3.6 | 1.0 | 0.2 | setosa |
24 | 5.1 | 3.3 | 1.7 | 0.5 | setosa |
25 | 4.8 | 3.4 | 1.9 | 0.2 | setosa |
26 | 5.0 | 3.0 | 1.6 | 0.2 | setosa |
27 | 5.0 | 3.4 | 1.6 | 0.4 | setosa |
28 | 5.2 | 3.5 | 1.5 | 0.2 | setosa |
29 | 5.2 | 3.4 | 1.4 | 0.2 | setosa |
30 | 4.7 | 3.2 | 1.6 | 0.2 | setosa |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
@show size(iris)
unique(iris[:,end])
size(iris) = (150,5)
3-element DataArrays.DataArray{String,1}: "setosa" "versicolor" "virginica"
class = iris[:Species]
species = unique(class)
show(species)
color = ["red" for i=1:size(iris,1)]
color[class.==species[2]] = "blue"
color[class.==species[3]] = "yellow";
String["setosa","versicolor","virginica"]
using PyPlot
fig = figure("iris_scatterplot",figsize=(15,5))
ax = subplot(121)
for i=1:length(species)
idx = class.==species[i];
scatter(iris[idx,:SepalLength], iris[idx,:SepalWidth],color=color[idx],s=30,alpha=0.5,label=species[i])
end
title("IRIS Data")
xlabel("SepalLength")
ylabel("SepalWidth")
grid("on")
legend(loc=4, fontsize="small")
ax = subplot(122)
for i=1:length(species)
idx = class.==species[i];
scatter(iris[idx,:PetalLength], iris[idx,:PetalWidth],color=color[idx],s=30,alpha=0.5,label=species[i])
end
title("IRIS Data")
xlabel("PetalLength")
ylabel("PetalWidth")
grid("on")
legend(loc=4, fontsize="small")
PyObject <matplotlib.legend.Legend object at 0x31ba58fd0>
x = Array(iris[:,1:(end-1)])
m, p = size(x)
y = -ones(m)
@printf "Making binary classification problem: '%s (+1)' vs. the rest (-1)" species[3]
y[class.==species[3]] = 1
C = 10/m;
Making binary classification problem: 'virginica (+1)' vs. the rest (-1)
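A note on the parameterization: the model below minimizes $\frac{1}{2}\|w\|^2 + C \sum_{i=1}^m \xi_i$, which is the earlier objective multiplied by $1/\lambda$, so $C = \frac{1}{\lambda m}$ and the choice `C = 10/m` corresponds to $\lambda = 1/10$.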
using JuMP
using Ipopt
# using Gurobi
svm = Model(solver=IpoptSolver()) #print_level=0
# svm = Model(solver=GurobiSolver()) #Method=1 (Simplex), Method=2 (Barrier)
@variables(svm, begin
w[i=1:p]
b
ξ[i=1:m] >= 0
end)
@objective(svm, Min, .5*dot(w,w) + C*sum{ ξ[i], i=1:m})
@constraints(svm, begin
err[i=1:m], ξ[i] >= 1 - y[i]*( dot(w, x[i,:]) + b )
end)
status = solve(svm)
w = getvalue(w)
b = getvalue(b);
println("Solver status = $(status)")
This is Ipopt version 3.12.4, running with linear solver mumps.
Total number of variables: 155 (150 with only lower bounds)
Total number of inequality constraints: 150
Number of Iterations....: 22
Objective...............: 2.7834017157680875e+00
EXIT: Optimal Solution Found.
Solver status = Optimal
pred = [Float64((dot(x[i,:],w) + b)[1]) for i=1:m]
correct = y.*pred
@printf "Prediction accuracy on the training set = %.2f %%" 100 * sum(correct .> 0) / m
Prediction accuracy on the training set = 96.00 %
using JuMP
using Ipopt
nlp = Model(solver=IpoptSolver())
@variable(nlp, x>=0)
@NLobjective(nlp, Min, cos(x))
@NLconstraint(nlp, x^2 + sin(x) >= 2*pi^2)
status = solve(nlp)
x = getvalue(x)
println("Solver status = $(status)")
@show x
This is Ipopt version 3.12.4, running with linear solver mumps.
Number of Iterations....: 7
Objective...............: -1.0000000000000000e+00
EXIT: Optimal Solution Found.
Solver status = Optimal
x = 3.141592654387532
3.141592654387532