###### The latest version of this IPython notebook is available at http://github.com/jckantor/ESTM60203 for noncommercial use under terms of the Creative Commons Attribution Noncommericial ShareAlike License (CC BY-NC-SA 4.0).¶

J.C. Kantor ([email protected])

# Optimal Mixture and Blending Problems¶

This IPython notebook demonstrates the development of models to solve optimal blending and mixture problems using GLPK/MathProg.

### Initializations¶

In [1]:
from IPython.core.display import HTML

Out[1]:

## Background¶

Mixture and blending problems are among the most commonly encountered optimization problems in real world applications. The basic problem is to determine a blend (mixture) of components maximizing subject to constraints on the blend. Examples include diet, product blending, financial portfolios, and many, many others.

This notebook uses a the diet problem from Chapter 2 of the AMPL book to demonstrate the formulation and solution of blending problems in MathProg. The problem is solved several times in order to demonstrate features of MathProg useful in constructing models that can be reused for other data sets, and extended to much larger applications.

## Diet Problem¶

We'll begin by reviewing data for The data is organized as three pandas DataFrames (some information on DataFrames is here and here). For larger scale problems, this data would be extracted from spreadsheets or databases.

### Weekly Nutrition Requirements¶

The first data set contains lower and upper bounds on weekly nutrition expressed a percentage of the recommended minimum daily requirements.

In [1]:
import pandas as pd

req = {
'A' : [700, 10000],
'C' : [700, 10000],
'B1': [700, 10000],
'B2': [700, 10000]}

display(pd.DataFrame(req.values(), index = req.keys(), columns = ['n_min', 'n_max']))

n_min n_max
A 700 10000
C 700 10000
B1 700 10000
B2 700 10000

4 rows × 2 columns

### Nutrition Data¶

The second data set includes 'label data' from some typical package goods.

In [14]:
nutr = {
'BEF' : {'A': 60, 'C': 20, 'B1': 10, 'B2': 15},
'CHK' : {'A':  8, 'C':  0, 'B1': 20, 'B2': 20},
'FSH' : {'A':  8, 'C': 10, 'B1': 15, 'B2': 10},
'HAM' : {'A': 40, 'C': 40, 'B1': 35, 'B2': 10},
'MCH' : {'A': 15, 'C': 35, 'B1': 15, 'B2': 15},
'MTL' : {'A': 70, 'C': 30, 'B1': 15, 'B2': 15},
'SPG' : {'A': 25, 'C': 50, 'B1': 25, 'B2': 15},
'TUR' : {'A': 60, 'C': 20, 'B1': 15, 'B2': 10}}

display(pd.DataFrame.from_dict(nutr).transpose())

A B1 B2 C
BEF 60 10 15 20
CHK 8 20 20 0
FSH 8 15 10 10
HAM 40 35 10 40
MCH 15 15 15 35
MTL 70 15 15 30
SPG 25 25 15 50
TUR 60 15 10 20

8 rows × 4 columns

### Price Data¶

The third data set provides price information for the packaged foods.

In [15]:
price ={'BEF' : ['Beef', 3.19],
'CHK' : ['Chicken', 2.59],
'FSH' : ['Fish', 2.29],
'HAM' : ['Ham', 2.89],
'MCH' : ['Macaroni & Cheese', 1.89],
'MTL' : ['Meat Loaf', 1.99],
'SPG' : ['Spaghetti', 1.99],
'TUR' : ['Turkey', 2.49]}

display(pd.DataFrame(price.values(), index = price.keys(), columns = ['food','price']))

food price
HAM Ham 2.89
MTL Meat Loaf 1.99
MCH Macaroni & Cheese 1.89
FSH Fish 2.29
TUR Turkey 2.49
CHK Chicken 2.59
BEF Beef 3.19
SPG Spaghetti 1.99

8 rows × 2 columns

### Problem Statement¶

Determine a shopping list that meets the nutrition requirements for minimum cost.

## A Basic Model¶

This first attempt at a model for the diet problem will encode the basic elements of the problem. We'll include a decision variable for each type of packaged good, and explicitly write out the objective function, and each of the nutrition requirements.

The following cell uses the script cell magic to run the MathProg model with the glpsol command line. This requires that you install the glpk package on the machine used to execute this notebook.

In [16]:
%%script glpsol -m /dev/stdin -o /dev/stdout

var xBEF >= 0;
var xCHK >= 0;
var xFSH >= 0;
var xHAM >= 0;
var xMCH >= 0;
var xMTL >= 0;
var xSPG >= 0;
var xTUR >= 0;

minimize Total_Cost:
3.19*xBEF + 2.59*xCHK + 2.29*xFSH + 2.89*xHAM
+ 1.89*xMCH + 1.99*xMTL + 1.99*xSPG + 2.49*xTUR;

subject to A:
700 <= 60*xBEF +  8*xCHK +  8*xFSH + 40*xHAM
+ 15*xMCH + 70*xMTL + 25*xSPG + 60*xTUR <= 10000;

subject to C:
700 <= 20*xBEF +  0*xCHK + 10*xFSH + 40*xHAM
+ 35*xMCH + 30*xMTL + 50*xSPG + 20*xTUR <= 10000;

subject to B1:
700 <= 10*xBEF + 20*xCHK + 15*xFSH + 35*xHAM
+ 15*xMCH + 15*xMTL + 25*xSPG + 15*xTUR <= 10000;

subject to B2:
700 <= 15*xBEF + 20*xCHK + 10*xFSH + 10*xHAM
+ 15*xMCH + 15*xMTL + 15*xSPG + 10*xTUR <= 10000;

end;

GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
-m /dev/stdin -o /dev/stdout
/dev/stdin:31: warning: final NL missing before end of file
Generating Total_Cost...
Generating A...
Generating C...
Generating B1...
Generating B2...
Model has been successfully generated
GLPK Simplex Optimizer, v4.52
5 rows, 8 columns, 39 non-zeros
Preprocessing...
4 rows, 8 columns, 31 non-zeros
Scaling...
A: min|aij| =  8.000e+00  max|aij| =  7.000e+01  ratio =  8.750e+00
GM: min|aij| =  5.017e-01  max|aij| =  1.993e+00  ratio =  3.972e+00
EQ: min|aij| =  2.549e-01  max|aij| =  1.000e+00  ratio =  3.922e+00
Constructing initial basis...
Size of triangular part is 4
0: obj =   0.000000000e+00  infeas =  2.845e+03 (0)
*     4: obj =   1.185625000e+02  infeas =  0.000e+00 (0)
*     7: obj =   8.820000000e+01  infeas =  0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
Time used:   0.0 secs
Memory used: 0.1 Mb (110559 bytes)
Writing basic solution to /dev/stdout'...
Problem:    stdin
Rows:       5
Columns:    8
Non-zeros:  39
Status:     OPTIMAL
Objective:  Total_Cost = 88.2 (MINimum)

No.   Row name   St   Activity     Lower bound   Upper bound    Marginal
------ ------------ -- ------------- ------------- ------------- -------------
1 Total_Cost   B           88.2
2 A            NL           700           700         10000    0.00181818
3 C            B        1633.33           700         10000
4 B1           B            700           700         10000
5 B2           NL           700           700         10000      0.124182

No. Column name  St   Activity     Lower bound   Upper bound    Marginal
------ ------------ -- ------------- ------------- ------------- -------------
1 xBEF         NL             0             0                     1.21818
2 xCHK         NL             0             0                   0.0918182
3 xFSH         NL             0             0                     1.03364
4 xHAM         NL             0             0                     1.57545
5 xMCH         B        46.6667             0
6 xMTL         B              0             0
7 xSPG         NL             0             0                   0.0818182
8 xTUR         NL             0             0                     1.13909

Karush-Kuhn-Tucker optimality conditions:

KKT.PE: max.abs.err = 1.14e-13 on row 2
max.rel.err = 8.11e-17 on row 2
High quality

KKT.PB: max.abs.err = 1.14e-13 on row 4
max.rel.err = 1.62e-16 on row 4
High quality

KKT.DE: max.abs.err = 2.22e-16 on column 1
max.rel.err = 6.00e-17 on column 8
High quality

KKT.DB: max.abs.err = 0.00e+00 on row 0
max.rel.err = 0.00e+00 on row 0
High quality

End of output


The previous cell used a 'cell magic' to run the model, with the results placed into a file output.txt. This next cell uses standard python commands to read and display the output file.

### Questions¶

• What is the total cost of What are we eating, and how much does it cost?
• Is this a realistic solution? How would like to modify the problem?
• What do the shadow costs (also called marginal costs, or sensitivities) mean?
• Can we apply this same model to other data sets?

## Extending the Model¶

### Introducing MathProg Sets¶

In this second version we introduce a set to represent all types of packaged foods. The use of a set allows us to declare all of the decision variables in one line. The decision variables now have subscripts associated with members from the set of foods.

In [17]:
%%script glpsol -m /dev/stdin -o /dev/stdout --out output

set FOODS := {'BEF','CHK','FSH','HAM','MCH','MTL','SPG','TUR'};

var x{FOODS} >= 0;

minimize Total_Cost:
3.19*x['BEF'] + 2.59*x['CHK'] + 2.29*x['FSH'] + 2.89*x['HAM']
+ 1.89*x['MCH'] + 1.99*x['MTL'] + 1.99*x['SPG'] + 2.49*x['TUR'];

subject to A:
700 <= 60*x['BEF'] +  8*x['CHK'] +  8*x['FSH'] + 40*x['HAM']
+ 15*x['MCH'] + 70*x['MTL'] + 25*x['SPG'] + 60*x['TUR'] <= 10000;

subject to C:
700 <= 20*x['BEF'] +  0*x['CHK'] + 10*x['FSH'] + 40*x['HAM']
+ 35*x['MCH'] + 30*x['MTL'] + 50*x['SPG'] + 20*x['TUR'] <= 10000;

subject to B1:
700 <= 10*x['BEF'] + 20*x['CHK'] + 15*x['FSH'] + 35*x['HAM']
+ 15*x['MCH'] + 15*x['MTL'] + 25*x['SPG'] + 15*x['TUR'] <= 10000;

subject to B2:
700 <= 15*x['BEF'] + 20*x['CHK'] + 10*x['FSH'] + 10*x['HAM']
+ 15*x['MCH'] + 15*x['MTL'] + 15*x['SPG'] + 10*x['TUR'] <= 10000;

end;


Next we introduce a set for the nutrients. Much of the problem data can be indexed by memebers of the set of nutrients, and this data is defined in the data section.

In [18]:
%%script glpsol -m /dev/stdin -o /dev/stdout --out output

set FOODS;
set NUTRS;

param price{FOODS} >= 0;
param n_min{NUTRS} >= 0;
param n_max{NUTRS} >= 0;

var x{FOODS} >= 0;

minimize Total_Cost: sum {f in FOODS} price[f]*x[f];

subject to A:
n_min['A'] <= 60*x['BEF'] +  8*x['CHK'] +  8*x['FSH'] + 40*x['HAM']
+ 15*x['MCH'] + 70*x['MTL'] + 25*x['SPG'] + 60*x['TUR'] <= n_max['A'];

subject to C:
n_min['C'] <= 20*x['BEF'] +  0*x['CHK'] + 10*x['FSH'] + 40*x['HAM']
+ 35*x['MCH'] + 30*x['MTL'] + 50*x['SPG'] + 20*x['TUR'] <= n_max['C'];

subject to B1:
n_min['B1'] <= 10*x['BEF'] + 20*x['CHK'] + 15*x['FSH'] + 35*x['HAM']
+ 15*x['MCH'] + 15*x['MTL'] + 25*x['SPG'] + 15*x['TUR'] <= n_max['B1'];

subject to B2:
n_min['B2'] <= 15*x['BEF'] + 20*x['CHK'] + 10*x['FSH'] + 10*x['HAM']
+ 15*x['MCH'] + 15*x['MTL'] + 15*x['SPG'] + 10*x['TUR'] <= n_max['B2'];

data;

param : FOODS : price :=
BEF  3.19
CHK  2.59
FSH  2.29
HAM  2.89
MCH  1.89
MTL  1.99
SPG  1.99
TUR  2.49;

param : NUTRS : n_min n_max :=
A    700   10000
C    700   10000
B1   700   10000
B2   700   10000;

end;


### Factoring the Model and Data¶

The factoring of the problem into separate model and data sections is completed by introducing a two dimensional data set for the nutrient content of the packaged foods. The parameter data is placed in the data section, and the model constraints written in terms of the indexed parameters.

In [19]:
%%script glpsol -m /dev/stdin -o /dev/stdout --out output

set FOODS;
set NUTRS;

param price{FOODS} >= 0;
param n_min{NUTRS} >= 0;
param n_max{NUTRS} >= 0;
param a{FOODS,NUTRS} >= 0;

var x{FOODS} >= 0;

minimize Total_Cost: sum {f in FOODS} price[f]*x[f];

subject to n_req {n in NUTRS}:
n_min[n] <= sum {f in FOODS} a[f,n] * x[f] <= n_max[n];

data;

param : FOODS : price :=
BEF  3.19
CHK  2.59
FSH  2.29
HAM  2.89
MCH  1.89
MTL  1.99
SPG  1.99
TUR  2.49;

param : NUTRS : n_min n_max :=
A    700   10000
C    700   10000
B1   700   10000
B2   700   10000;

param a :  A   C  B1  B2 :=
BEF   60  20  10  15
CHK    8   0  20  20
FSH    8  10  15  10
HAM   40  40  35  10
MCH   15  35  15  15
MTL   70  30  15  15
SPG   25  50  25  15
TUR   60  20  15  10;

end;


### Post-Processing Results¶

The output from glpk/MathProg can be enchanced by adding printf statements for selected data, or table statements for access to structured data indexed by sets.

In [20]:
%%script glpsol -m /dev/stdin -o output.txt -y display.txt --out output

set FOODS;
set NUTRS;

param price{FOODS} >= 0;
param x_min{FOODS} >= 0;
param x_max{FOODS} >= 0;

param n_min{NUTRS} >= 0;
param n_max{NUTRS} >= 0;

param a{FOODS,NUTRS} >= 0;

var x{f in FOODS} >= 0;

minimize Total_Cost: sum {f in FOODS} price[f]*x[f];

subject to n_req {n in NUTRS}:
n_min[n] <= sum {f in FOODS} a[f,n] * x[f] <= n_max[n];

solve;

printf "\nMinimum Cost Diet = %6.2f $/week.\n\n", Total_Cost; table results {f in FOODS} OUT "CSV" "output.csv" "Table": f ~ Food, x[f] ~ Quantity, price[f] ~ Price; data; param : FOODS : price := BEF 3.19 CHK 2.59 FSH 2.29 HAM 2.89 MCH 1.89 MTL 1.99 SPG 1.99 TUR 2.49; param : NUTRS : n_min n_max := A 700 10000 C 700 10000 B1 700 10000 B2 700 10000; param a : A C B1 B2 := BEF 60 20 10 15 CHK 8 0 20 20 FSH 8 10 15 10 HAM 40 40 35 10 MCH 15 35 15 15 MTL 70 30 15 15 SPG 25 50 25 15 TUR 60 20 15 10; end;  In [21]: print output  GLPSOL: GLPK LP/MIP Solver, v4.52 Parameter(s) specified in the command line: -m /dev/stdin -o output.txt -y display.txt Reading model section from /dev/stdin... Reading data section from /dev/stdin... /dev/stdin:58: warning: final NL missing before end of file 58 lines were read Generating Total_Cost... Generating n_req... Model has been successfully generated GLPK Simplex Optimizer, v4.52 5 rows, 8 columns, 39 non-zeros Preprocessing... 4 rows, 8 columns, 31 non-zeros Scaling... A: min|aij| = 8.000e+00 max|aij| = 7.000e+01 ratio = 8.750e+00 GM: min|aij| = 5.017e-01 max|aij| = 1.993e+00 ratio = 3.972e+00 EQ: min|aij| = 2.549e-01 max|aij| = 1.000e+00 ratio = 3.922e+00 Constructing initial basis... Size of triangular part is 4 0: obj = 0.000000000e+00 infeas = 2.845e+03 (0) * 4: obj = 1.185625000e+02 infeas = 0.000e+00 (0) * 7: obj = 8.820000000e+01 infeas = 0.000e+00 (0) OPTIMAL LP SOLUTION FOUND Time used: 0.0 secs Memory used: 0.1 Mb (135090 bytes) Writing results... Model has been successfully processed Writing basic solution to output.txt'...  In [22]: print(open('output.txt').read())  Problem: stdin Rows: 5 Columns: 8 Non-zeros: 39 Status: OPTIMAL Objective: Total_Cost = 88.2 (MINimum) No. Row name St Activity Lower bound Upper bound Marginal ------ ------------ -- ------------- ------------- ------------- ------------- 1 Total_Cost B 88.2 2 n_req[A] NL 700 700 10000 0.00181818 3 n_req[C] B 1633.33 700 10000 4 n_req[B1] B 700 700 10000 5 n_req[B2] NL 700 700 10000 0.124182 No. Column name St Activity Lower bound Upper bound Marginal ------ ------------ -- ------------- ------------- ------------- ------------- 1 x[BEF] NL 0 0 1.21818 2 x[CHK] NL 0 0 0.0918182 3 x[FSH] NL 0 0 1.03364 4 x[HAM] NL 0 0 1.57545 5 x[MCH] B 46.6667 0 6 x[MTL] B 0 0 7 x[SPG] NL 0 0 0.0818182 8 x[TUR] NL 0 0 1.13909 Karush-Kuhn-Tucker optimality conditions: KKT.PE: max.abs.err = 1.14e-13 on row 2 max.rel.err = 8.11e-17 on row 2 High quality KKT.PB: max.abs.err = 1.14e-13 on row 4 max.rel.err = 1.62e-16 on row 4 High quality KKT.DE: max.abs.err = 2.22e-16 on column 1 max.rel.err = 6.00e-17 on column 8 High quality KKT.DB: max.abs.err = 0.00e+00 on row 0 max.rel.err = 0.00e+00 on row 0 High quality End of output  In [23]: print(open('display.txt').read())  Minimum Cost Diet = 88.20$/week.


In [24]:
import pandas
display(df)
df['Quantity'].plot(kind='bar').set_xticklabels(df['Food']);

Food Quantity Price
0 BEF 0.000000 3.19
1 CHK 0.000000 2.59
2 FSH 0.000000 2.29
3 HAM 0.000000 2.89
4 MCH 46.666667 1.89
5 MTL 0.000000 1.99
6 SPG 0.000000 1.99
7 TUR 0.000000 2.49

8 rows × 3 columns

• Force diversification by imposing bounds on the minimum and maximum number of packages.
• Constraining the solution to integer values
In [25]:
%%script glpsol -m /dev/stdin -o display.txt --out output

set FOODS;
set NUTRS;

param price{FOODS} >= 0;
param x_min{FOODS} >= 0;
param x_max{FOODS} >= 0;

param n_min{NUTRS} >= 0;
param n_max{NUTRS} >= 0;

param a{FOODS,NUTRS} >= 0;

var x{f in FOODS} integer >= x_min[f], <= x_max[f];

minimize Total_Cost: sum {f in FOODS} price[f]*x[f];

subject to n_req {n in NUTRS}:
n_min[n] <= sum {f in FOODS} a[f,n] * x[f] <= n_max[n];

solve;

printf "\nMinimum Cost Diet = %6.2f $/week.\n\n", Total_Cost; table results {f in FOODS} OUT "CSV" "output.csv" "Table": f ~ Food, x[f] ~ Quantity, price[f] ~ Price; data; param : FOODS : price x_min x_max := BEF 3.19 2 30 CHK 2.59 2 30 FSH 2.29 2 30 HAM 2.89 2 30 MCH 1.89 2 30 MTL 1.99 2 30 SPG 1.99 2 30 TUR 2.49 2 30; param : NUTRS : n_min n_max := A 700 10000 C 700 10000 B1 700 10000 B2 700 10000 NA 0 40000 CAL 16000 24000; param a : A C B1 B2 NA CAL := BEF 60 20 10 15 938 295 CHK 8 0 20 20 2180 770 FSH 8 10 15 10 945 440 HAM 40 40 35 10 278 430 MCH 15 35 15 15 1182 315 MTL 70 30 15 15 896 400 SPG 25 50 25 15 1329 370 TUR 60 20 15 10 1397 450; end;  In [26]: print output  GLPSOL: GLPK LP/MIP Solver, v4.52 Parameter(s) specified in the command line: -m /dev/stdin -o display.txt Reading model section from /dev/stdin... Reading data section from /dev/stdin... /dev/stdin:60: warning: final NL missing before end of file 60 lines were read Generating Total_Cost... Generating n_req... Model has been successfully generated GLPK Integer Optimizer, v4.52 7 rows, 8 columns, 55 non-zeros 8 integer variables, none of which are binary Preprocessing... 6 rows, 8 columns, 47 non-zeros 8 integer variables, none of which are binary Scaling... A: min|aij| = 8.000e+00 max|aij| = 2.180e+03 ratio = 2.725e+02 GM: min|aij| = 3.996e-01 max|aij| = 2.502e+00 ratio = 6.262e+00 EQ: min|aij| = 1.597e-01 max|aij| = 1.000e+00 ratio = 6.262e+00 2N: min|aij| = 1.172e-01 max|aij| = 1.250e+00 ratio = 1.067e+01 Constructing initial basis... Size of triangular part is 6 Solving LP relaxation... GLPK Simplex Optimizer, v4.52 6 rows, 8 columns, 47 non-zeros 0: obj = 3.864000000e+01 infeas = 5.163e+01 (0) * 10: obj = 1.628185401e+02 infeas = 0.000e+00 (0) * 13: obj = 1.364032777e+02 infeas = 0.000e+00 (0) OPTIMAL LP SOLUTION FOUND Integer optimization begins... + 13: mip = not found yet >= -inf (1; 0) Solution found by heuristic: 138.94 + 36: mip = 1.389400000e+02 >= tree is empty 0.0% (0; 15) INTEGER OPTIMAL SOLUTION FOUND Time used: 0.0 secs Memory used: 0.1 Mb (148951 bytes) Minimum Cost Diet = 138.94$/week.

Writing results...
Model has been successfully processed
Writing MIP solution to display.txt'...


In [27]:
print(open('display.txt').read())

Problem:    stdin
Rows:       7
Columns:    8 (8 integer, 0 binary)
Non-zeros:  55
Status:     INTEGER OPTIMAL
Objective:  Total_Cost = 138.94 (MINimum)

No.   Row name        Activity     Lower bound   Upper bound
------ ------------    ------------- ------------- -------------
1 Total_Cost             138.94
2 n_req[A]                 2682           700         10000
3 n_req[C]                 1840           700         10000
4 n_req[B1]                1360           700         10000
5 n_req[B2]                 705           700         10000
6 n_req[NA]               39916             0         40000
7 n_req[CAL]              23630         16000         24000

No. Column name       Activity     Lower bound   Upper bound
------ ------------    ------------- ------------- -------------
1 x[BEF]       *              2             2            30
2 x[CHK]       *              2             2            30
3 x[FSH]       *              2             2            30
4 x[HAM]       *             25             2            30
5 x[MCH]       *              2             2            30
6 x[MTL]       *             19             2            30
7 x[SPG]       *              2             2            30
8 x[TUR]       *              2             2            30

Integer feasibility conditions:

KKT.PE: max.abs.err = 0.00e+00 on row 0
max.rel.err = 0.00e+00 on row 0
High quality

KKT.PB: max.abs.err = 0.00e+00 on row 0
max.rel.err = 0.00e+00 on row 0
High quality

End of output


In [28]:
import pandas
display(df)
df['Quantity'].plot(kind='bar').set_xticklabels(df['Food']);
`
Food Quantity Price
0 BEF 2 3.19
1 CHK 2 2.59
2 FSH 2 2.29
3 HAM 25 2.89
4 MCH 2 1.89
5 MTL 19 1.99
6 SPG 2 1.99
7 TUR 2 2.49

8 rows × 3 columns

## Exercises¶

1. Lowering the upper limit on the number of packages that can be purchased will eventually lead to an infeasible solution. Deliberately modify the constraints so that you see this behavior, and learn how to identify these situations.

2. Alter the table output to include the marginal values of the nutrient constraints. What nutrients have the highest marginal values?