The latest version of this IPython notebook is available at http://github.com/jckantor/ESTM60203 for noncommercial use under terms of the Creative Commons Attribution Noncommericial ShareAlike License (CC BY-NC-SA 4.0).

J.C. Kantor ([email protected])

Getting Started with GNU MathProg

This IPython notebook describes the installation and use of GLPK/MathProg from an IPython notebook.

Initializations
In [1]:
from IPython.core.display import HTML
HTML(open("styles/custom.css", "r").read())
Out[1]:

GNU MathProg is a Mathematical Programming Language

Mathematical programming languages are designed for the purpose of writing applied optimization problems in a concise, high-level, maintainable manner with sufficient precision to be translated and solved using optimization software.

GAMS (the General Algebraic Modeling System) is generally cited as the first example of a commercially successful mathematical programming language. Since the introduction of GAMS in the late 1970's, a number of succesful languages have been produced including AIMMS, AMPL, LINDO/LINGO, MPL, XPRESS-MOSEL, and many others. Of these, AMPL appears to be most widely adopted language for university training.

GNU MathProg) is part of the open source GNU GLPK project. MathProg offers a subset of the AMPL language roughly equivalent to AMPL as it distributed in the early 1990's and described in the AMPL book. Though distributed as part of GLPK, MathProg interfaces are available for other solvers including lpsolve and COIN-OR CBC. MathProg can export models in a several formats compatiable with most commercial and non-commericial solvers for mixed-integer linear programs.

Installing GLPK/MathProg

You will need to install a working copy of GLPK before executing code cells in the following tutorial. Here some basic recommendations.

  1. For Windows/PC hardware, the Windows for GLPK web site maintains a pre-compiled version of GLPK based on the lastest official release.

  2. For MacOS users, the most convenient installation process is to use a package manager. If you are not already doing so, you may consider installing the excellect Homebrew package manager using the instructions on their homepage. Once Homebrew is installed, GLPK can be installed with two commands

  brew tap homebrew/science
  brew install glpk

If GLPK has been installed correctly on your machine, you should be able to execute the following command. Test this before going further with this tutorial.

In [1]:
print "Hello World"
Hello World
In [2]:
%%script glpsol -m /dev/stdin -o /dev/stdout --out output

printf "Hello, World\n";
end;
In [3]:
print output
GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
 -m /dev/stdin -o /dev/stdout
Reading model section from /dev/stdin...
/dev/stdin:3: warning: final NL missing before end of file
3 lines were read
Hello, World
Model has been successfully generated
GLPK Simplex Optimizer, v4.52
0 rows, 0 columns, 0 non-zeros
~     0: obj =   0.000000000e+00  infeas =  0.000e+00
OPTIMAL SOLUTION FOUND
Time used:   0.0 secs
Memory used: 0.0 Mb (41523 bytes)
Writing basic solution to `/dev/stdout'...
Problem:    stdin
Rows:       0
Columns:    0
Non-zeros:  0
Status:     OPTIMAL
Objective:  0 (MINimum)

   No.   Row name   St   Activity     Lower bound   Upper bound    Marginal
------ ------------ -- ------------- ------------- ------------- -------------

   No. Column name  St   Activity     Lower bound   Upper bound    Marginal
------ ------------ -- ------------- ------------- ------------- -------------

Karush-Kuhn-Tucker optimality conditions:

KKT.PE: max.abs.err = 0.00e+00 on row 0
        max.rel.err = 0.00e+00 on row 0
        High quality

KKT.PB: max.abs.err = 0.00e+00 on row 0
        max.rel.err = 0.00e+00 on row 0
        High quality

KKT.DE: max.abs.err = 0.00e+00 on column 0
        max.rel.err = 0.00e+00 on column 0
        High quality

KKT.DB: max.abs.err = 0.00e+00 on row 0
        max.rel.err = 0.00e+00 on row 0
        High quality

End of output

Using GLPK/MathProg in IPython Notebooks

Cell magics provide mechanisms for using GLPK/MathProg inside the cells of an IPython notebook.

To process a MathProg model from a cell, use the cell magic

%%script glpsol -m /dev/stdin

as the first line of the cell. This line uses the cell magic %%script to run the command glpsol -m /dev/stdin as if it were entered directly into a terminal window. glpsol calls the glpk solver. The argument -m /dev/stdin tells the solver to process a MathProg model from the standard input. In the case of IPython notebooks, is the remaining contents of the cell.

We'll demonstrate this for a simple model that adds two parameter values and displays the result. Look for the displayed result among the other output generated by the glpsol command.

In [4]:
%%script glpsol -m /dev/stdin

param a := 12.3;
param b := 13;

display a + b;

end;
GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
 -m /dev/stdin
Reading model section from /dev/stdin...
/dev/stdin:7: warning: final NL missing before end of file
7 lines were read
Display statement at line 5
25.3
Model has been successfully generated
GLPK Simplex Optimizer, v4.52
0 rows, 0 columns, 0 non-zeros
~     0: obj =   0.000000000e+00  infeas =  0.000e+00
OPTIMAL SOLUTION FOUND
Time used:   0.0 secs
Memory used: 0.0 Mb (49555 bytes)

Often you will want to separate the display output of the MathProg model from other output generated by glpsol.

In [5]:
%%script --out output glpsol -m /dev/stdin -y out.txt 

param a := 12.3;
param b := 13;

display a + b;

end;

The -y out.txt option redirects display output to the file out.txt. The file can be read and displayed using the usual python methods as shown here:

In [6]:
f = open('out.txt')
print f.read()
f.close()
Display statement at line 5
25.3

The --out output is an option passed to script that redirects normal cell output to a python variable output. This will contain the remaining output of the glpsol command which can be displayed as follows:

In [7]:
print output
GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
 -m /dev/stdin -y out.txt
Reading model section from /dev/stdin...
/dev/stdin:7: warning: final NL missing before end of file
7 lines were read
Model has been successfully generated
GLPK Simplex Optimizer, v4.52
0 rows, 0 columns, 0 non-zeros
~     0: obj =   0.000000000e+00  infeas =  0.000e+00
OPTIMAL SOLUTION FOUND
Time used:   0.0 secs
Memory used: 0.0 Mb (49602 bytes)

Solving Linear Equations with MathProg

GLPK/MathProg will find feasible solutions to a system of linear equations. The basic steps necessary to describe and solve a system of linear equations are demonstrated in the next example.

In [8]:
%%script --out output glpsol -m /dev/stdin

# declare problem variables
var x;
var y;
var z;

# list all equations
eqn1 : 3*x + 2*y + z = 12;
eqn2 : 2.1*x + y = -3;
eqn3 : y - z = 4;

# solve
solve;

# display results
display x, y, z;

end;

A few things to notice are that all unknowns must be declared as variables, and that all equations are written with a unique name followed by the equation itself.

First we'll look at the diagnostic output generated by glpsol.

In [9]:
print output
GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
 -m /dev/stdin
Reading model section from /dev/stdin...
/dev/stdin:18: warning: final NL missing before end of file
18 lines were read
Generating eqn1...
Generating eqn2...
Generating eqn3...
Model has been successfully generated
GLPK Simplex Optimizer, v4.52
3 rows, 3 columns, 7 non-zeros
Preprocessing...
3 rows, 3 columns, 7 non-zeros
Scaling...
 A: min|aij| =  1.000e+00  max|aij| =  3.000e+00  ratio =  3.000e+00
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 2
      0: obj =   0.000000000e+00  infeas =  1.420e+01 (1)
*     1: obj =   0.000000000e+00  infeas =  0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
Time used:   0.0 secs
Memory used: 0.1 Mb (94191 bytes)
Display statement at line 16
x.val = -7.57575757575758
y.val = 12.9090909090909
z.val = 8.90909090909091
Model has been successfully processed

If things went well we should see a line OPTIMAL LP SOLUTION FOUND along with additional information that is useful in tuning larger models for efficient solution. If things didn't go well, an appropriate message will be displayed indicating an error in processing the model description, or problems in finding a numerical solution.

In this case glpsol says an optimal solution was found. The next step is to show the displayed results that were written to the file out.txt.

In [10]:
f = open('out.txt');
print(f.read())
f.close()
Display statement at line 5
25.3

The extra .val appended to each variable name indicates that we are looking at value of the variable found by the solver. Also associated with each variable is a .dual value that will be useful in analyzing the results of linear optimization problems.

Reading Data from .csv Files

In [11]:
%%writefile input.csv
Name, Age
Abigail, 22.1
Brent, 24.1
Carla, 21.0
Doug, 20.0
Overwriting input.csv
In [12]:
%%script --out output glpsol -m /dev/stdin

set NAMES;

param Age{NAMES};

table tin IN "CSV" "input.csv" : NAMES <- [Name], Age;
    
for {n in NAMES}: printf "%s\n", n;
    
end;
In [13]:
print output
GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
 -m /dev/stdin
Reading model section from /dev/stdin...
/dev/stdin:10: warning: final NL missing before end of file
10 lines were read
Reading tin...
/dev/stdin:6: field Age missing in input table
MathProg model processing error

In [14]:
import pandas
pandas.read_csv("input.csv")
Out[14]:
Name Age
0 Abigail 22.1
1 Brent 24.1
2 Carla 21.0
3 Doug 20.0

Writing MathProg Output to .csv Files

In [15]:
%%script --out output glpsol -m /dev/stdin

# declare problem variables
var x;
var y;
var z;

# list all equations
eqn1 : 3*x + 2*y + z = 12;
eqn2 : 2.1*x + y = -3;
eqn3 : y - z = 4;

# solve
solve;

# output results to .csv file
table result {1..1} OUT "CSV" "out.csv" : x, y, z;
    
end;

First check the diagnostic output to verify that no errors were encountered.

In [16]:
print output
GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
 -m /dev/stdin
Reading model section from /dev/stdin...
/dev/stdin:18: warning: final NL missing before end of file
18 lines were read
Generating eqn1...
Generating eqn2...
Generating eqn3...
Model has been successfully generated
GLPK Simplex Optimizer, v4.52
3 rows, 3 columns, 7 non-zeros
Preprocessing...
3 rows, 3 columns, 7 non-zeros
Scaling...
 A: min|aij| =  1.000e+00  max|aij| =  3.000e+00  ratio =  3.000e+00
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 2
      0: obj =   0.000000000e+00  infeas =  1.420e+01 (1)
*     1: obj =   0.000000000e+00  infeas =  0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
Time used:   0.0 secs
Memory used: 0.1 Mb (94191 bytes)
Writing result...
Model has been successfully processed

The result of the table command is a .csv file that can be read by virtually any spreadsheet or data analysis software. Let's first verify the format

In [17]:
f = open('out.csv');
print(f.read())
f.close()
x,y,z
-7.57575757575758,12.9090909090909,8.90909090909091

As an example of it's use, here's how the file could be read as a pandas DataFrame and the results plotted.

In [18]:
import pandas
df = pandas.read_csv("out.csv");
display(df)
df.plot(kind='bar')
x y z
0 -7.575758 12.909091 8.909091
Out[18]:
<matplotlib.axes.AxesSubplot at 0x109aa1f10>
In [19]:
%%writefile input.csv
A,B,C
12.2, 13.1, 13.2
Overwriting input.csv
In [20]:
%%script --out output glpsol -m /dev/stdin

set S;

param a;
param b;
param c;

table input IN "CSV" "input.csv": S <- [A], a~A, b~B, c~C;
    
end;
In [21]:
print output
GLPSOL: GLPK LP/MIP Solver, v4.52
Parameter(s) specified in the command line:
 -m /dev/stdin
Reading model section from /dev/stdin...
/dev/stdin:8: a must have 1 subscript rather than 0
Context: ...am b ; param c ; table input IN '...' '...' : S <- [ A ] , a
MathProg model processing error