Module biogeme.optimization

Examples of use of each function

This webpage is intended for programmers who need examples of how to use the functions of this module. The examples are designed to illustrate the syntax; they do not correspond to any meaningful model. For examples of models, visit biogeme.epfl.ch.

In [1]:
import datetime
print(datetime.datetime.now())
2020-06-03 09:58:19.180367
In [2]:
import biogeme.version as ver
print(ver.getText())
biogeme 3.2.6 [2020-06-03]
Version entirely written in Python
Home page: http://biogeme.epfl.ch
Submit questions to https://groups.google.com/d/forum/biogeme
Michel Bierlaire, Transport and Mobility Laboratory, Ecole Polytechnique Fédérale de Lausanne (EPFL)

In [3]:
import numpy as np
import pandas as pd
In [4]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
In [5]:
import biogeme.optimization as opt
import biogeme.biogeme as bio
import biogeme.database as db
import biogeme.models as models
import biogeme.exceptions as excep
from biogeme.expressions import Beta, Variable

Define the verbosity of Biogeme

In [6]:
import biogeme.messaging as msg
logger = msg.bioMessage()
logger.setSilent()
#logger.setDetailed()
#logger.setDebug()

Modified Cholesky factorization by Schnabel and Eskow (1999)

Example by Eskow and Schnabel, 1991

In [7]:
A = np.array([[ 0.3571, -0.1030,  0.0274, -0.0459],
              [-0.1030,  0.2525,  0.0736, -0.3845],
              [ 0.0274,  0.0736,  0.2340, -0.2878],
              [-0.0459, -0.3845, -0.2878,  0.5549]])
A
Out[7]:
array([[ 0.3571, -0.103 ,  0.0274, -0.0459],
       [-0.103 ,  0.2525,  0.0736, -0.3845],
       [ 0.0274,  0.0736,  0.234 , -0.2878],
       [-0.0459, -0.3845, -0.2878,  0.5549]])
In [8]:
L, E, P = opt.schnabelEskow(A)
L
Out[8]:
array([[ 0.7449161 ,  0.        ,  0.        ,  0.        ],
       [-0.06161768,  0.59439319,  0.        ,  0.        ],
       [-0.38635223,  0.00604629,  0.46941286,  0.        ],
       [-0.51616551, -0.22679419, -0.26511936,  0.00152451]])

The factor $L$ is such that $A + E = PLL^TP^T$. Therefore, the expression below should be the null matrix.

In [9]:
P @ L @ L.T @ P.T - E - A
Out[9]:
array([[-5.55111512e-17, -1.38777878e-17,  0.00000000e+00,
         0.00000000e+00],
       [-1.38777878e-17,  5.55111512e-17,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00]])
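
As a complementary check (added here, not part of the original example), recall that the factorization perturbs $A$ only when $A$ is not sufficiently positive definite. The eigenvalues of $A$ and of $A+E$ can be inspected with numpy (output omitted):

print(np.linalg.eigvalsh(A))
print(np.linalg.eigvalsh(A + E))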

Example by Schnabel and Eskow (1999)

In [10]:
A2 = np.array([[ 1890.3, -1705.6, -315.8,  3000.3],
               [-1705.6,  1538.3,  284.9, -2706.6],
               [ -315.8,   284.9,   52.5,  -501.2],
               [ 3000.3, -2706.6, -501.2,  4760.8]])
A2
Out[10]:
array([[ 1890.3, -1705.6,  -315.8,  3000.3],
       [-1705.6,  1538.3,   284.9, -2706.6],
       [ -315.8,   284.9,    52.5,  -501.2],
       [ 3000.3, -2706.6,  -501.2,  4760.8]])
In [11]:
L, E, P = opt.schnabelEskow(A2)
L
Out[11]:
array([[ 6.89985507e+01,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [-7.26392069e+00,  3.19413321e-01,  0.00000000e+00,
         0.00000000e+00],
       [-3.92269109e+01, -1.28891153e-01,  4.44731720e-01,
         0.00000000e+00],
       [ 4.34835220e+01,  1.90522168e-01,  3.34584739e-01,
         1.71484739e-03]])

The factor $L$ is such that $A + E = PLL^TP^T$. Therefore, the expression below should be the null matrix.

In [12]:
P @ L @ L.T @ P.T - E - A2
Out[12]:
array([[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
         0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
        -9.09494702e-13]])

Rosenbrock problem

In [13]:
class rosenbrock(opt.functionToMinimize):
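    """Rosenbrock function:
    f(x) = sum_i [100 (x_{i+1} - x_i^2)^2 + (1 - x_i)^2]
    """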
    def __init__(self):
        self.x = None
        
    def setVariables(self, x):
        self.x = x

    def f(self, batch=None):
        if batch is not None:
            raise excep.biogemeError('This function is not '
                                     'data driven.')
        n = len(self.x)
        f = sum(100.0 * (self.x[i + 1]-self.x[i]**2)**2
                + (1.0 - self.x[i])**2 for i in range(n - 1))
        return f

    def g(self):
        n = len(self.x)
        g = np.zeros(n)
        for i in range(n - 1):
            g[i] = g[i] - 400 * self.x[i] * \
                (self.x[i + 1] -self.x[i]**2) - \
                2 * (1 - self.x[i])
            g[i + 1] = g[i + 1] + \
                200 * (self.x[i + 1] - self.x[i]**2)
        return g
    
    def h(self):
        n = len(self.x)
        H = np.zeros((n, n))
        for i in range(n - 1):
            H[i][i] = H[i][i] - 400 * self.x[i + 1] \
                + 1200 * self.x[i]**2 + 2
            H[i+1][i] = H[i+1][i] - 400 * self.x[i]
            H[i][i+1] = H[i][i+1] - 400 * self.x[i]
            H[i+1][i+1] = H[i+1][i+1] + 200
        return H 

    def f_g(self, batch=None):
        if batch is not None:
            raise excep.biogemeError('This function is not '
                                     'data driven.')
        return self.f(), self.g()

    def f_g_h(self, batch=None):
        if batch is not None:
            raise excep.biogemeError('This function is not '
                                     'data driven.')
        return self.f(), self.g(), self.h()
    
    def f_g_bhhh(self, batch=None):
        raise excep.biogemeError('This function is not data driven.')

theFunction = rosenbrock()
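
A quick sanity check (added; not part of the original example): the Rosenbrock function is zero and its gradient vanishes at its well-known minimum $(1, 1)$. Output is omitted here.

theFunction.setVariables(np.array([1.0, 1.0]))
print(theFunction.f(), theFunction.g())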

Example 5.8 from Bierlaire (2015)

There are infinitely many local optima. For each integer $k$,

$$x^* = ((-1)^{k+1}, k \pi) $$

is a local optimum. A quick numerical check follows the class definition below.

In [14]:
class example58(opt.functionToMinimize):
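    """Example 5.8 from Bierlaire (2015):
    f(x) = 0.5 x_1^2 + x_1 cos(x_2)
    """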
    def __init__(self):
        self.x = None
        
    def setVariables(self, x):
        self.x = x

    def f(self, batch=None):
        if batch is not None:
            raise excep.biogemeError('This function is not '
                                     'data driven.')
        n = len(self.x)
        f = 0.5 * self.x[0] * self.x[0] + \
            self.x[0] * np.cos(self.x[1])
        return f

    def g(self):
        n = len(self.x)
        g = np.array([self.x[0] + np.cos(self.x[1]),
                      -self.x[0]*np.sin(self.x[1])])
        return g

    def h(self):
        n = len(self.x)
        H = np.array([[1, 
                       -np.sin(self.x[1])],
                      [-np.sin(self.x[1]),
                       -self.x[0] * np.cos(self.x[1])]])
        return H 

    def f_g(self, batch=None):
        if batch is not None:
            raise excep.biogemeError('This function is not '
                                     'data driven.')
        return self.f(), self.g()

    def f_g_h(self, batch=None):
        if batch is not None:
            raise excep.biogemeError('This function is not '
                                     'data driven.')
        return self.f(), self.g(), self.h()
    
    def f_g_bhhh(self, batch=None):
        raise excep.biogemeError('This function is not data driven.')

ex58 = example58()
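
As announced above (an added check, not part of the original example), the gradient vanishes, up to numerical precision, at the local optima $x^* = ((-1)^{k+1}, k \pi)$. Output is omitted here.

for k in range(3):
    ex58.setVariables(np.array([(-1)**(k + 1), k * np.pi]))
    print(f'k={k}: g={ex58.g()}')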

Line search algorithm

In [15]:
x = np.array([-1.5, 1.5])
theFunction.setVariables(x)
f, g = theFunction.f_g()
alpha, nfev = opt.lineSearch(theFunction, x, f, g, -g)
print(f"alpha={alpha} nfev={nfev}")
alpha=0.0009765625 nfev=12
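
As an added check (not part of the original example), the step returned by the line search should decrease the Rosenbrock function along the direction $d = -g$:

xnew = x + alpha * (-g)
theFunction.setVariables(xnew)
# f at the new point should be smaller than f at x
print(f, theFunction.f())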

Newton with line search

Rosenbrock

In [16]:
x0 = np.array([-1.5, 1.5])
xstar, messages = opt.newtonLineSearch(theFunction, x0)
xstar
Out[16]:
array([0.99999961, 0.99999918])
In [17]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with line search
Relative gradient:	2.0646130808264894e-07
Number of iterations:	22
Number of function evaluations:	77
Number of gradient evaluations:	77
Number of hessian evaluations:	23
Cause of termination:	Relative gradient = 2.1e-07 <= 6.1e-06

Example 5.8

In [18]:
x0 = np.array([1, 1])
xstar, messages = opt.newtonLineSearch(ex58, x0)
xstar
Out[18]:
array([1.        , 3.14159265])
In [19]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with line search
Relative gradient:	2.117211853102874e-10
Number of iterations:	5
Number of function evaluations:	28
Number of gradient evaluations:	28
Number of hessian evaluations:	6
Cause of termination:	Relative gradient = 2.1e-10 <= 6.1e-06

Newton with trust region

Rosenbrock

In [20]:
x0 = np.array([-1.5, 1.5])
xstar, messages = opt.newtonTrustRegion(theFunction, x0)
xstar
Out[20]:
array([0.99999995, 0.99999989])
In [21]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with trust region
Relative gradient:	9.752977092603033e-10
Cause of termination:	Relative gradient = 9.8e-10 <= 6.1e-06
Number of iterations:	28
Number of function evaluations:	49
Number of gradient evaluations:	22
Number of hessian evaluations:	22

Example 5.8

In [22]:
x0 = np.array([1.0, 1.0])
xstar, messages = opt.newtonTrustRegion(ex58, x0)
xstar
Out[22]:
array([-1.00000000e+00, -1.56439954e-09])
In [23]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with trust region
Relative gradient:	1.5037932097369974e-09
Cause of termination:	Relative gradient = 1.5e-09 <= 6.1e-06
Number of iterations:	5
Number of function evaluations:	10
Number of gradient evaluations:	6
Number of hessian evaluations:	6

BFGS and line search

Rosenbrock

In [24]:
x0 = np.array([-1.5, 1.5])
xstar, messages = opt.bfgsLineSearch(theFunction, x0, maxiter=10000)
xstar
Out[24]:
array([0.99999897, 0.99999797])
In [25]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Inverse BFGS with line search
Relative gradient:	1.5315863397701923e-07
Cause of termination:	Relative gradient = 1.5e-07 <= 6.1e-06
Number of iterations:	32
Number of function evaluations:	152
Number of gradient evaluations:	33

Example 5.8

In [26]:
x0 = np.array([1, 1])
xstar, messages = opt.bfgsLineSearch(ex58, x0, maxiter=10000)
xstar
Out[26]:
array([-1.00000142e+00,  3.57636862e-06])
In [27]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Inverse BFGS with line search
Relative gradient:	3.4378215567897343e-06
Cause of termination:	Relative gradient = 3.4e-06 <= 6.1e-06
Number of iterations:	10
Number of function evaluations:	48
Number of gradient evaluations:	11

BFGS and trust region

Rosenbrock

In [28]:
x0 = np.array([-1.5, 1.5])
xstar, messages = opt.bfgsTrustRegion(theFunction, x0, maxiter=10000)
xstar
Out[28]:
array([1.00000052, 1.00000078])
In [29]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	BFGS with trust region
Relative gradient:	1.7562432530828692e-06
Cause of termination:	Relative gradient = 1.8e-06 <= 6.1e-06
Number of iterations:	50
Number of function evaluations:	88
Number of gradient evaluations:	38

Example 5.8

In [30]:
x0 = np.array([1, 1])
xstar, messages = opt.bfgsTrustRegion(ex58, x0, maxiter=10000)
xstar
Out[30]:
array([-9.99999972e-01,  1.58776353e-08])
In [31]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	BFGS with trust region
Relative gradient:	2.7200302000474536e-08
Cause of termination:	Relative gradient = 2.7e-08 <= 6.1e-06
Number of iterations:	17
Number of function evaluations:	30
Number of gradient evaluations:	13

Newton for simple bounds

Rosenbrock

The bound constraints do not exclude the unconstrained optimum.

In [32]:
x0 = np.array([-1.5, 1.5])
theBounds = opt.bioBounds([(-1000, 1000), (-1000, 1000)])
xstar, messages = opt.simpleBoundsNewtonAlgorithm(theFunction, 
                                                  theBounds,
                                                  x0)
xstar
Out[32]:
array([0.99999961, 0.99999918])
In [33]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Newton with trust region for simple bound constraints
Proportion analytical hessian:	100.0%
Relative projected gradient:	2.06461307367789e-07
Number of iterations:	32
Number of function evaluations:	77
Number of gradient evaluations:	23
Number of hessian evaluations:	23
Cause of termination:	Relative gradient = 2.1e-07 <= 6.1e-06

Here, the bound constraints do exclude the unconstrained optimum.

In [34]:
x0 = np.array([-1.5, 1.5])
theBounds = opt.bioBounds([(-1000, 0), (1, 1000)])
xstar, messages = opt.simpleBoundsNewtonAlgorithm(theFunction, 
                                                  theBounds,
                                                  x0)
xstar
Out[34]:
array([-0.99497475,  1.        ])
In [35]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Newton with trust region for simple bound constraints
Proportion analytical hessian:	100.0%
Relative projected gradient:	7.676708776216401e-09
Number of iterations:	7
Number of function evaluations:	20
Number of gradient evaluations:	7
Number of hessian evaluations:	7
Cause of termination:	Relative gradient = 7.7e-09 <= 6.1e-06
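
A simple check with plain numpy (added; not part of the original example) confirms that the constrained solution satisfies the bounds $-1000 \leq x_1 \leq 0$ and $1 \leq x_2 \leq 1000$:

lower = np.array([-1000, 1])
upper = np.array([0, 1000])
print(np.all((xstar >= lower) & (xstar <= upper)))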

Example 5.8

The bound constraints do not exclude the unconstrained optimum.

In [36]:
x0 = np.array([1, 1])
theBounds = opt.bioBounds([(-1000, 1000), (-1000, 1000)])
xstar, messages = opt.simpleBoundsNewtonAlgorithm(ex58, 
                                                  theBounds,
                                                  x0)
xstar
Out[36]:
array([1.        , 3.14159265])
In [37]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Newton with trust region for simple bound constraints
Proportion analytical hessian:	100.0%
Relative projected gradient:	2.389957069442179e-11
Number of iterations:	6
Number of function evaluations:	19
Number of gradient evaluations:	7
Number of hessian evaluations:	7
Cause of termination:	Relative gradient = 2.4e-11 <= 6.1e-06

The bound constraints do exclude the unconstrained optimum.

In [38]:
x0 = np.array([-1, 0])
theBounds = opt.bioBounds([(-1000, 0), (1, 1000)])
xstar, messages = opt.simpleBoundsNewtonAlgorithm(ex58, 
                                                  theBounds,
                                                  x0)
xstar
Out[38]:
array([-0.54030231,  1.        ])
In [39]:
for k, v in messages.items():
    print(f'{k}:\t{v}')
Algorithm:	Newton with trust region for simple bound constraints
Proportion analytical hessian:	100.0%
Relative projected gradient:	8.881784197001252e-16
Number of iterations:	1
Number of function evaluations:	4
Number of gradient evaluations:	2
Number of hessian evaluations:	2
Cause of termination:	Relative gradient = 8.9e-16 <= 6.1e-06

Biogeme example

In [40]:
df = pd.DataFrame({'Person': [1, 1, 1, 2, 2],
                   'Exclude': [0, 0, 1, 0, 1],
                   'Variable1': [1, 2, 3, 4, 5],
                   'Variable2': [10, 20, 30, 40, 50],
                   'Choice': [1, 2, 3, 1, 2],
                   'Av1': [0, 1, 1, 1, 1],
                   'Av2': [1, 1, 1, 1, 1],
                   'Av3': [0, 1, 1, 1, 1]})
myData = db.Database('test', df)

Choice = Variable('Choice')
Variable1 = Variable('Variable1')
Variable2 = Variable('Variable2')
beta1 = Beta('beta1', 0, None, None, 0)
beta2 = Beta('beta2', 0, None, None, 0)
V1 = beta1 * Variable1
V2 = beta2 * Variable2
V3 = 0
V = {1: V1, 2: V2, 3: V3}

likelihood = models.loglogit(V, av=None, i=Choice)
myBiogeme = bio.BIOGEME(myData, likelihood)
myBiogeme.modelName = 'simpleExample'
print(myBiogeme)
simpleExample: database [test]{'loglike': _bioLogLogitFullChoiceSet(1:(beta1(0) * Variable1), 2:(beta2(0) * Variable2), 3:`0`)}

scipy

The optimization algorithm from scipy can be selected as follows. It is possible to pass parameters to the algorithm; we include here an irrelevant parameter to illustrate the resulting warning.

In [41]:
results = myBiogeme.estimate(algorithm=opt.scipy, 
                             algoParameters={'myparam':3})
results.getEstimatedParameters()
/Users/michelbierlaire/opt/anaconda3/envs/python38/lib/python3.8/site-packages/biogeme-3.2.6-py3.8-macosx-10.9-x86_64.egg/biogeme/optimization.py:518: OptimizeWarning: Unknown solver options: myparam
  results = sc.minimize(f_and_grad, initBetas, bounds=bounds, jac=True, options=opts)
Out[41]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144546 0.391017 0.369666 0.711632 0.366198 0.394720 0.693049
beta2 0.023502 0.036947 0.636086 0.524720 0.034280 0.685574 0.492982
In [42]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	scipy.optimize
Cause of termination:	b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
Number of iterations:	13
Number of function evaluations:	16
Optimization time:	0:00:00.012512
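
A recognized option of the underlying scipy solver, such as 'maxiter', can be passed in the same way (a sketch; output not shown here):

results = myBiogeme.estimate(algorithm=opt.scipy,
                             algoParameters={'maxiter': 100})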

Newton with line search

In [43]:
results = myBiogeme.estimate(algorithm=opt.newtonLineSearchForBiogeme)
results.getEstimatedParameters()
Out[43]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144545 0.391017 0.369665 0.711632 0.366198 0.394720 0.693050
beta2 0.023502 0.036947 0.636086 0.524720 0.034280 0.685573 0.492982
In [44]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with line search
Relative gradient:	3.2893676167559224e-07
Number of iterations:	3
Number of function evaluations:	10
Number of gradient evaluations:	10
Number of hessian evaluations:	4
Cause of termination:	Relative gradient = 3.3e-07 <= 6.1e-06
Optimization time:	0:00:00.010258

Changing the requested precision

In [45]:
results = myBiogeme.\
estimate(algorithm=opt.newtonLineSearchForBiogeme,
         algoParameters={'tolerance': 0.1})
results.getEstimatedParameters()
Out[45]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144317 0.390742 0.369340 0.711874 0.365691 0.394641 0.693108
beta2 0.023428 0.036930 0.634377 0.525835 0.034256 0.683887 0.494047
In [46]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with line search
Relative gradient:	0.014746524905599795
Number of iterations:	2
Number of function evaluations:	7
Number of gradient evaluations:	7
Number of hessian evaluations:	3
Cause of termination:	Relative gradient = 0.015 <= 0.1
Optimization time:	0:00:00.005581

CFSQP algorithm

Setting the algorithm argument to None triggers the CFSQP algorithm, as confirmed by the messages below.

In [47]:
results = myBiogeme.estimate(algorithm=None)
results.getEstimatedParameters()
Out[47]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144544 0.391018 0.369661 0.711635 0.366199 0.394714 0.693054
beta2 0.023502 0.036947 0.636090 0.524718 0.034280 0.685578 0.492979
In [48]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	CFSQP
Number of iterations:	7
Number of function evaluations:	20
Cause of termination:	b'Normal termination. Obj: 6.05545e-06 Const: 6.05545e-06'
Optimization time:	0:00:00.005572

Newton with trust region

In [49]:
results = myBiogeme.estimate(algorithm=opt.newtonTrustRegionForBiogeme)
results.getEstimatedParameters()
Out[49]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144545 0.391017 0.369665 0.711632 0.366198 0.394720 0.693050
beta2 0.023502 0.036947 0.636086 0.524720 0.034280 0.685573 0.492982
In [50]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with trust region
Relative gradient:	3.289367618372832e-07
Cause of termination:	Relative gradient = 3.3e-07 <= 6.1e-06
Number of iterations:	3
Number of function evaluations:	6
Number of gradient evaluations:	4
Number of hessian evaluations:	4
Optimization time:	0:00:00.004317

We now illustrate how to set the parameters of the algorithm: we use the truncated conjugate gradient instead of dogleg for the trust region subproblem, start with a small trust region of radius 0.001, and allow a maximum of 3 iterations.

In [51]:
results = myBiogeme.\
estimate(algorithm=opt.newtonTrustRegionForBiogeme,
         algoParameters={'dogleg':False,
                         'radius':0.001,
                         'maxiter':3})
results.getEstimatedParameters()
Out[51]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.000053 0.348093 0.000151 0.99988 0.308910 0.000170 0.999864
beta2 0.007000 0.032583 0.214830 0.82990 0.030019 0.233173 0.815627
In [52]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with trust region
Relative gradient:	2.0182963630997235
Cause of termination:	Maximum number of iterations reached: 3
Number of iterations:	3
Number of function evaluations:	6
Number of gradient evaluations:	4
Number of hessian evaluations:	4
Optimization time:	0:00:00.002953

Changing the requested precision

In [53]:
results = myBiogeme.\
estimate(algorithm=opt.newtonTrustRegionForBiogeme,
         algoParameters={'tolerance':0.1})
results.getEstimatedParameters()
Out[53]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144317 0.390742 0.369340 0.711874 0.365691 0.394641 0.693108
beta2 0.023428 0.036930 0.634377 0.525835 0.034256 0.683887 0.494047
In [54]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	Unconstrained Newton with trust region
Relative gradient:	0.014746524905603029
Cause of termination:	Relative gradient = 0.015 <= 0.1
Number of iterations:	2
Number of function evaluations:	4
Number of gradient evaluations:	3
Number of hessian evaluations:	3
Optimization time:	0:00:00.002571

BFGS with line search

In [55]:
results = myBiogeme.estimate(algorithm=opt.bfgsLineSearchForBiogeme)
results.getEstimatedParameters()
Out[55]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144545 0.391017 0.369665 0.711632 0.366198 0.394720 0.693050
beta2 0.023502 0.036947 0.636086 0.524720 0.034280 0.685573 0.492982
In [56]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	Inverse BFGS with line search
Relative gradient:	5.933646980468922e-07
Cause of termination:	Relative gradient = 5.9e-07 <= 6.1e-06
Number of iterations:	5
Number of function evaluations:	28
Number of gradient evaluations:	6
Optimization time:	0:00:00.007484

BFGS with trust region

In [57]:
results = myBiogeme.estimate(algorithm=opt.bfgsTrustRegionForBiogeme)
results.getEstimatedParameters()
Out[57]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144546 0.391018 0.369667 0.711631 0.366198 0.394721 0.693048
beta2 0.023502 0.036947 0.636087 0.524720 0.034280 0.685574 0.492982
In [58]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	BFGS with trust region
Relative gradient:	5.039394505504404e-07
Cause of termination:	Relative gradient = 5e-07 <= 6.1e-06
Number of iterations:	14
Number of function evaluations:	23
Number of gradient evaluations:	9
Optimization time:	0:00:00.012510

Newton/BFGS with trust region for simple bounds

This is the default algorithm used by Biogeme. It is the implementation of the algorithm proposed by Conn et al. (1988).

In [59]:
results = myBiogeme.\
estimate(algorithm=opt.simpleBoundsNewtonAlgorithmForBiogeme)
results.getEstimatedParameters()
Out[59]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144547 0.391018 0.369669 0.711629 0.366198 0.394723 0.693047
beta2 0.023502 0.036947 0.636088 0.524719 0.034280 0.685574 0.492981
In [60]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	BFGS with trust region for simple bound constraints
Proportion analytical hessian:	0.0%
Relative projected gradient:	1.3873770982165183e-06
Number of iterations:	14
Number of function evaluations:	31
Number of gradient evaluations:	9
Number of hessian evaluations:	0
Cause of termination:	Relative gradient = 1.4e-06 <= 6.1e-06
Optimization time:	0:00:00.018799

When the second derivatives are too computationally expensive to calculate, it is possible to avoid calculating them at every successful iteration. The parameter 'proportionAnalyticalHessian' controls the proportion of iterations in which the analytical hessian is calculated; in the other iterations, the algorithm relies on the BFGS update.

In [61]:
results = myBiogeme.\
estimate(algorithm=opt.simpleBoundsNewtonAlgorithmForBiogeme,
         algoParameters = {'proportionAnalyticalHessian': 0.5})
results.getEstimatedParameters()
Out[61]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144545 0.391017 0.369665 0.711632 0.366198 0.394720 0.693050
beta2 0.023502 0.036947 0.636086 0.524721 0.034280 0.685573 0.492982
In [62]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	Hybrid Newton [50.0%] with trust region for simple bound constraints
Proportion analytical hessian:	60.0%
Relative projected gradient:	2.790944329115531e-06
Number of iterations:	4
Number of function evaluations:	13
Number of gradient evaluations:	5
Number of hessian evaluations:	3
Cause of termination:	Relative gradient = 2.8e-06 <= 6.1e-06
Optimization time:	0:00:00.006613

If the parameter is set to zero, the second derivatives are not used at all, and the algorithm relies only on the BFGS update.

In [63]:
results = myBiogeme.\
estimate(algorithm=opt.simpleBoundsNewtonAlgorithmForBiogeme,
         algoParameters = {'proportionAnalyticalHessian': 0})
results.getEstimatedParameters()
Out[63]:
Value Std err t-test p-value Rob. Std err Rob. t-test Rob. p-value
beta1 0.144547 0.391018 0.369669 0.711629 0.366198 0.394723 0.693047
beta2 0.023502 0.036947 0.636088 0.524719 0.034280 0.685574 0.492981
In [64]:
for k, v in results.data.optimizationMessages.items():
    print(f'{k}:\t{v}')
Algorithm:	BFGS with trust region for simple bound constraints
Proportion analytical hessian:	0.0%
Relative projected gradient:	1.3873770982165183e-06
Number of iterations:	14
Number of function evaluations:	31
Number of gradient evaluations:	9
Number of hessian evaluations:	0
Cause of termination:	Relative gradient = 1.4e-06 <= 6.1e-06
Optimization time:	0:00:00.019712