In [1]:
import sympy as S
S.var("x mu sigma", real=True)

Out[1]:
$$\begin{pmatrix}x, & \mu, & \sigma\end{pmatrix}$$

Here's the likelihood for a 1-d normal distribution, in terms of two parameters: the mean $\mu$ and the standard deviation $\sigma$:

In [2]:
f = 1/(S.sqrt(2*S.pi)*sigma)*S.exp(-(x-mu)**2/(2*sigma**2))
f

Out[2]:
$$\frac{\sqrt{2}}{2 \sqrt{\pi} \sigma} e^{- \frac{\left(- \mu + x\right)^{2}}{2 \sigma^{2}}}$$
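
As a quick sanity check (not part of the original session), we can ask SymPy to confirm that $f$ integrates to 1 over the real line. Declaring $\sigma$ positive, rather than merely real, is what lets the Gaussian integral collapse to a clean closed form:

```python
import sympy as S

# Hypothetical sanity check: re-declare the symbols with sigma > 0 so the
# Gaussian integral evaluates cleanly.
x, mu = S.symbols("x mu", real=True)
sigma = S.symbols("sigma", positive=True)

f = 1/(S.sqrt(2*S.pi)*sigma)*S.exp(-(x - mu)**2/(2*sigma**2))

total = S.integrate(f, (x, -S.oo, S.oo))  # should simplify to 1
```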

And here's the log likelihood, which we can tidy up:

In [3]:
ll = S.log(f)
ll

Out[3]:
$$\log{\left (\frac{\sqrt{2}}{2 \sqrt{\pi} \sigma} e^{- \frac{\left(- \mu + x\right)^{2}}{2 \sigma^{2}}} \right )}$$
In [4]:
ll = S.simplify(S.logcombine(ll))
ll

Out[4]:
$$\log{\left (\frac{1}{\sigma} \right )} - \frac{1}{2} \log{\left (\pi \right )} - \frac{1}{2} \log{\left (2 \right )} - \frac{\left(\mu - x\right)^{2}}{2 \sigma^{2}}$$

Let's pull out the derivatives of the log-likelihood: $\frac{\partial \ell}{\partial \mu}$, $\frac{\partial \ell}{\partial \sigma}$

In [5]:
ll.diff(mu)

Out[5]:
$$- \frac{1}{2 \sigma^{2}} \left(2 \mu - 2 x\right)$$
In [6]:
ll.diff(sigma)

Out[6]:
$$- \frac{1}{\sigma} + \frac{1}{\sigma^{3}} \left(\mu - x\right)^{2}$$

Second derivatives:

In [7]:
ll.diff(mu, mu)

Out[7]:
$$- \frac{1}{\sigma^{2}}$$
In [8]:
ll.diff(mu, sigma)

Out[8]:
$$\frac{1}{\sigma^{3}} \left(2 \mu - 2 x\right)$$
In [9]:
ll.diff(sigma, sigma)

Out[9]:
$$\frac{1}{\sigma^{2}} \left(1 - \frac{3}{\sigma^{2}} \left(\mu - x\right)^{2}\right)$$
In [10]:
S.simplify(_)

Out[10]:
$$\frac{1}{\sigma^{4}} \left(\sigma^{2} - 3 \left(\mu - x\right)^{2}\right)$$
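
As an aside (not derived in the session itself), taking expectations of the negative second derivatives above, and using $\mathbb{E}[x] = \mu$ and $\mathbb{E}[(x-\mu)^2] = \sigma^2$, recovers the familiar Fisher information for a single observation:

$$-\mathbb{E}\left[\frac{\partial^2 \ell}{\partial \mu^2}\right] = \frac{1}{\sigma^2}, \qquad -\mathbb{E}\left[\frac{\partial^2 \ell}{\partial \mu\,\partial \sigma}\right] = \frac{2\left(\mu - \mathbb{E}[x]\right)}{\sigma^3} = 0, \qquad -\mathbb{E}\left[\frac{\partial^2 \ell}{\partial \sigma^2}\right] = -\frac{\sigma^2 - 3\sigma^2}{\sigma^4} = \frac{2}{\sigma^2}.$$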

Let's find the points where the gradient is zero with respect to $\mu$ and $\sigma$:

In [11]:
S.solve(f.diff(mu), mu)

Out[11]:
$$\begin{bmatrix}x\end{bmatrix}$$
In [12]:
S.solve(f.diff(sigma), sigma)

Out[12]:
$$\begin{bmatrix}- \mu + x, & \mu - x\end{bmatrix}$$
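
With a single observation, the stationary points just say $\hat\mu = x$ and $\hat\sigma = \pm(x - \mu)$. A sketch of the same calculation for several samples (hypothetical symbols `x1, x2, x3`, with $\sigma > 0$ assumed) recovers the usual sample-mean and mean-squared-deviation estimators:

```python
import sympy as S

mu = S.symbols("mu", real=True)
sigma = S.symbols("sigma", positive=True)
xs = S.symbols("x1:4", real=True)  # three hypothetical data points

# Log-likelihood of an i.i.d. normal sample: a sum of per-point terms.
ll_n = sum(-S.log(S.sqrt(2*S.pi)*sigma) - (xi - mu)**2/(2*sigma**2)
           for xi in xs)

mu_hat = S.solve(ll_n.diff(mu), mu)[0]                    # sample mean
sigma_hat = S.solve(ll_n.diff(sigma).subs(mu, mu_hat), sigma)[0]
```

The positivity assumption on `sigma` is what prunes the negative root from the second solve.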

What if we want to use these calculations in numerical optimization code?

In [13]:
d2f_sigma2 = S.lambdify((mu, sigma, x), f.diff(sigma, sigma))

In [14]:
d2f_sigma2(1.2, 2, 3)

Out[14]:
$$-0.0463620287292$$
In [15]:
%timeit d2f_sigma2(1.2, 2, 3)

1000000 loops, best of 3: 1.61 µs per loop

In [17]:
from sympy.utilities import codegen

In [18]:
codegen.ccode(f)

Out[18]:
'(1.0L/2.0L)*sqrt(2)*exp(-1.0L/2.0L*pow(-mu + x, 2)/pow(sigma, 2))/(sqrt(M_PI)*sigma)'
In [19]:
print(codegen.codegen(('d2f_dsigma2', f), 'C', 'd2f_dsigma2')[0][1])

/******************************************************************************
*                     Code generated with sympy 0.7.4.1                      *
*                                                                            *
*                                                                            *
*                       This file is part of 'project'                       *
******************************************************************************/
#include "d2f_dsigma2.h"
#include <math.h>

double d2f_dsigma2(double mu, double sigma, double x) {

return (1.0L/2.0L)*sqrt(2)*exp(-1.0L/2.0L*pow(-mu + x, 2)/pow(sigma, 2))/(sqrt(M_PI)*sigma);

}
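
The return value of `codegen.codegen` is worth a note: it's a list of `(filename, contents)` pairs, one per generated file, so the `.c` source and its matching `.h` header come back together and can be written straight to disk (there's also a `to_files=True` flag that writes them for you). A minimal sketch, using a hypothetical function name `normal_pdf`:

```python
import sympy as S
from sympy.utilities.codegen import codegen

x, mu, sigma = S.symbols("x mu sigma", real=True)
f = 1/(S.sqrt(2*S.pi)*sigma)*S.exp(-(x - mu)**2/(2*sigma**2))

# codegen returns [(c_filename, c_source), (h_filename, h_contents)].
results = codegen(("normal_pdf", f), "C", "normal_pdf")
for filename, contents in results:
    print(filename)  # normal_pdf.c, then normal_pdf.h
```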



Too lazy to compile a C extension. Do it for me!

In [20]:
from sympy.utilities import autowrap

In [21]:
fw = autowrap.autowrap(f.diff(sigma, sigma))

In [22]:
timeit fw(1.2, 2, 3)

1000000 loops, best of 3: 214 ns per loop

In [24]:
logpdf = autowrap.autowrap(ll)

In [27]:
logpdf(2.3, 6., 1)

Out[27]:
$$-2.73417022465$$
In [40]:
from scipy.stats import norm
s_logpdf = norm.logpdf

In [41]:
s_logpdf(1, 2.3, 6.)

Out[41]:
$$-2.73417022465$$
In [42]:
%timeit logpdf(2.3, 6., 1)

1000000 loops, best of 3: 197 ns per loop

In [43]:
%timeit norm.logpdf(2.3, 6., 1)

10000 loops, best of 3: 86 µs per loop

In [45]:
import numpy as np
x = np.arange(100)
vlogpdf = np.vectorize(logpdf)

In [47]:
%timeit vlogpdf(2.3, 6, x)

10000 loops, best of 3: 51.8 µs per loop

In [48]:
%timeit s_logpdf(2.3, 6., x)

10000 loops, best of 3: 108 µs per loop
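
One more alternative worth knowing (not in the timings above): `lambdify` with `modules="numpy"` compiles the expression to vectorized NumPy operations, so it broadcasts over arrays without the per-element Python loop that `np.vectorize` adds, and without `autowrap`'s compiler dependency. A minimal sketch:

```python
import numpy as np
import sympy as S

x, mu, sigma = S.symbols("x mu sigma", real=True)
# The simplified log-likelihood from earlier, written out directly.
ll = S.log(1/(S.sqrt(2*S.pi)*sigma)) - (x - mu)**2/(2*sigma**2)

# With the numpy module, lambdify emits np.log etc., so array
# arguments broadcast with no Python-level loop.
np_logpdf = S.lambdify((mu, sigma, x), ll, modules="numpy")

xs = np.arange(100.0)
vals = np_logpdf(2.3, 6.0, xs)  # evaluates over the whole array at once
```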