Basic statistics

In [1]:
import numpy

We can use Python as a simple interactive calculator:

In [2]:
2 + 3 + 4
Out[2]:
9

Here we call the sqrt function from the numpy library.

In [3]:
numpy.sqrt(2 + 2)
Out[3]:
2.0

Some useful constants are predefined.

In [4]:
numpy.pi
Out[4]:
3.141592653589793
In [5]:
numpy.sin(numpy.pi)
Out[5]:
1.2246467991473532e-16

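The notation e-16 above means "times $10^{-16}$", so the result is about $1.2 \times 10^{-16}$. Mathematically, $\sin(\pi)$ is exactly 0; the tiny nonzero value comes from floating-point rounding, since numpy.pi is only an approximation of $\pi$.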
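Because of this rounding, comparing a floating-point result to an exact value with == is fragile. A minimal sketch of a tolerance-based comparison using numpy.isclose follows; the variable names and the tolerance 1e-12 are illustrative choices, not part of the original notebook.

```python
import numpy

value = numpy.sin(numpy.pi)

# An exact comparison fails because of floating-point rounding.
exactly_zero = (value == 0.0)

# numpy.isclose compares within a tolerance; atol is the absolute tolerance
# (1e-12 is an arbitrary choice for this illustration).
close_to_zero = numpy.isclose(value, 0.0, atol=1e-12)

print(exactly_zero, close_to_zero)
```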

We can generate a random number from a uniform distribution between 20 and 30. If you evaluate this cell several times (press Shift-Enter or click the triangular Run Cell button in the toolbar above), it will generate a different random number each time.

In [6]:
numpy.random.uniform(20, 30)
Out[6]:
25.071324057917977
In [7]:
numpy.random.uniform(20, 30)
Out[7]:
21.94971427707708
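If you need the same "random" numbers on every run, for example to make a result reproducible, one option is to seed a random generator. The following is a small sketch that assumes NumPy 1.17 or later, where numpy.random.default_rng is available; the seed 42 and the variable names are arbitrary.

```python
import numpy

# The seed value 42 is an arbitrary choice for illustration.
rng_a = numpy.random.default_rng(42)
rng_b = numpy.random.default_rng(42)

first = rng_a.uniform(20, 30)
second = rng_b.uniform(20, 30)

# Both generators started from the same seed, so they produce the same draw.
print(first == second)
```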

We can generate an array of random numbers by passing a third argument to the numpy.random.uniform function, saying how many random numbers we want. We store the array in a variable named obs.

In [8]:
obs = numpy.random.uniform(20, 30, 10)
obs
Out[8]:
array([ 22.93680308,  26.63962388,  23.48878262,  29.81223723,
        25.48617268,  21.50435359,  29.87250384,  23.46127349,
        23.69304149,  27.59178761])

Python's built-in function len tells us the length of an array or a list.

In [9]:
len(obs)
Out[9]:
10
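For a one-dimensional array, len gives the number of elements. For arrays with more than one dimension, len only reports the size of the first dimension, so the shape and size attributes are often more informative. A short sketch (the name grid and the 3-by-4 shape are made up for this example):

```python
import numpy

# Passing a tuple as the third argument produces a 2-D array:
# here 3 rows and 4 columns of random numbers.
grid = numpy.random.uniform(20, 30, (3, 4))

n_rows = len(grid)      # length of the first dimension only
dimensions = grid.shape  # size of every dimension
n_elements = grid.size   # total number of elements

print(n_rows, dimensions, n_elements)
```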

We can do arithmetic on arrays, adding them together or subtracting a constant from each element.

In [10]:
obs + obs
Out[10]:
array([ 45.87360616,  53.27924776,  46.97756525,  59.62447446,
        50.97234535,  43.00870717,  59.74500769,  46.92254699,
        47.38608298,  55.18357523])
In [11]:
obs - 25
Out[11]:
array([-2.06319692,  1.63962388, -1.51121738,  4.81223723,  0.48617268,
       -3.49564641,  4.87250384, -1.53872651, -1.30695851,  2.59178761])
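Array arithmetic works element by element, which differs from plain Python lists, where + concatenates. A small sketch with fixed values makes the difference easy to check by hand (the names a, b and the values are illustrative):

```python
import numpy

a = numpy.array([1.0, 2.0, 3.0])
b = numpy.array([10.0, 20.0, 30.0])

total = a + b    # element-wise sum
shifted = a - 1  # the constant is subtracted from every element
product = a * b  # element-wise product, not a matrix product

# With plain lists, + joins the two lists instead of adding elements.
joined = [1.0, 2.0, 3.0] + [10.0, 20.0, 30.0]

print(total, shifted, product, len(joined))
```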

Arrays have methods: functions attached to the array that operate on its contents.

In [12]:
obs.mean()
Out[12]:
25.448657951640019
In [13]:
obs.sum()
Out[13]:
254.48657951640018
In [14]:
obs.min()
Out[14]:
21.504353586726896
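Other summary statistics are available in the same way, and most methods also exist as functions in the numpy module. A short sketch on a small fixed array, so the results can be verified by hand (the name data and its values are made up for this example):

```python
import numpy

data = numpy.array([2.0, 4.0, 6.0, 8.0])

largest = data.max()  # maximum value
spread = data.std()   # standard deviation (population form, dividing by N)

# The function form gives the same result as the method form.
mean_method = data.mean()
mean_function = numpy.mean(data)

print(largest, spread, mean_method, mean_function)
```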

Simple plotting

The matplotlib library allows you to generate many types of plots and statistical graphs in a convenient way. The online gallery shows the variety of plots available, and the documentation is also available online. We import the pyplot component of matplotlib and give it an alias plt. We also ask the IPython interface to show us plots inline, directly within this notebook.

In [15]:
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_formats=['svg']
In [16]:
X = numpy.random.uniform(20, 30, 10)
Y = numpy.random.uniform(50, 100, 10)
plt.scatter(X, Y);
In [17]:
x = numpy.linspace(-2, 10, 100)
plt.plot(x, numpy.sin(x));
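Plots are easier to read with axis labels and a title. The sketch below redraws the sine curve using matplotlib's object-oriented interface (plt.subplots returns a figure and an axes object); the label and title strings are illustrative choices.

```python
import numpy
import matplotlib.pyplot as plt

x = numpy.linspace(-2, 10, 100)

# fig is the whole figure; ax is the plotting area we draw on.
fig, ax = plt.subplots()
ax.plot(x, numpy.sin(x), label="sin(x)")

# Label the axes and add a title and legend.
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("Sine curve")
ax.legend()
```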
In [18]:
x = numpy.linspace(0, 10, 100)
obs = numpy.sin(x) + numpy.random.uniform(-0.1, 0.1, 100)
plt.plot(x, obs);
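The last cell above builds a noisy signal by adding uniform random noise in the range -0.1 to 0.1 to a sine curve. As a quick check on that construction, the sketch below rebuilds the signal and confirms that it never strays from the clean sine by more than 0.1; the names noise, residual and max_abs are introduced here for illustration, and the small tolerance allows for floating-point rounding.

```python
import numpy

x = numpy.linspace(0, 10, 100)
noise = numpy.random.uniform(-0.1, 0.1, 100)
obs = numpy.sin(x) + noise

# Subtracting the clean signal recovers the noise (up to rounding).
residual = obs - numpy.sin(x)
max_abs = numpy.abs(residual).max()

# The largest deviation should not exceed the noise amplitude of 0.1.
print(max_abs <= 0.1 + 1e-12)
```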