- Learn the difference between parametric and nonparametric statistics
- Be able to apply the following nonparametric hypothesis tests: Wilcoxon sum of ranks, Wilcoxon signed ranks, Spearman correlation test
- Understand how to apply parametric hypothesis tests with discrete values

In [1]:

```
import random
import numpy as np
import matplotlib.pyplot as plt
from math import sqrt, pi, erf
import scipy.stats as ss
```

**Parametric Statistics:** What we've seen before, where we do statistics by assuming the data follows some underlying probability distribution (like the normal distribution). Sometimes this is a good assumption because of the central limit theorem (CLT).

**Nonparametric Statistics:** We do statistics without assuming a functional form for the underlying probability distribution. It is typically harder to show significance here because, without an assumed distribution, we have less information to work with.

Nonparametric statistics are secret and not widely taught because people believe they are challenging to understand. This is true, but I don't think undergraduates completely understand probability measure spaces either, and that doesn't stop us from using them.

*From here onwards, most tests will not assume normality and are nonparametric. You won't find these tests in most traditional statistics textbooks*

To do nonparametric statistics, one of the underlying principles is converting measurements into rankings.

In [2]:

```
d = np.random.rand(10)
print(d)
print(ss.rankdata(d))
```
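Random floats almost never tie, so the cell above does not show what happens to repeated values. As a small sketch on made-up numbers (the array below is hypothetical), `scipy.stats.rankdata` with its default `method='average'` gives tied values the mean of the ranks they would otherwise occupy:

```python
import numpy as np
import scipy.stats as ss

# Hypothetical measurements containing a tie (the two 20.0 values)
values = np.array([30.0, 10.0, 20.0, 20.0])

# The default method='average' gives tied values the mean of the ranks they span
ranks = ss.rankdata(values)
print(ranks)  # [4.  1.  2.5 2.5]
```

The smallest value gets rank 1, and the two tied values share ranks 2 and 3, so each receives 2.5.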

**Data Type:** Ranks

**Compares:** Two sets of measurements.

**Null Hypothesis:** The two sets of measurements are from the same distribution

**Conditions:** Unmatched measurements. Unmatched means the measurements aren't in pairs and you don't necessarily have the same number in each set

**Related Test 1:** Wilcoxon's Signed Ranks Test for matched data measuring one thing (e.g., temperature)

**Related Test 2:** Spearman's Correlation Test for matched data measuring two things (e.g., temperature and pressure)

**Python:** `scipy.stats.ranksums`

**Hints:** Make sure all data is in the same units!
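To make the "sum of ranks" idea concrete, here is a sketch that computes the statistic by hand on synthetic data and compares it with `scipy.stats.ranksums`. The samples, sizes, and seed are assumptions chosen for illustration, and the normal approximation shown here has no tie correction:

```python
import numpy as np
import scipy.stats as ss

rng = np.random.default_rng(0)
# Synthetic, unmatched samples of different sizes (not the course data)
x = rng.normal(loc=0.0, scale=1.0, size=12)
y = rng.normal(loc=1.0, scale=1.0, size=15)

n1, n2 = len(x), len(y)
# Rank the pooled data, then sum the ranks that belong to x
ranks = ss.rankdata(np.concatenate((x, y)))
rank_sum = ranks[:n1].sum()

# Under the null hypothesis the rank sum is approximately normal
expected = n1 * (n1 + n2 + 1) / 2
std = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (rank_sum - expected) / std
p_manual = 2 * ss.norm.sf(abs(z))  # two-sided p-value

result = ss.ranksums(x, y)
print(z, p_manual)
print(result.statistic, result.pvalue)
```

Both lines should print the same statistic and p-value, which shows that the test only uses the ranks of the pooled data, never the raw values.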

In [3]:

```
import pandas as pd
data = pd.read_csv('grades.csv')
#get some info:
data.info()
```

I'm going to standardize the homework scores so that each is out of 100%. The first row contains the perfect score for each assignment.
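As a minimal sketch of this normalization on made-up data (the column names and scores below are hypothetical, not taken from `grades.csv`), dividing a DataFrame by its first row scales each column by its own maximum:

```python
import pandas as pd

# Made-up scores; the first row holds the maximum possible points
scores = pd.DataFrame({'HW1': [20, 18, 15], 'HW2': [50, 45, 30]})

# Dividing by the first row aligns on column labels, so each column
# is divided by its own perfect score
percent = scores / scores.iloc[0, :] * 100
print(percent)
```

Note that the reference row itself becomes 100 in every column and stays in the frame, so it still contributes to any later statistic unless it is dropped.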

In [4]:

```
data /= data.iloc[0,:]
data *= 100
data.mean(axis=0)
```

Out[4]:

In [5]:

```
plt.title('HW1')
plt.hist(data.HW1)
plt.show()
```

In [6]:

```
plt.title('HW2')
plt.hist(data.HW2)
plt.show()
```

In [7]:

```
ss.ranksums(data['HW1'], data['HW2'])
```

Out[7]:

The $p$-value is 0.70, so we cannot rule out the null hypothesis that they are from the same distribution. What about a more recent, more difficult homework?

In [8]:

```
ss.ranksums(data['HW1'], data['HW5'])
```

Out[8]:

So HW 1 and HW 5 were significantly different.

**Data Type:** Ranks

**Compares:** Two sets of measurements

**Null Hypothesis:** The two sets of measurements are from the same distribution

**Conditions:** Measurements are matched. Matched means the data comes in tuples/pairs. More than 6 samples, better to have more than 20.

**Related Test 1:** Wilcoxon's Sum of Ranks Test for unmatched data measuring one thing (e.g., temperature)

**Related Test 2:** Spearman's Correlation Test for matched data measuring two things (e.g., temperature and pressure)

**Python:** `scipy.stats.wilcoxon`

**Hints:** Make sure all data is in the same units!

Since the same people are doing the HW each week, a more accurate comparison would be to use the Signed Ranks Test.

In [9]:

```
ss.wilcoxon(data.HW1, data.HW2)
```

Out[9]:

In [10]:

```
ss.wilcoxon(data.HW1, data.HW5)
```

Out[10]:

Notice that the p-values are lower than those from the unmatched sum of ranks test, meaning that having paired data allows us to be more certain in our conclusions.
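The effect of pairing can be reproduced on synthetic data. In the sketch below, all numbers are assumptions for illustration: each subject has a very different baseline, but the second measurement is shifted by a small, consistent amount. The unmatched test has to see that shift through the large between-subject spread, while the matched test only looks at the within-pair differences:

```python
import numpy as np
import scipy.stats as ss

rng = np.random.default_rng(1)
n = 30
# Synthetic data: large spread between subjects, small consistent shift within each pair
baseline = rng.normal(loc=70.0, scale=15.0, size=n)
first = baseline + rng.normal(scale=1.0, size=n)
second = baseline + 3.0 + rng.normal(scale=1.0, size=n)

p_unmatched = ss.ranksums(first, second).pvalue
p_matched = ss.wilcoxon(first, second).pvalue
print('sum of ranks p-value:', p_unmatched)
print('signed ranks p-value:', p_matched)
```

The gain comes from the shared baseline within each pair. If the two columns were unrelated, pairing them would not be expected to help.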

**Data Type:** Ranks

**Compares:** Two sets of measurements

**Null Hypothesis:** The two sets of measurements are uncorrelated

**Conditions:** Measurements are matched. Matched means the data comes in tuples/pairs. The measurements are of different things

**Related Test 1:** Wilcoxon's Sum of Ranks Test for unmatched data measuring one thing (e.g., temperature)

**Related Test 2:** Wilcoxon's Signed Ranks Test for matched data measuring one thing (e.g., temperature)

**Python:** `scipy.stats.spearmanr`
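Spearman's coefficient is the ordinary (Pearson) correlation coefficient computed on the ranks instead of the raw values. A short sketch on synthetic data, where the exponential relationship is an assumption chosen to make the contrast visible, shows that a monotonic but nonlinear relationship gives a perfect rank correlation even though the linear correlation is lower:

```python
import numpy as np
import scipy.stats as ss

# Synthetic matched data: y increases monotonically but nonlinearly with x
x = np.linspace(0.0, 5.0, 20)
y = np.exp(x)

rho, p_value = ss.spearmanr(x, y)
# Spearman's coefficient is the Pearson correlation of the ranks
rho_from_ranks = np.corrcoef(ss.rankdata(x), ss.rankdata(y))[0, 1]
# Pearson correlation of the raw values, for comparison
pearson = np.corrcoef(x, y)[0, 1]
print(rho, rho_from_ranks, pearson)
```

Because only the ordering matters, any strictly increasing transformation of either variable leaves the Spearman coefficient unchanged.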

First, let's get each student's average homework grade. The spreadsheet has 6 homeworks.

In [11]:

```
# build a list of all the HW column names
index = []
for i in range(1, 7):
    index.append('HW{}'.format(i))
# select those columns and average across them for each row (student)
hw_means = data[index].mean(axis=1)
```

In [12]:

```
plt.plot(hw_means, data.Midterm, 'o')
plt.show()
```

In [13]:

```
ss.spearmanr(hw_means, data.Midterm)
```

Out[13]:

Remarkable!

In [14]:

```
np.corrcoef(hw_means, data.Midterm)
```

Out[14]:

**Data Type:** Count

**Compares:** Count vs a Poisson distributed population

**Null Hypothesis:** The number of observations (count) came from the known population

**Conditions:** Less than 40 samples (for computational simplicity)

**Related Test 1:** $zI$ test, for more than 40 samples

**Python:** Construct an interval and integrate using `scipy.stats.poisson.cdf(x, mu=...)`

**Hints**: Your interval should contain your value and all more extreme values. The interval should go up to infinity or down to 0, depending on whether your value is higher or lower than the expected value.

The number of hurricanes in 2005 was 15. The historic average is 6.3. Is this number significantly different?

We will construct an interval containing all values as extreme as ours. We don't consider a low number of hurricanes to be extreme in this example. *Remember that we want to include the value into this interval.*

First consider only saying that lots of hurricanes is out of the ordinary (not part of the null hypothesis).

$$ P = P(x \geq 15) = 1 - \sum_{x=0}^{14} P(x) $$

In [15]:

```
print('p-value is', (1 - ss.poisson.cdf(14, mu=6.3)))
```

So we reject the null hypothesis. This is a highly unusual number of hurricanes.
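As a cross-check, the same upper-tail probability can be computed in a few equivalent ways. This sketch also shows the lower-tail case from the hints, using a hypothetical low count that is not part of the hurricane example:

```python
import numpy as np
import scipy.stats as ss

mu = 6.3        # historic average from the text
observed = 15   # hurricanes in 2005

# Upper tail P(x >= 15), computed three equivalent ways
p_cdf = 1 - ss.poisson.cdf(observed - 1, mu=mu)
p_sf = ss.poisson.sf(observed - 1, mu=mu)    # sf(k) = P(x > k)
p_sum = 1 - ss.poisson.pmf(np.arange(observed), mu=mu).sum()
print(p_cdf, p_sf, p_sum)

# If the observed count were below the mean, the interval would run down to 0
low_count = 1   # hypothetical value, for illustration only
p_low = ss.poisson.cdf(low_count, mu=mu)     # P(x <= 1)
print(p_low)
```

The survival function `sf` avoids the subtraction from 1, which can matter numerically when the tail probability is very small.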