%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
Finding documentation for Python functions is easy in a notebook; just put a question mark in front of the name:
?pd.read_csv
Now we read in the population frequency data:
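# sep=r"\s+" splits on any run of whitespace; it replaces delim_whitespace=True,
# which is deprecated as of pandas 2.2
dat = pd.read_csv("population_frequencies.txt", sep=r"\s+", names=["nr", "pop"])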
and verify that it worked:
dat
 | nr | pop |
---|---|---|
0 | 9 | Abkhasian |
1 | 16 | Adygei |
2 | 6 | Albanian |
3 | 7 | Aleut |
4 | 4 | Aleut_Tlingit |
5 | 7 | Altaian |
6 | 10 | Ami |
7 | 10 | Armenian |
8 | 9 | Atayal |
9 | 10 | Balkar |
10 | 29 | Basque |
11 | 25 | BedouinA |
12 | 19 | BedouinB |
13 | 10 | Belarusian |
14 | 6 | BolshoyOleniOstrov |
15 | 9 | Borneo |
16 | 10 | Bulgarian |
17 | 8 | Cambodian |
18 | 2 | Canary_Islander |
19 | 2 | ChalmnyVarre |
20 | 9 | Chechen |
21 | 20 | Chukchi |
22 | 3 | Chukchi1 |
23 | 10 | Chuvash |
24 | 10 | Croatian |
25 | 8 | Cypriot |
26 | 10 | Czech |
27 | 10 | Dai |
28 | 9 | Daur |
29 | 4 | Dolgan |
... | ... | ... |
86 | 27 | Sardinian |
87 | 8 | Saudi |
88 | 4 | Scottish |
89 | 10 | Selkup |
90 | 10 | Semende |
91 | 10 | She |
92 | 2 | Sherpa.DG |
93 | 11 | Sicilian |
94 | 53 | Spanish |
95 | 5 | Spanish_North |
96 | 8 | Syrian |
97 | 8 | Tajik |
98 | 10 | Thai |
99 | 2 | Tibetan.DG |
100 | 10 | Tu |
101 | 22 | Tubalar |
102 | 10 | Tujia |
103 | 50 | Turkish |
104 | 7 | Turkmen |
105 | 10 | Tuvinian |
106 | 9 | Ukrainian |
107 | 25 | Ulchi |
108 | 10 | Uygur |
109 | 10 | Uzbek |
110 | 3 | WHG |
111 | 7 | Xibo |
112 | 20 | Yakut |
113 | 9 | Yamnaya_Samara |
114 | 10 | Yi |
115 | 19 | Yukagir |
116 rows × 2 columns
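Beyond eyeballing the table, a few quick checks can confirm that the file was parsed as intended. The sketch below is not part of the original notebook: it uses three rows copied from the output above as an in-memory stand-in for population_frequencies.txt (the name check is made up for illustration) and inspects the shape, the column types and the first rows.

```python
import io

import pandas as pd

# Three rows copied from the table above; a stand-in for the real file.
sample = io.StringIO("9 Abkhasian\n16 Adygei\n6 Albanian\n")

# Same call as for the real data: whitespace-separated, no header line.
check = pd.read_csv(sample, sep=r"\s+", names=["nr", "pop"])

print(check.shape)   # (number of rows, number of columns)
print(check.dtypes)  # "nr" should be an integer column, "pop" a string column
print(check.head())  # first rows only, handy for long tables
```

Note that the population column has to be accessed as check["pop"]; attribute access (check.pop) would return the DataFrame's pop method instead.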
OK, so let's proceed with simple plotting:
plt.plot(dat["nr"])
[<matplotlib.lines.Line2D at 0x7f8e0c1af198>]
Not bad, but the plot would be easier to read with the values sorted. For that we use the
sort_values method of the DataFrame:
?dat.sort_values
# sort the rows by sample count before plotting
dat_sorted = dat.sort_values(by="nr")
x = range(len(dat_sorted))
y = dat_sorted["nr"]
plt.plot(x, y)
[<matplotlib.lines.Line2D at 0x7f8e0bff15c0>]
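One detail of sort_values is worth seeing on a small example: the sorted rows keep their original index labels, which is presumably why the cell above passes an explicit x instead of relying on the index. The sketch below uses a three-row toy frame (demo, demo_sorted and demo_desc are illustrative names, not from the notebook) and also shows a descending sort with a fresh index.

```python
import pandas as pd

# Toy data: three rows taken from the table shown earlier.
demo = pd.DataFrame({"nr": [9, 16, 6],
                     "pop": ["Abkhasian", "Adygei", "Albanian"]})

# Ascending sort; the index labels travel with their rows.
demo_sorted = demo.sort_values(by="nr")
print(demo_sorted.index.tolist())

# Descending sort, then renumber the rows 0, 1, 2, ...
demo_desc = demo.sort_values(by="nr", ascending=False).reset_index(drop=True)
print(demo_desc)
```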
Now we just need to add tick labels and change the size of the plot:
dat_sorted = dat.sort_values(by="nr")
y = dat_sorted["nr"]
x = range(len(y))
xticks = dat_sorted["pop"]
plt.figure(figsize=(20,8))
plt.plot(x, y)
plt.xticks(x, xticks, rotation="vertical");
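The same plot can also be written with matplotlib's object-oriented interface, which makes it easier to add axis labels or save the figure later. This is only a sketch of an alternative, not what the notebook does: it runs on a three-row toy frame, and the axis label text is an assumption about what the nr column means.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy data standing in for the full table.
demo = pd.DataFrame({"nr": [9, 16, 6],
                     "pop": ["Abkhasian", "Adygei", "Albanian"]})
demo_sorted = demo.sort_values(by="nr")

y = demo_sorted["nr"].tolist()
labels = demo_sorted["pop"].tolist()
x = list(range(len(y)))

# One figure with one axes object, same size as in the cell above.
fig, ax = plt.subplots(figsize=(20, 8))
ax.plot(x, y)

# One tick per population, labelled with its name, rotated to avoid overlap.
ax.set_xticks(x)
ax.set_xticklabels(labels, rotation="vertical")
ax.set_ylabel("number of samples")  # assumed meaning of the "nr" column
```

Calling fig.savefig with a file name of your choice afterwards writes the plot to disk.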