1. Load data and preliminary checks¶

In [1]:

# load libraries
import os, sys
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

%matplotlib inline

Now import the data in various format for analysis. First check the folder for data. Commands starting with ! indicates they are system commmands.

In [2]:

!dir

 Volume in drive E is New Volume
 Volume Serial Number is A2BF-3D1C

 Directory of E:\Yichuan\Climate_vulnerability_wh\jupyter_workspace

11/07/2016  16:17    <DIR>          .
11/07/2016  16:17    <DIR>          ..
26/05/2016  10:00               854 .gitignore
26/05/2016  10:04    <DIR>          .ipynb_checkpoints
16/06/2016  15:06           286,727 agg_amp.csv
16/06/2016  15:06           456,560 agg_bird.csv
16/06/2016  15:06            68,377 agg_coral.csv
10/02/2016  11:06         8,536,618 amp.xlsx
10/02/2016  11:06        10,999,911 bird.xlsx
10/02/2016  11:06         4,465,794 coral.xlsx
26/05/2016  09:57            35,815 LICENSE
26/05/2016  10:03               290 README.md
25/05/2016  12:33            14,251 region.csv
11/07/2016  16:17             4,613 report.ipynb
10/05/2016  16:44        54,522,213 result_final.csv
03/03/2016  13:30        40,785,576 rl_attr.csv
15/12/2015  17:54       154,368,653 sis.csv
16/06/2016  15:06         2,036,150 wh_amp.csv
16/06/2016  15:06        22,931,649 wh_bird.csv
16/06/2016  15:06         3,343,584 wh_coral.csv
16/06/2016  17:51         2,431,705 workspace.ipynb
              18 File(s)    305,289,340 bytes
               3 Dir(s)  788,429,307,904 bytes free

In [3]:

# CCV analysis results
amp = pd.read_excel('amp.xlsx')
bird = pd.read_excel('bird.xlsx')
coral = pd.read_excel('coral.xlsx')

In [4]:

# check data for different structure
# the intersection result processed in arcmap, essentially: wdpaid, id_no, per_overlap, + sp attr and wh attr
# Essentially, POS invalid rows removed, duplicated dissolved and percentage overlap calculated
# for more information, consult Evernote notes
result = pd.read_csv('result_final.csv', encoding='latin1') 
sis = pd.read_csv('sis.csv', sep='|')
rl = pd.read_csv('rl_attr.csv')

In [5]:

# no duplicates for bird and amp
(amp.Fullname.size == amp.index.size) and (bird.Fullname.size == bird.index.size)

Out[5]:

True

In [6]:

# coral data has duplicates
print('unique coral names:', coral.Fullname.unique().size)
print('total rows:', coral.index.size)

unique coral names: 797
total rows: 1594

In [7]:

# get only the half
coral_unique = coral.groupby('Fullname').first().reset_index()
coral_unique.index.size

Out[7]:

In [8]:

result.per.hist(bins=50)

Out[8]:

<matplotlib.axes._subplots.AxesSubplot at 0x111137b8>

Note: areakm2_x is the actual overlap between pairs of PA and species, while areakm2_y refers to the size of WH site.

In [9]:

result.head(5)

Out[9]:

	Unnamed: 0	wdpaid	id_no	areakm2_x	areakm2_y	per	en_name	fr_name	status_yr	rep_area	...	phylum_name	class_name	order_name	family_name	genus_name	species_name	category	biome_marine	biome_freshwater	biome_terrestrial
0	0	191	2057.0	106497.703640	146962.025607	0.724661	Galápagos Islands	Îles Galápagos	1978	140665.14	...	CHORDATA	MAMMALIA	CARNIVORA	OTARIIDAE	Arctocephalus	galapagoensis	EN	t	f	t
1	1	191	2474.0	138918.354698	146962.025607	0.945267	Galápagos Islands	Îles Galápagos	1978	140665.14	...	CHORDATA	MAMMALIA	CETARTIODACTYLA	BALAENOPTERIDAE	Balaenoptera	acutorostrata	LC	t	f	f
2	2	2012	2474.0	2129.930772	5853.472599	0.363875	Everglades National Park	Parc national des Everglades\r\r\r\n	1979	5929.20	...	CHORDATA	MAMMALIA	CETARTIODACTYLA	BALAENOPTERIDAE	Balaenoptera	acutorostrata	LC	t	f	f
3	3	2018	2474.0	2549.486762	97284.250341	0.026207	Kluane / Wrangell-St Elias / Glacier Bay / Tat...	Kluane / Wrangell-St Elias / Glacier Bay / Tat...	1979	98391.21	...	CHORDATA	MAMMALIA	CETARTIODACTYLA	BALAENOPTERIDAE	Balaenoptera	acutorostrata	LC	t	f	f
4	4	2554	2474.0	4.105942	5502.386948	0.000746	Darien National Park	Parc national du Darien	1981	5970.00	...	CHORDATA	MAMMALIA	CETARTIODACTYLA	BALAENOPTERIDAE	Balaenoptera	acutorostrata	LC	t	f	f

5 rows × 26 columns

In [10]:

result.dtypes

Out[10]:

Unnamed: 0             int64
wdpaid                 int64
id_no                float64
areakm2_x            float64
areakm2_y            float64
per                  float64
en_name               object
fr_name               object
status_yr              int64
rep_area             float64
gis_area             float64
country               object
crit                  object
areakm2              float64
binomial              object
kingdom_name          object
phylum_name           object
class_name            object
order_name            object
family_name           object
genus_name            object
species_name          object
category              object
biome_marine          object
biome_freshwater      object
biome_terrestrial     object
dtype: object

2. Data preprocessing and exploration¶

The climate vulnerability analysis was done back using an older version of Red List 2009 (methodology) and the current species range distribution data is the latest 2015-4 version. Therefore it is likely the taxanomies may change. In some cases, changes may be simple for example renaming genus name or other higher taxanomy, while others may be extremely convoluted, involving spliting of one previously recognised species into multiple ones or vice versa merging previously different species. It is very difficult to reconcile such differences.

Here, I explore such mismatches between birds, amphibians and warm water reef-building corals.

In [11]:

print('Total birds in SIS,', (sis['class'] == 'AVES').sum())
# birds SIS
sis_bird = sis[sis['class']=='AVES']

Total birds in SIS, 10424

In [12]:

print('unique birds in CCV analysis:', bird.Fullname.unique().size)
print('unique birds in SIS:', sis_bird.friendly_name.unique().size)

unique birds in CCV analysis: 9856
unique birds in SIS: 10424

Differences between bird numbers, and similarly other taxa groups

In [13]:

sis_amp = sis[sis['class']=='AMPHIBIA']

print('Birds in CCV but not in SIS:', np.setdiff1d(bird.Fullname.unique(), sis_bird.friendly_name.unique()).size)
print('Total birds in CCV:', bird.index.size)
print('Amphibians in CCV but not in SIS:', np.setdiff1d(amp.Fullname.unique(), sis_amp.friendly_name.unique()).size)
print('Total amphibians in CCV:', amp.index.size)
print('Corals in CCV but not in SIS:', np.setdiff1d(coral_unique.Fullname.unique(), sis.friendly_name.unique()).size)
print('Total corals in CCV:', coral_unique.index.size)

Birds in CCV but not in SIS: 673
Total birds in CCV: 9856
Amphibians in CCV but not in SIS: 502
Total amphibians in CCV: 6204
Corals in CCV but not in SIS: 19
Total corals in CCV: 797

Given the relatively low number and proportion of such missing species due to taxonomic changes, it is unlikely the result may change significantly. For the purpose of deriving a globally consistent picture, we simply exclude these species.

Note the datatype in the result table is somehow different (int vs float) and it will impact comparisons. This needs to be explicited addressed, by rounding.

In [14]:

result.ix[176646].id_no

Out[14]:

61623293.999999993

result.id_no.astype('int64') will not work as it simply gets rid of its fraction. Instead round(0) method should be used

In [15]:

result['id_no_int'] = result.id_no.round(0)

The effect can be easily illustrated. The difference has reduced from 1581 to 36.

In [16]:

(result.id_no.isin(sis.taxonid)).sum()

Out[16]:

In [17]:

dif_spatial = np.setdiff1d(result.id_no, sis.taxonid)
dif_spatial.size

Out[17]:

In [18]:

dif_spatial = np.setdiff1d(result.id_no_int, sis.taxonid)
dif_spatial.size

Out[18]:

Let's take a look at the differece, i.e., IDS in the intersection result but not in the SIS.

In [19]:

dif_sis = result[result.id_no.isin(dif_spatial)][['id_no', 'class_name', 'binomial']]

In [20]:

dif_sis['class_name'].unique()

Out[20]:

array(['ACTINOPTERYGII', 'MAGNOLIOPSIDA', 'LILIOPSIDA', 'GASTROPODA',
       'MAMMALIA', 'SARCOPTERYGII', 'INSECTA', 'BIVALVIA'], dtype=object)

Now it is clear that the odd missing species should not impact our analysis, which only looks at birds, amphibians and corals. In other words, the IDs in the result table having corresponding SIS entries. Given that the result table contains already all SIS information, there is no need to join the SIS table. What needs to be address now is to compare the result table directly with the climate vulnerability analysis.

In [21]:

dif_amp_result = np.setdiff1d(amp.Fullname, result.binomial)
dif_amp_sis = np.setdiff1d(amp.Fullname, sis.friendly_name)
dif_amp_result.size, dif_amp_sis.size

Out[21]:

(4174, 502)

In [22]:

# get amp_result using CCV, SIS and partialoverlap_result
amp_result = pd.merge(amp, sis_amp, left_on = 'Fullname', right_on='friendly_name')
amp_result = pd.merge(amp_result, result, left_on='taxonid', right_on='id_no_int')

In [23]:

amp_result.index

Out[23]:

Int64Index([   0,    1,    2,    3,    4,    5,    6,    7,    8,    9,
            ...
            4746, 4747, 4748, 4749, 4750, 4751, 4752, 4753, 4754, 4755],
           dtype='int64', length=4756)

In [24]:

amp.index

Out[24]:

RangeIndex(start=0, stop=6204, step=1)

In [25]:

# what happens if a join is made directly without going through SIS and ID_NO
test_amp = pd.merge(amp, result, left_on='Fullname', right_on='binomial')

In [26]:

test_amp.index

Out[26]:

Int64Index([   0,    1,    2,    3,    4,    5,    6,    7,    8,    9,
            ...
            4762, 4763, 4764, 4765, 4766, 4767, 4768, 4769, 4770, 4771],
           dtype='int64', length=4772)

In [27]:

# the difference
test_dif = np.setdiff1d(test_amp.id_no_int.unique(), amp_result.id_no_int.unique())
test_dif.size

Out[27]:

There might be something peculiar happening - as in theory the number should match. The fact there are 11 mismatching names suggest there is a possibility that the name in SIS still mismatches. See below the missing species using the CCV-SIS join.

In [28]:

test_amp[test_amp.id_no_int.isin(test_dif)]

Out[28]:

	SVP_ID	SIS_GAA_ID	GAA Family	Genus	Species	Fullname	Threatened	SUSC_A_Habitats	SUSC_A_aquatic larvae	SUSC_B_Temperature Range	...	class_name	order_name	family_name	genus_name	species_name	category	biome_marine	biome_freshwater	biome_terrestrial	id_no_int
398	3260	9380	MICROHYLIDAE	Anodonthyla	boulengerii	Anodonthyla boulengerii	NaN	L	L	L	...	AMPHIBIA	ANURA	MICROHYLIDAE	Anodonthyla	boulengeri	LC	f	f	t	57674.0
797	2771	32142	MICROHYLIDAE	Chiasmocleis	panamensis	Chiasmocleis panamensis	NaN	L	H	H	...	AMPHIBIA	ANURA	MICROHYLIDAE	Elachistocleis	panamensis	LC	f	t	t	57761.0
798	2771	32142	MICROHYLIDAE	Chiasmocleis	panamensis	Chiasmocleis panamensis	NaN	L	H	H	...	AMPHIBIA	ANURA	MICROHYLIDAE	Elachistocleis	panamensis	LC	f	t	t	57761.0
978	4258	51160	SALAMANDRIDAE	Cynops	cyanurus	Cynops cyanurus	NaN	L	L	L	...	AMPHIBIA	CAUDATA	SALAMANDRIDAE	Hypselotriton	cyanurus	LC	f	t	t	59440.0
979	4258	51160	SALAMANDRIDAE	Cynops	cyanurus	Cynops cyanurus	NaN	L	L	L	...	AMPHIBIA	CAUDATA	SALAMANDRIDAE	Hypselotriton	cyanurus	LC	f	t	t	59440.0
980	3529	51161	SALAMANDRIDAE	Cynops	orientalis	Cynops orientalis	NaN	L	L	L	...	AMPHIBIA	CAUDATA	SALAMANDRIDAE	Hypselotriton	orientalis	LC	f	t	t	59442.0
981	3529	51161	SALAMANDRIDAE	Cynops	orientalis	Cynops orientalis	NaN	L	L	L	...	AMPHIBIA	CAUDATA	SALAMANDRIDAE	Hypselotriton	orientalis	LC	f	t	t	59442.0
982	3529	51161	SALAMANDRIDAE	Cynops	orientalis	Cynops orientalis	NaN	L	L	L	...	AMPHIBIA	CAUDATA	SALAMANDRIDAE	Hypselotriton	orientalis	LC	f	t	t	59442.0
983	3529	51161	SALAMANDRIDAE	Cynops	orientalis	Cynops orientalis	NaN	L	L	L	...	AMPHIBIA	CAUDATA	SALAMANDRIDAE	Hypselotriton	orientalis	LC	f	t	t	59442.0
1306	2789	20162	MICROHYLIDAE	Gastrophryne	pictiventris	Gastrophryne pictiventris	NaN	L	L	H	...	AMPHIBIA	ANURA	MICROHYLIDAE	Hypopachus	pictiventris	LC	f	t	t	57816.0
1307	2790	19796	MICROHYLIDAE	Gastrophryne	usta	Gastrophryne usta	NaN	L	L	L	...	AMPHIBIA	ANURA	MICROHYLIDAE	Hypopachus	ustus	LC	f	t	t	57817.0
1393	3897	100217	HELEOPHRYNIDAE	Heleophryne	natalensis	Heleophryne natalensis	NaN	L	L	L	...	AMPHIBIA	ANURA	HELEOPHRYNIDAE	Hadromophryne	natalensis	LC	f	t	t	55273.0
1751	2792	29294	MICROHYLIDAE	Hyophryne	histrio	Hyophryne histrio	NaN	L	L	H	...	AMPHIBIA	ANURA	MICROHYLIDAE	Stereocyclops	histrio	DD	f	t	t	10634.0
2093	2137	29203	BRACHYCEPHALIDAE	Ischnocnema	paulodutrai	Ischnocnema paulodutrai	NaN	H	L	L	...	AMPHIBIA	ANURA	CRAUGASTORIDAE	Pristimantis	paulodutrai	LC	f	f	t	56835.0
4522	5278	9562	MICROHYLIDAE	Stumpffia	grandis	Stumpffia grandis	NaN	L	L	L	...	AMPHIBIA	ANURA	MICROHYLIDAE	Rhombophryne	grandis	DD	f	f	t	58009.0
4523	5282	9566	MICROHYLIDAE	Stumpffia	tridactyla	Stumpffia tridactyla	NaN	L	L	L	...	AMPHIBIA	ANURA	MICROHYLIDAE	Rhombophryne	tridactyla	DD	f	f	t	58015.0

16 rows × 58 columns

As a result, use direct join between RL intersection and CCV.

In [29]:

## remove test variables 
del amp_result, test_amp

3. Analysis¶

There are several tasks outstanding:

first of all, a reasonable and defendable cut-off percentage overlap value needs to be calculated as to determine what constitutes species found within World Heritage sites.
once the value is decided, the wdpaid, id_no look up table is then filtered, and it can be further refined by variables such as IUCN Red List category, CCV traits etc.
because the first question needs answering before further steps, and it may change. It is advisable to create a function to facilitate such process. i.e. a function such as:

def f(input_result_table, cut-off-value):
    new_table= input_result_table[input_result_table.per >= cut-off-value]
    # function to produces outputs based on the result table
    result = process_result(new_table)
    return result

3.1 The cut-off value¶

The EOO (extent of occurence) nature of the Red List species distribution polygon, an overlap between a species and a WH site does not necessarily suggest that a species is present. This is made worse if only part of the distribution polygon intersects with the WH site - this could be a digitisation error or simply inaccuracy due to a scale mismatch. Such 'species within WH sites' as a result must be removed. Quantiles of can be a useful way to look at.

In [30]:

# what happens if a join is made directly without going through SIS and ID_NO
result_amp = pd.merge(amp, result, left_on='Fullname', right_on='binomial')
result_bird = pd.merge(bird, result, left_on='Fullname', right_on='binomial')
result_coral = pd.merge(coral_unique, result, left_on='Fullname', right_on='binomial')

In [31]:

all_per_abc = pd.concat([result_amp.per, result_bird.per, result_coral.per])
all_per_abc.hist(bins=50)

Out[31]:

<matplotlib.axes._subplots.AxesSubplot at 0xc051dd8>

The large majority of overlaps are fairely big, suggesting sites WH are generally entirely covered by species. This is not surprising as most species distributions polygons are a few magnitudes bigger than WH sites. Yet, it is possible that range-restricted and/or threatened species may have much smaller range.

In [32]:

all_per_abc.quantile(0.5)

Out[32]:

0.9582745841568836

For illustration purposes, note the difference of choosing a cut-off value between 10% and 15% is not signicant (~2%, or ~2000 pa-species pair less). This should have minimal effects on the result.

In [33]:

(all_per_abc>0.15).sum(), (all_per_abc>0.1).sum(), (all_per_abc>0.05).sum()

Out[33]:

(56508, 58029, 60301)

In [34]:

((all_per_abc>0.1).sum() - (all_per_abc>0.15).sum())/all_per_abc.index.size

Out[34]:

0.022358440642088553

However, if we look at amphibians, birds and mammals separately, it is easily noticeable that corals have significant number of low percentage overlaps. This is because marine World Heritage sites tend to be rather extensive and huge compared to coral distributions which are coastal. Thus it is important to examine the absolute overlap as well for corals.

In [35]:

result_amp.per.hist(bins=50)

Out[35]:

<matplotlib.axes._subplots.AxesSubplot at 0xc051f28>

In [36]:

result_bird.per.hist(bins=50)

Out[36]:

<matplotlib.axes._subplots.AxesSubplot at 0xb7b1d68>

In [37]:

result_coral.per.hist(bins=50)

Out[37]:

<matplotlib.axes._subplots.AxesSubplot at 0xd7be4a8>

In [38]:

result_amp.areakm2_x.hist(bins =50, log=True)

Out[38]:

<matplotlib.axes._subplots.AxesSubplot at 0x10ca4f60>

In [39]:

result_bird.areakm2_x.hist(bins =50, log=True)

Out[39]:

<matplotlib.axes._subplots.AxesSubplot at 0x1d1343c8>

In [40]:

result_coral.areakm2_x.hist(bins =50, log=True)

Out[40]:

<matplotlib.axes._subplots.AxesSubplot at 0x1d267f60>

The histograms show all three taxa have a bimodal distribution, with significant number of values near two poles (very large overlaps or very small ones)

In [41]:

result_coral.sort_values('per', ascending=True)[['en_name', 'binomial', 'areakm2_x', 'per']].head(20)

Out[41]:

	en_name	binomial	areakm2_x	per
5363	Darien National Park	Millepora intricata	0.033701	0.000006
7471	Tropical Rainforest Heritage of Sumatra	Podabacia motuporensis	0.874951	0.000034
2550	Tropical Rainforest Heritage of Sumatra	Ctenactis albitentaculata	0.999503	0.000038
169	Tropical Rainforest Heritage of Sumatra	Acropora abrolhosensis	0.999503	0.000038
4692	Henderson Island	Leptastrea pruinosa	0.002691	0.000065
430	Henderson Island	Acropora cytherea	0.002691	0.000065
777	Henderson Island	Acropora hyacinthus	0.002691	0.000065
1577	Henderson Island	Acropora subulata	0.002691	0.000065
3214	Henderson Island	Favia matthaii	0.002691	0.000065
755	Henderson Island	Acropora humilis	0.002691	0.000065
6285	Henderson Island	Montipora venosa	0.002691	0.000065
7714	Henderson Island	Porites lobata	0.002691	0.000065
281	Henderson Island	Acropora austera	0.002691	0.000065
1347	Henderson Island	Acropora retusa	0.002691	0.000065
2389	Henderson Island	Caulastrea furcata	0.002691	0.000065
1362	Henderson Island	Acropora robusta	0.002691	0.000065
5587	Henderson Island	Montipora australiensis	0.002691	0.000065
4715	Henderson Island	Leptastrea purpurea	0.002691	0.000065
5133	Henderson Island	Lobophyllia hemprichii	0.002691	0.000065
7743	Henderson Island	Porites lutea	0.002691	0.000065

In [42]:

result_coral[['en_name', 'binomial', 'areakm2_x', 'per']][result_coral.en_name.isin(['Henderson Island'])].head()

Out[42]:

	en_name	binomial	areakm2_x	per
195	Henderson Island	Acropora aculeus	0.002691	0.000065
281	Henderson Island	Acropora austera	0.002691	0.000065
350	Henderson Island	Acropora cerealis	0.002691	0.000065
430	Henderson Island	Acropora cytherea	0.002691	0.000065
480	Henderson Island	Acropora digitifera	0.002691	0.000065

The threshold value can be reasonably justified by using both the perentage overlap and actual overlap. It is obvious if the entire WH site is covered by a species, then it should be counted. If the percentage overlap is too small, it could be either due to an inaccurate boundary, in which case, the species should not be counted; or a genunie overlap if the WH is considerably larger. Therefore, by adding an additional test to the absolute overlap value in $km^2$, ommissions due to small percentage overlap can be reduced.

Let's test the effect of using various emperical values of per and abskm2 to remove artifical overlaps caused by the inaccuracy of boundaries. This represents an optimistic estimate of the number of species in WH (reinforced by the nature of Red List EOO)

In [43]:

def test_params(per=None, abkm2=None):
    if per is None and abkm2 is None:
        result = None
    if per and abkm2: # both conditions are applied
        result = [(result_taxa[(result_taxa.per>per)|(result_taxa.areakm2_x>abkm2)].index.size, result_taxa.index.size) for \
 result_taxa in [result_amp, result_bird, result_coral]]
    elif per and abkm2 is None: # only per
        result = [(result_taxa[(result_taxa.per>per)].index.size, result_taxa.index.size) for \
 result_taxa in [result_amp, result_bird, result_coral]]
    elif per is None and abkm2:
        result = [(result_taxa[(result_taxa.areakm2_x>abkm2)].index.size, result_taxa.index.size) for \
 result_taxa in [result_amp, result_bird, result_coral]]
    else:
        return None
        
    print(per, abkm2, result)

# # check the number of rows after applying criteria
# for each in zip(['after', 'before'], *test_params(0.15, 1)):
#     print(each)

# per, abkm2, (after and before) for amp, bird and coral
print('---only per---')
test_params(0.05)
test_params(0.15)
test_params(0.25)
test_params(0.50)
print('---only abs km2---')
test_params(abkm2=1)
test_params(abkm2=5)
test_params(abkm2=10)
print('--- per or abs km2---')
test_params(0.15, 1)
test_params(0.15, 2)
test_params(0.25, 1)
test_params(0.25, 2)
test_params(0.25, 5)

---only per---
0.05 None [(4140, 4772), (50190, 54446), (5971, 8810)]
0.15 None [(3751, 4772), (47361, 54446), (5396, 8810)]
0.25 None [(3504, 4772), (45570, 54446), (5293, 8810)]
0.5 None [(3047, 4772), (41125, 54446), (3795, 8810)]
---only abs km2---
None 1 [(4689, 4772), (53712, 54446), (8445, 8810)]
None 5 [(4575, 4772), (52187, 54446), (7628, 8810)]
None 10 [(4523, 4772), (51560, 54446), (7326, 8810)]
--- per or abs km2---
0.15 1 [(4731, 4772), (54144, 54446), (8445, 8810)]
0.15 2 [(4711, 4772), (53852, 54446), (7694, 8810)]
0.25 1 [(4731, 4772), (54139, 54446), (8445, 8810)]
0.25 2 [(4701, 4772), (53704, 54446), (7694, 8810)]
0.25 5 [(4661, 4772), (53298, 54446), (7628, 8810)]

The result corroborates the distributions that percentage overlaps are sensitive and their change has a considerable impact on the number of filtered rows, while s small increase in absolute area in $km^2$ seems to have little effect. It is observed that an increase of per from 15% to 25% poses little change.

The threshold values chosen here may present a rather insigificant factor compared to the assumption made by the EOO intersection, i.e., even though a species overlaps 100% with a given WH, it may still be absent. However this approach remains a popular method in estimating global species conservation and offer a consistent view for comprehensively assessed species, in the absence of true area of occupancy (AOO) data.

I use 15% and 1km^2 as cut-off values

3.2 Climate vulnerability species inside WH sites¶

To be policy relavant at a site level, it's imperative that climate vulnerability information on individual World Heritage sites be obtained. This analysis illustrates how many amphibians, birds and coral species are affected by climate change in each World Heritage site, by using both the total number of climate vulnerable species and their proportion. Further aggretations can be used to reveal each component within sensitivity, low adaptability and exposure.

In [44]:

# get filtered result
result_amp_f, result_bird_f, result_coral_f = [result_taxa[(result_taxa.per>.15)|(result_taxa.areakm2_x>1)] for \
 result_taxa in [result_amp, result_bird, result_coral]]

# get species numbers
print('Filtered', 'Total-intersect', 'Total')
print(result_amp_f.Fullname.unique().size, result_amp.Fullname.unique().size, amp.Fullname.unique().size)
print(result_bird_f.Fullname.unique().size, result_bird.Fullname.unique().size, bird.Fullname.unique().size)
print(result_coral_f.Fullname.unique().size, result_coral.Fullname.unique().size, coral.Fullname.unique().size)

Filtered Total-intersect Total
2013 2030 6204
6914 6924 9856
724 727 797

Using a lambda function as an argument to get

total number of high vulnerability species per WH site
total number of species per WH site
percentage of the high vulnerability species per WH site

In [45]:

## agg_dict passed to do multiple aggregation at the same time
## x is a group of FINAL_SCOREs
agg_dict = {'FINAL_SCORE':\
            {'H_vul_per': lambda x: (x=='H').sum()/x.size, \
             'total_H_vul': lambda x: (x=='H').sum(),\
             'total_vul':len}}

## amp
amp_v = result_amp_f.groupby(['wdpaid', 'en_name']).agg(agg_dict).reset_index()
amp_v.columns = amp_v.columns.droplevel()

# birds
bird_v = result_bird_f.groupby(['wdpaid', 'en_name']).agg(agg_dict).reset_index()
bird_v.columns = bird_v.columns.droplevel()

# corals
coral_v = result_coral_f.groupby(['wdpaid', 'en_name']).agg(agg_dict).reset_index()
coral_v.columns = coral_v.columns.droplevel()

In [46]:

bird_v.head()

Out[46]:

			H_vul_per	total_vul	total_H_vul
0	191	Galápagos Islands	0.129412	85	11
1	197	Tikal National Park	0.179661	295	53
2	2004	Dinosaur Provincial Park	0.126866	134	17
3	2005	Nahanni National Park	0.141667	120	17
4	2006	Simien National Park	0.096886	289	28

It is necessary to add the missing labels for the two columns

In [47]:

# rename column names. One cannot do: amp_v.columns[1] = newcolumn_name
new_columns = amp_v.columns.values
new_columns[:2] = ['wdpaid', 'en_name']

amp_v.columns = new_columns
bird_v.columns = new_columns
coral_v.columns = new_columns

Graphs to show both the percentage (the number of species with FINAL SCORE =='H' against all species), and the total number of species that are highly vulnerable.

In [48]:

# percentage of H distribution
amp_v.H_vul_per.hist(bins=50)

Out[48]:

<matplotlib.axes._subplots.AxesSubplot at 0x1d0b0be0>

In [49]:

# total number of H distribution
amp_v.total_H_vul.hist(bins=50)

Out[49]:

<matplotlib.axes._subplots.AxesSubplot at 0x1e1715f8>

In [50]:

# percentage of H distribution
bird_v.H_vul_per.hist(bins=50)

Out[50]:

<matplotlib.axes._subplots.AxesSubplot at 0x10d58b38>

In [51]:

# total number of H distribution
bird_v.total_H_vul.hist(bins=50)

Out[51]:

<matplotlib.axes._subplots.AxesSubplot at 0x1ee95c50>

In [52]:

# percentage of H distribution
coral_v.H_vul_per.hist(bins=50)

Out[52]:

<matplotlib.axes._subplots.AxesSubplot at 0x1eec0cf8>

In [53]:

# total number of H distribution
coral_v.total_H_vul.hist(bins=50)

Out[53]:

<matplotlib.axes._subplots.AxesSubplot at 0x1dc93898>

In [54]:

print('Top WHs with high number of CV birds\n', bird_v.sort_values('total_H_vul', ascending=False).head(10))
print('==================')
print('Top WHs with high percentage of CV birds\n', bird_v.sort_values('H_vul_per', ascending=False).head(10))

Top WHs with high number of CV birds
      wdpaid                                            en_name  H_vul_per  \
92    61612                              Canaima National Park   0.366987   
72    17760                                 Manú National Park   0.266350   
149  220296                Central Amazon Conservation Complex   0.370307   
151  220298                    Central Suriname Nature Reserve   0.411157   
41     9614                               Sangay National Park   0.268730   
148  220295                  Noel Kempff Mercado National Park   0.227979   
52    10903  Talamanca Range-La Amistad Reserves / La Amist...   0.225750   
15     2554                               Darien National Park   0.240310   
90    61610                           Los Katíos National Park   0.241228   
82    26651                           Río Abiseo National Park   0.211240   

     total_vul  total_H_vul  
92         624          229  
72         841          224  
149        586          217  
151        484          199  
41         614          165  
148        579          132  
52         567          128  
15         516          124  
90         456          110  
82         516          109  
==================
Top WHs with high percentage of CV birds
         wdpaid                                            en_name  H_vul_per  \
120     145576                         Heard and McDonald Islands   0.512195   
177     902373                                 Ilulissat Icefjord   0.500000   
204  555512003                                   Putorana Plateau   0.472222   
198     903139                                            Surtsey   0.431373   
128     168239                  New Zealand Sub-Antarctic Islands   0.423077   
151     220298                    Central Suriname Nature Reserve   0.411157   
61       12207                Giant's Causeway and Causeway Coast   0.400000   
118     124388                                      Laponian Area   0.396947   
184     902489  West Norwegian Fjords  Geirangerfjord and Nær...   0.395522   
224  555577556                                       Stevns Klint   0.394161   

     total_vul  total_H_vul  
120         41           21  
177         36           18  
204         72           34  
198         51           22  
128         78           33  
151        484          199  
61         100           40  
118        131           52  
184        134           53  
224        137           54

In [55]:

print('Top WHs with high number of CV amphibians\n', amp_v.sort_values('total_H_vul', ascending=False).head(10))
print('==================')
print('Top WHs with high percentage of CV amphibian\n', amp_v.sort_values('H_vul_per', ascending=False).head(10))

Top WHs with high number of CV amphibians
      wdpaid                                            en_name  H_vul_per  \
49    10903  Talamanca Range-La Amistad Reserves / La Amist...   0.325926   
133  220298                    Central Suriname Nature Reserve   0.445783   
67    17760                                 Manú National Park   0.342593   
83    61612                              Canaima National Park   0.360465   
131  220296                Central Amazon Conservation Complex   0.318681   
152  902347                 Cape Floral Region Protected Areas   0.425532   
162  903062                      Rainforests of the Atsinanana   0.120253   
127  220292                                      Kinabalu Park   0.260870   
128  220293                          Gunung Mulu National Park   0.293103   
120  198296                    Area de Conservación Guanacaste   0.326923   

     total_vul  total_H_vul  
49         135           44  
133         83           37  
67         108           37  
83          86           31  
131         91           29  
152         47           20  
162        158           19  
127         69           18  
128         58           17  
120         52           17  
==================
Top WHs with high percentage of CV amphibian
         wdpaid                             en_name  H_vul_per  total_vul  \
153     902367              Pitons Management Area   1.000000          1   
181  555547992        Rock Islands Southern Lagoon   1.000000          1   
173  555512003                    Putorana Plateau   1.000000          1   
106     124387              Volcanoes of Kamchatka   1.000000          1   
110     145583    Morne Trois Pitons National Park   0.666667          3   
180  555547991            Lena Pillars Nature Park   0.500000          2   
148     900880                      Uvs Nuur Basin   0.500000          2   
96       68918      Whale Sanctuary of El Vizcaino   0.500000          2   
133     220298     Central Suriname Nature Reserve   0.445783         83   
152     902347  Cape Floral Region Protected Areas   0.425532         47   

     total_H_vul  
153            1  
181            1  
173            1  
106            1  
110            2  
180            1  
148            1  
96             1  
133           37  
152           20

In [56]:

amp_v.quantile(0.9)

Out[56]:

wdpaid         5.555120e+08
H_vul_per      3.320513e-01
total_vul      5.800000e+01
total_H_vul    4.800000e+00
dtype: float64

In [57]:

amp_v.quantile(0.1)

Out[57]:

wdpaid         2574.2
H_vul_per         0.0
total_vul         3.0
total_H_vul       0.0
dtype: float64

In [58]:

bird_v.quantile(0.9)

Out[58]:

wdpaid         5.555120e+08
H_vul_per      3.503704e-01
total_vul      4.516000e+02
total_H_vul    6.720000e+01
dtype: float64

In [59]:

bird_v.quantile(0.1)

Out[59]:

wdpaid         2577.800000
H_vul_per         0.058983
total_vul        56.600000
total_H_vul       9.000000
dtype: float64

For communications purposes, it may be convenient to have a single indicator/index to indicate whether a given WH site is considered climate vulnerable for some taxon (in this case amphibians). The questions then becomes how such a cut-off value can be reasonable calculated and can be meaningfully applied. One way is to, again, use quantile cut-off values for total number of highly vulnerable species and percentage overlap. However, any ranking created using quantile only compares within WH sites. Ideally, this should be derived using a global metric. Perhaps this is an important caveat. One needs to know about other taxa before making a decision as they should be consistent.

Questions:

make sense to rank? 10% sounds a good cut-off value?

Top10, top 50, top200

WH sites with high proportion seem to have a low total number of CV amphibians. Thus using proportion alone may not paint a full picture. It is conceivable that a combination of both number and propertion is needed to classify/rank WH sites.

3.4 Climate vulnerability species inside WH sites, new methods¶

While the above methods work well, similar stats for other columns need considerable manual coding. In this study, depending on the taxon of interest, there may be as high as 20 replications (a total of ~60 if all three taxa are conisidered). Therefore it is necessary to find a way through which such aggregations could be calculated simply and more scalable.

In [60]:

# get all fields and pick those we need for the stats
print(result_amp_f.columns)
print(result_bird_f.columns)
print(result_coral_f.columns)

Index(['SVP_ID', 'SIS_GAA_ID', 'GAA Family', 'Genus', 'Species', 'Fullname',
       'Threatened', 'SUSC_A_Habitats', 'SUSC_A_aquatic larvae',
       'SUSC_B_Temperature Range', 'SUSC_B_Precipitation Range',
       'SUSC_C_explosive breeder', 'SUSC_D_disease', 'SENSITIVITY',
       'ADAPT_A_barriers', 'ADAPT_A_dispersal_distance',
       'ADAPT_C_Slow_Gen_Turnpover', 'LOW_ADAPTABILITY', 'EXP_Sea Level',
       'EXP_MeanTemperature', 'EXP_MeanRainfall', 'EXP_AADTemperature',
       'EXP_AADRainfall', 'EXPOSURE', 'FINAL_SCORE', 'Final_Pessimistic',
       'Score_HorS', 'SensXLAdaptability', 'SensXExposure', 'LAdaptXExposure',
       'Threatened_and_Vulnerable', 'Unnamed: 0', 'wdpaid', 'id_no',
       'areakm2_x', 'areakm2_y', 'per', 'en_name', 'fr_name', 'status_yr',
       'rep_area', 'gis_area', 'country', 'crit', 'areakm2', 'binomial',
       'kingdom_name', 'phylum_name', 'class_name', 'order_name',
       'family_name', 'genus_name', 'species_name', 'category', 'biome_marine',
       'biome_freshwater', 'biome_terrestrial', 'id_no_int'],
      dtype='object')
Index(['SVP_ID', 'Fullname', 'Common name', 'New seq', 'SPcRecID',
       '2008 RDL Cat', 'Threatened', 'FINAL_SCORE', '__hab_specialisation',
       '__microhabitat', '__ForestDependence', '__TemperatureRange',
       '__PrecipRange', '__Species dependence', '__PopnSize',
       '__EffectivePopnSize', 'SENSITIVITY', '__Dispersal distance limited',
       '__Dispersal barriers', '__Low genetic diversity', '__clutch size',
       '__Gen_Length', 'LOW_ADAPTABILITY', '__EXP_Sea Inundation',
       '__EXP_MeanTemperature', '__EXP_AADTemperature', '__EXP_MeanPrecip',
       '__EXP_AADPrecip', 'EXPOSURE', 'Unnamed: 0', 'wdpaid', 'id_no',
       'areakm2_x', 'areakm2_y', 'per', 'en_name', 'fr_name', 'status_yr',
       'rep_area', 'gis_area', 'country', 'crit', 'areakm2', 'binomial',
       'kingdom_name', 'phylum_name', 'class_name', 'order_name',
       'family_name', 'genus_name', 'species_name', 'category', 'biome_marine',
       'biome_freshwater', 'biome_terrestrial', 'id_no_int'],
      dtype='object')
Index(['Fullname', 'SVProject_ID', 'genus', 'species', 'Threatened',
       'FINAL_SCORE', '_Few habitats', '_narrow depth rng',
       '_Larval vulnerabilityH', '_ShallowH', '_PastMortalityH', '_zooxH',
       '_Rare', 'SENSITIVITY', '_Short dispersal distance',
       '_dispersal barriers', '_ColonyMaxAge', '_SlowGrowthRate',
       'LOW_ADAPTABILITY', '_ProportionArag<3', '_MeanDHM', 'EXPOSURE',
       'Unnamed: 0', 'wdpaid', 'id_no', 'areakm2_x', 'areakm2_y', 'per',
       'en_name', 'fr_name', 'status_yr', 'rep_area', 'gis_area', 'country',
       'crit', 'areakm2', 'binomial', 'kingdom_name', 'phylum_name',
       'class_name', 'order_name', 'family_name', 'genus_name', 'species_name',
       'category', 'biome_marine', 'biome_freshwater', 'biome_terrestrial',
       'id_no_int'],
      dtype='object')

In [61]:

# Selected columns, based on the df.columns for each taxon
## amphibian
a_columns = ['SUSC_A_Habitats', 'SUSC_A_aquatic larvae',
 'SUSC_B_Temperature Range', 'SUSC_B_Precipitation Range',
 'SUSC_C_explosive breeder', 'SUSC_D_disease', 'SENSITIVITY',
 'ADAPT_A_barriers', 'ADAPT_A_dispersal_distance',
 'ADAPT_C_Slow_Gen_Turnpover', 'LOW_ADAPTABILITY', 
 'EXP_Sea Level',
 'EXP_MeanTemperature', 'EXP_MeanRainfall', 'EXP_AADTemperature',
'EXP_AADRainfall', 'EXPOSURE', 'FINAL_SCORE']

## birds
b_columns = ['FINAL_SCORE', '__hab_specialisation', 
'__microhabitat', '__ForestDependence', '__TemperatureRange',
'__PrecipRange', '__Species dependence', '__PopnSize',
'__EffectivePopnSize', 'SENSITIVITY', '__Dispersal distance limited',
'__Dispersal barriers', '__Low genetic diversity', '__clutch size',
'__Gen_Length', 'LOW_ADAPTABILITY', '__EXP_Sea Inundation',
'__EXP_MeanTemperature', '__EXP_AADTemperature', '__EXP_MeanPrecip',
'__EXP_AADPrecip', 'EXPOSURE']

## corals
c_columns = ['FINAL_SCORE', '_Few habitats', '_narrow depth rng',
       '_Larval vulnerabilityH', '_ShallowH', '_PastMortalityH', '_zooxH',
       '_Rare', 'SENSITIVITY', '_Short dispersal distance',
       '_dispersal barriers', '_ColonyMaxAge', '_SlowGrowthRate',
       'LOW_ADAPTABILITY', '_ProportionArag<3', '_MeanDHM', 'EXPOSURE']

In [62]:

# apply summation of HLU to each column, count how many occurences of each H, L, U and percentage
def get_hlu(x):
    return pd.Series({'H': (x=='H').sum(), 'L': (x=='L').sum(), 'U': (x=='U').sum(), 'per_H': (x=='H').sum()/x.size})

# apply get_hlu to each df
def apply_func(df, select_columns):
    # for each column of the df apply a function
    return df[select_columns].apply(get_hlu).T

# ======= GENERIC VERSION ======
# get unique value of the series and count how many occurences of each unique value
def generic_count(x):
    # x is a pd.Series object
    unique_values = x.unique()
    return pd.Series({value: (x==value).sum() for value in unique_values})

def apply_func_mk2(df):
    return df[summary_columnsy_columns].apply(generic_count).T

# ======= about 'apply' =======
## dfgroupby.apply: takes a df as argument(for each sub df defined by groupby criteria)
## dataframe.apply: takes a series as argument (for each of the columns in the df)
## series.apply: takes a value as argument (for each value of the series)

In [63]:

## split into df for each unique wdpaid-en_name combinations
## for each df apply the logic
amp_vv =  result_amp_f.groupby(['wdpaid', 'en_name']).apply(apply_func, a_columns).reset_index()
bird_vv = result_bird_f.groupby(['wdpaid', 'en_name']).apply(apply_func, b_columns).reset_index()
coral_vv = result_coral_f.groupby(['wdpaid', 'en_name']).apply(apply_func, c_columns).reset_index()

# amp_vvv = groups.apply(apply_func_mk2).reset_index()

In [64]:

## spot checks
bird_vv[bird_vv.wdpaid == 191]

Out[64]:

	wdpaid	en_name	level_2	H	L	U	per_H
0	191	Galápagos Islands	FINAL_SCORE	11.0	74.0	0.0	0.129412
1	191	Galápagos Islands	__hab_specialisation	4.0	81.0	0.0	0.047059
2	191	Galápagos Islands	__microhabitat	1.0	84.0	0.0	0.011765
3	191	Galápagos Islands	__ForestDependence	8.0	77.0	0.0	0.094118
4	191	Galápagos Islands	__TemperatureRange	2.0	69.0	14.0	0.023529
5	191	Galápagos Islands	__PrecipRange	20.0	51.0	14.0	0.235294
6	191	Galápagos Islands	__Species dependence	0.0	85.0	0.0	0.000000
7	191	Galápagos Islands	__PopnSize	12.0	56.0	17.0	0.141176
8	191	Galápagos Islands	__EffectivePopnSize	15.0	53.0	17.0	0.176471
9	191	Galápagos Islands	SENSITIVITY	32.0	39.0	14.0	0.376471
10	191	Galápagos Islands	__Dispersal distance limited	24.0	61.0	0.0	0.282353
11	191	Galápagos Islands	__Dispersal barriers	18.0	67.0	0.0	0.211765
12	191	Galápagos Islands	__Low genetic diversity	6.0	79.0	0.0	0.070588
13	191	Galápagos Islands	__clutch size	27.0	52.0	6.0	0.317647
14	191	Galápagos Islands	__Gen_Length	47.0	38.0	0.0	0.552941
15	191	Galápagos Islands	LOW_ADAPTABILITY	59.0	25.0	1.0	0.694118
16	191	Galápagos Islands	__EXP_Sea Inundation	8.0	77.0	0.0	0.094118
17	191	Galápagos Islands	__EXP_MeanTemperature	19.0	51.0	15.0	0.223529
18	191	Galápagos Islands	__EXP_AADTemperature	5.0	65.0	15.0	0.058824
19	191	Galápagos Islands	__EXP_MeanPrecip	12.0	58.0	15.0	0.141176
20	191	Galápagos Islands	__EXP_AADPrecip	6.0	64.0	15.0	0.070588
21	191	Galápagos Islands	EXPOSURE	30.0	40.0	15.0	0.352941

In [65]:

bird_vv[bird_vv.wdpaid == 2008]

Out[65]:

	wdpaid	en_name	level_2	H	L	U	per_H
132	2008	Bia?owie?a Forest	FINAL_SCORE	60.0	119.0	0.0	0.335196
133	2008	Bia?owie?a Forest	__hab_specialisation	0.0	179.0	0.0	0.000000
134	2008	Bia?owie?a Forest	__microhabitat	12.0	167.0	0.0	0.067039
135	2008	Bia?owie?a Forest	__ForestDependence	13.0	166.0	0.0	0.072626
136	2008	Bia?owie?a Forest	__TemperatureRange	0.0	176.0	3.0	0.000000
137	2008	Bia?owie?a Forest	__PrecipRange	152.0	24.0	3.0	0.849162
138	2008	Bia?owie?a Forest	__Species dependence	0.0	179.0	0.0	0.000000
139	2008	Bia?owie?a Forest	__PopnSize	0.0	179.0	0.0	0.000000
140	2008	Bia?owie?a Forest	__EffectivePopnSize	0.0	179.0	0.0	0.000000
141	2008	Bia?owie?a Forest	SENSITIVITY	161.0	18.0	0.0	0.899441
142	2008	Bia?owie?a Forest	__Dispersal distance limited	5.0	174.0	0.0	0.027933
143	2008	Bia?owie?a Forest	__Dispersal barriers	4.0	175.0	0.0	0.022346
144	2008	Bia?owie?a Forest	__Low genetic diversity	1.0	178.0	0.0	0.005587
145	2008	Bia?owie?a Forest	__clutch size	14.0	164.0	1.0	0.078212
146	2008	Bia?owie?a Forest	__Gen_Length	66.0	113.0	0.0	0.368715
147	2008	Bia?owie?a Forest	LOW_ADAPTABILITY	76.0	103.0	0.0	0.424581
148	2008	Bia?owie?a Forest	__EXP_Sea Inundation	0.0	179.0	0.0	0.000000
149	2008	Bia?owie?a Forest	__EXP_MeanTemperature	161.0	15.0	3.0	0.899441
150	2008	Bia?owie?a Forest	__EXP_AADTemperature	0.0	176.0	3.0	0.000000
151	2008	Bia?owie?a Forest	__EXP_MeanPrecip	0.0	176.0	3.0	0.000000
152	2008	Bia?owie?a Forest	__EXP_AADPrecip	0.0	176.0	3.0	0.000000
153	2008	Bia?owie?a Forest	EXPOSURE	161.0	15.0	3.0	0.899441

In [66]:

result_amp_f[result_amp_f.wdpaid == 191]

Out[66]:

	SVP_ID	SIS_GAA_ID	GAA Family	Genus	Species	Fullname	Threatened	SUSC_A_Habitats	SUSC_A_aquatic larvae	SUSC_B_Temperature Range	...	class_name	order_name	family_name	genus_name	species_name	category	biome_marine	biome_freshwater	biome_terrestrial	id_no_int

0 rows × 58 columns

In [67]:

amp_vv[amp_vv.wdpaid == 191]
coral_vv[coral_vv.wdpaid == 191]

Out[67]:

	wdpaid	en_name	level_2	H	L	U	per_H
0	191	Galápagos Islands	FINAL_SCORE	2.0	19.0	0.0	0.095238
1	191	Galápagos Islands	_Few habitats	4.0	17.0	0.0	0.190476
2	191	Galápagos Islands	_narrow depth rng	4.0	16.0	1.0	0.190476
3	191	Galápagos Islands	_Larval vulnerabilityH	2.0	19.0	0.0	0.095238
4	191	Galápagos Islands	_ShallowH	4.0	16.0	1.0	0.190476
5	191	Galápagos Islands	_PastMortalityH	11.0	10.0	0.0	0.523810
6	191	Galápagos Islands	_zooxH	15.0	6.0	0.0	0.714286
7	191	Galápagos Islands	_Rare	3.0	18.0	0.0	0.142857
8	191	Galápagos Islands	SENSITIVITY	21.0	0.0	0.0	1.000000
9	191	Galápagos Islands	_Short dispersal distance	0.0	18.0	3.0	0.000000
10	191	Galápagos Islands	_dispersal barriers	3.0	18.0	0.0	0.142857
11	191	Galápagos Islands	_ColonyMaxAge	0.0	21.0	0.0	0.000000
12	191	Galápagos Islands	_SlowGrowthRate	7.0	14.0	0.0	0.333333
13	191	Galápagos Islands	LOW_ADAPTABILITY	8.0	13.0	0.0	0.380952
14	191	Galápagos Islands	_ProportionArag<3	3.0	18.0	0.0	0.142857
15	191	Galápagos Islands	_MeanDHM	1.0	19.0	1.0	0.047619
16	191	Galápagos Islands	EXPOSURE	3.0	18.0	0.0	0.142857

Having more CCV species, having more CCV species in proportion indicates WH sites, whose species are more susceptible to adverse impact due to climate change.

Questions and statements that can be made?

within WH sites, the top sites re CCV species number and proportion, by taxa. Narrative: more consideration to mitigate adverse effect of CCV to species

In [68]:

# dump for database
amp_vv.to_csv('agg_amp.csv')
bird_vv.to_csv('agg_bird.csv')
coral_vv.to_csv('agg_coral.csv')

result_amp_f.to_csv('wh_amp.csv')
result_bird_f.to_csv('wh_bird.csv')
result_coral_f.to_csv('wh_coral.csv')

3.5 Climate vulnerability inside the entire WH network¶

This section looks at individual taxon within the entire network. A comparison is then made to examine whether or not the WH network house species that are more vulnerable to climate change. Because the result contains many duplicate species rows (due to different sites having the same species), the first step is to get a unique list of species within all WH sites.

In [69]:

# combine species in WH result and all results, by taxa
## get a unique set of species within each taxon
amp_selected = result_amp_f.groupby('id_no_int').first().reset_index()
bird_selected = result_bird_f.groupby('id_no_int').first().reset_index()
coral_selected = result_coral_f.groupby('id_no_int').first().reset_index()

## function to aggregate result and apply hlu function to all columns
def get_agg_result(unique_taxon_result, taxon_columns):
    return unique_taxon_result[taxon_columns].apply(get_hlu).T


amp_total = pd.merge(get_agg_result(amp_selected, a_columns), \
                    get_agg_result(amp, a_columns),
                    left_index = True, right_index=True, suffixes=('_wh', '_all'))

bird_total = pd.merge(get_agg_result(bird_selected, b_columns), \
                    get_agg_result(bird, b_columns),
                    left_index = True, right_index=True, suffixes=('_wh', '_all'))

coral_total = pd.merge(get_agg_result(coral_selected, c_columns), \
                    get_agg_result(coral, c_columns),
                    left_index = True, right_index=True, suffixes=('_wh', '_all'))

In [70]:

amp_total

Out[70]:

	H_wh	L_wh	U_wh	per_H_wh	H_all	L_all	U_all	per_H_all
SUSC_A_Habitats	220.0	1783.0	10.0	0.109290	1509.0	4539.0	156.0	0.243230
SUSC_A_aquatic larvae	330.0	1673.0	10.0	0.163934	955.0	5085.0	164.0	0.153933
SUSC_B_Temperature Range	386.0	1626.0	1.0	0.191754	1519.0	4557.0	128.0	0.244842
SUSC_B_Precipitation Range	479.0	1533.0	1.0	0.237953	1519.0	4557.0	128.0	0.244842
SUSC_C_explosive breeder	162.0	1629.0	222.0	0.080477	316.0	4113.0	1775.0	0.050935
SUSC_D_disease	436.0	1577.0	0.0	0.216592	1307.0	4897.0	0.0	0.210671
SENSITIVITY	1340.0	606.0	67.0	0.665673	4453.0	1365.0	386.0	0.717763
ADAPT_A_barriers	183.0	1661.0	169.0	0.090909	745.0	3900.0	1559.0	0.120084
ADAPT_A_dispersal_distance	501.0	1493.0	19.0	0.248882	1569.0	4522.0	113.0	0.252901
ADAPT_C_Slow_Gen_Turnpover	609.0	586.0	818.0	0.302534	2073.0	899.0	3232.0	0.334139
LOW_ADAPTABILITY	997.0	1007.0	9.0	0.495281	3233.0	2898.0	73.0	0.521115
EXP_Sea Level	0.0	2003.0	10.0	0.000000	4.0	6044.0	156.0	0.000645
EXP_MeanTemperature	315.0	1697.0	1.0	0.156483	1515.0	4544.0	145.0	0.244197
EXP_MeanRainfall	503.0	1509.0	1.0	0.249876	1515.0	4544.0	145.0	0.244197
EXP_AADTemperature	212.0	1800.0	1.0	0.105315	1498.0	4561.0	145.0	0.241457
EXP_AADRainfall	371.0	1641.0	1.0	0.184302	1515.0	4544.0	145.0	0.244197
EXPOSURE	864.0	1145.0	4.0	0.429210	3356.0	2642.0	206.0	0.540941
FINAL_SCORE	301.0	1712.0	0.0	0.149528	1368.0	4836.0	0.0	0.220503

In [71]:

a = amp_total[['H_wh', 'L_wh', 'U_wh']].sum(1)/amp_total[['H_all', 'L_all', 'U_all']].sum(1)
b = pd.concat([amp_total.H_wh/amp_total.H_all, a], 1)
b.columns = ['H_per', 'total_per']
b
# (amp_total.H_wh + amp_total.L_wh + amp_total.U_wh)/(amp_total.H_all + amp_total.L_all + amp_total.U_all)

Out[71]:

	H_per	total_per
SUSC_A_Habitats	0.145792	0.324468
SUSC_A_aquatic larvae	0.345550	0.324468
SUSC_B_Temperature Range	0.254115	0.324468
SUSC_B_Precipitation Range	0.315339	0.324468
SUSC_C_explosive breeder	0.512658	0.324468
SUSC_D_disease	0.333588	0.324468
SENSITIVITY	0.300921	0.324468
ADAPT_A_barriers	0.245638	0.324468
ADAPT_A_dispersal_distance	0.319312	0.324468
ADAPT_C_Slow_Gen_Turnpover	0.293777	0.324468
LOW_ADAPTABILITY	0.308382	0.324468
EXP_Sea Level	0.000000	0.324468
EXP_MeanTemperature	0.207921	0.324468
EXP_MeanRainfall	0.332013	0.324468
EXP_AADTemperature	0.141522	0.324468
EXP_AADRainfall	0.244884	0.324468
EXPOSURE	0.257449	0.324468
FINAL_SCORE	0.220029	0.324468

It appears that also fewer proportion of amphibians in WH sites are climate vulnerable

In [72]:

bird_total

Out[72]:

	H_wh	L_wh	U_wh	per_H_wh	H_all	L_all	U_all	per_H_all
FINAL_SCORE	1336.0	5578.0	0.0	0.193231	2323.0	7533.0	0.0	0.235694
__hab_specialisation	849.0	6059.0	6.0	0.122794	1530.0	8306.0	20.0	0.155235
__microhabitat	703.0	6211.0	0.0	0.101678	1001.0	8855.0	0.0	0.101562
__ForestDependence	1527.0	5386.0	1.0	0.220856	2575.0	7277.0	4.0	0.261262
__TemperatureRange	1068.0	4401.0	1445.0	0.154469	1974.0	6118.0	1764.0	0.200284
__PrecipRange	1414.0	4055.0	1445.0	0.204513	2095.0	5997.0	1764.0	0.212561
__Species dependence	66.0	6848.0	0.0	0.009546	89.0	9767.0	0.0	0.009030
__PopnSize	393.0	1794.0	4727.0	0.056841	1084.0	2319.0	6453.0	0.109984
__EffectivePopnSize	558.0	1629.0	4727.0	0.080706	1410.0	1993.0	6453.0	0.143060
SENSITIVITY	4022.0	585.0	2307.0	0.581718	6290.0	719.0	2847.0	0.638190
__Dispersal distance limited	1180.0	5734.0	0.0	0.170668	1993.0	7863.0	0.0	0.202212
__Dispersal barriers	314.0	6600.0	0.0	0.045415	700.0	9156.0	0.0	0.071023
__Low genetic diversity	41.0	6873.0	0.0	0.005930	69.0	9787.0	0.0	0.007001
__clutch size	1760.0	3213.0	1941.0	0.254556	2414.0	3946.0	3496.0	0.244927
__Gen_Length	1730.0	5184.0	0.0	0.250217	2500.0	7356.0	0.0	0.253653
LOW_ADAPTABILITY	3576.0	2092.0	1246.0	0.517211	5337.0	2507.0	2012.0	0.541498
__EXP_Sea Inundation	100.0	6808.0	6.0	0.014463	163.0	9673.0	20.0	0.016538
__EXP_MeanTemperature	1227.0	4222.0	1465.0	0.177466	1921.0	6066.0	1869.0	0.194907
__EXP_AADTemperature	944.0	4505.0	1465.0	0.136535	1925.0	6062.0	1869.0	0.195312
__EXP_MeanPrecip	1180.0	4269.0	1465.0	0.170668	1998.0	5989.0	1869.0	0.202719
__EXP_AADPrecip	1070.0	4379.0	1465.0	0.154758	2152.0	5835.0	1869.0	0.218344
EXPOSURE	3053.0	2406.0	1455.0	0.441568	4920.0	3082.0	1854.0	0.499188

In [73]:

coral_total

Out[73]:

	H_wh	L_wh	U_wh	per_H_wh	H_all	L_all	U_all	per_H_all
FINAL_SCORE	111.0	582.0	31.0	0.153315	242.0	1198.0	154.0	0.151819
_Few habitats	160.0	564.0	0.0	0.220994	384.0	1210.0	0.0	0.240903
_narrow depth rng	169.0	529.0	26.0	0.233425	384.0	1140.0	70.0	0.240903
_Larval vulnerabilityH	133.0	589.0	2.0	0.183702	274.0	1316.0	4.0	0.171895
_ShallowH	172.0	527.0	25.0	0.237569	376.0	1156.0	62.0	0.235885
_PastMortalityH	312.0	412.0	0.0	0.430939	644.0	950.0	0.0	0.404015
_zooxH	669.0	54.0	1.0	0.924033	1478.0	114.0	2.0	0.927227
_Rare	144.0	575.0	5.0	0.198895	392.0	1190.0	12.0	0.245922
SENSITIVITY	723.0	1.0	0.0	0.998619	1592.0	2.0	0.0	0.998745
_Short dispersal distance	65.0	491.0	168.0	0.089779	144.0	1042.0	408.0	0.090339
_dispersal barriers	83.0	635.0	6.0	0.114641	234.0	1338.0	22.0	0.146801
_ColonyMaxAge	11.0	707.0	6.0	0.015193	26.0	1546.0	22.0	0.016311
_SlowGrowthRate	263.0	455.0	6.0	0.363260	586.0	990.0	18.0	0.367629
LOW_ADAPTABILITY	366.0	356.0	2.0	0.505525	840.0	746.0	8.0	0.526976
_ProportionArag<3	161.0	521.0	42.0	0.222376	354.0	1058.0	182.0	0.222083
_MeanDHM	170.0	513.0	41.0	0.234807	368.0	1036.0	190.0	0.230866
EXPOSURE	249.0	445.0	30.0	0.343923	542.0	894.0	158.0	0.340025

Amphibians and birds in WH sites have consistently lower number of CCV species in proportion, compared to all ampbibians (14.9% to 22%) and birds (19.3% and 23.5%). Lower proportions are also observed in sensitivity, low adaptability and exposure.

Question: what does it mean? what statements can be realistically made?

is it because of WH sites, therefore these species are less vulnerable? Probably no, as these species are not restricted inside WH sites. Thus no statement can made about to justify the positive effect of WH sites.
WH contributes a good deal, as it covers % of H CCV species globally?
what others?

Corals show a consistent trend in proportion in terms of vulnerability, and SAE.

Test statistics for significance

per region (IUCN/UNESCO regions?) / biodiversity sites

In [74]:

amp_total

Out[74]:

	H_wh	L_wh	U_wh	per_H_wh	H_all	L_all	U_all	per_H_all
SUSC_A_Habitats	220.0	1783.0	10.0	0.109290	1509.0	4539.0	156.0	0.243230
SUSC_A_aquatic larvae	330.0	1673.0	10.0	0.163934	955.0	5085.0	164.0	0.153933
SUSC_B_Temperature Range	386.0	1626.0	1.0	0.191754	1519.0	4557.0	128.0	0.244842
SUSC_B_Precipitation Range	479.0	1533.0	1.0	0.237953	1519.0	4557.0	128.0	0.244842
SUSC_C_explosive breeder	162.0	1629.0	222.0	0.080477	316.0	4113.0	1775.0	0.050935
SUSC_D_disease	436.0	1577.0	0.0	0.216592	1307.0	4897.0	0.0	0.210671
SENSITIVITY	1340.0	606.0	67.0	0.665673	4453.0	1365.0	386.0	0.717763
ADAPT_A_barriers	183.0	1661.0	169.0	0.090909	745.0	3900.0	1559.0	0.120084
ADAPT_A_dispersal_distance	501.0	1493.0	19.0	0.248882	1569.0	4522.0	113.0	0.252901
ADAPT_C_Slow_Gen_Turnpover	609.0	586.0	818.0	0.302534	2073.0	899.0	3232.0	0.334139
LOW_ADAPTABILITY	997.0	1007.0	9.0	0.495281	3233.0	2898.0	73.0	0.521115
EXP_Sea Level	0.0	2003.0	10.0	0.000000	4.0	6044.0	156.0	0.000645
EXP_MeanTemperature	315.0	1697.0	1.0	0.156483	1515.0	4544.0	145.0	0.244197
EXP_MeanRainfall	503.0	1509.0	1.0	0.249876	1515.0	4544.0	145.0	0.244197
EXP_AADTemperature	212.0	1800.0	1.0	0.105315	1498.0	4561.0	145.0	0.241457
EXP_AADRainfall	371.0	1641.0	1.0	0.184302	1515.0	4544.0	145.0	0.244197
EXPOSURE	864.0	1145.0	4.0	0.429210	3356.0	2642.0	206.0	0.540941
FINAL_SCORE	301.0	1712.0	0.0	0.149528	1368.0	4836.0	0.0	0.220503

Statistical testing on the finding¶

Is it possible that the statistical differences observed simply reflect the fact that WH network happens to draw a sample from the pool of all species and that the difference is only due to random sampling and nothing else. In other words, the difference is not because of WH capturing species of specific traits but random sampling.

In [75]:

print('amp:', amp.index.size, 'in WH', amp_selected.index.size)
print('bird:', bird.index.size, 'in WH', bird_selected.index.size)
print('coral:', coral_unique.index.size, 'in WH', coral_selected.index.size)

amp: 6204 in WH 2013
bird: 9856 in WH 6914
coral: 797 in WH 724

In [76]:

amp_selected.head()

Out[76]:

	id_no_int	SVP_ID	SIS_GAA_ID	GAA Family	Genus	Species	Fullname	Threatened	SUSC_A_Habitats	SUSC_A_aquatic larvae	...	phylum_name	class_name	order_name	family_name	genus_name	species_name	category	biome_marine	biome_freshwater	biome_terrestrial
0	520.0	4556	2661	LIMNODYNASTIDAE	Adelotus	brevis	Adelotus brevis	NaN	L	L	...	CHORDATA	AMPHIBIA	ANURA	LIMNODYNASTIDAE	Adelotus	brevis	NT	f	t	t
1	596.0	4760	100011	HYPEROLIIDAE	Afrixalus	aureus	Afrixalus aureus	NaN	L	H	...	CHORDATA	AMPHIBIA	ANURA	HYPEROLIIDAE	Afrixalus	aureus	LC	f	t	t
2	1272.0	3629	51121	CRYPTOBRANCHIDAE	Andrias	davidianus	Andrias davidianus	Y	L	L	...	CHORDATA	AMPHIBIA	CAUDATA	CRYPTOBRANCHIDAE	Andrias	davidianus	CR	f	t	f
3	1282.0	3644	13093	PLETHODONTIDAE	Aneides	aeneus	Aneides aeneus	NaN	H	L	...	CHORDATA	AMPHIBIA	CAUDATA	PLETHODONTIDAE	Aneides	aeneus	NT	f	f	t
4	2865.0	3109	15108	BOMBINATORIDAE	Bombina	bombina	Bombina bombina	NaN	L	L	...	CHORDATA	AMPHIBIA	ANURA	BOMBINATORIDAE	Bombina	bombina	LC	f	t	t

5 rows × 58 columns

If I get repeated random samples from the pool, calculate the H_per value and then plot them...

In [151]:

# get a random sample of 2013 repeatedly and calculate H_per
# ===== explanation
# ## get a random sample without replacement
# a = amp[amp.SVP_ID.isin(np.random.choice(amp.SVP_ID, 2013, replace=False))]
# ## apply HLU and get the final score
# a[a_columns].apply(get_hlu).T.ix['FINAL_SCORE'].per_H

test_a = [amp[amp.SVP_ID.isin(np.random.choice(amp.SVP_ID, 2013, replace=True))][a_columns].apply(get_hlu).T.ix['FINAL_SCORE'].per_H\
          for i in range(1000)]

In [152]:

sns.distplot(test_a)

C:\Users\yichuans\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\nonparametric\kdetools.py:20: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  y = X[:m/2+1] + np.r_[0,X[m/2+1:],0]*1j

Out[152]:

<matplotlib.axes._subplots.AxesSubplot at 0x26eb8208>

In [153]:

amp_total.ix['FINAL_SCORE'].per_H_wh

Out[153]:

0.14952806756085443

It is quite clear the WH result is an outlier - a signficant difference from the empirical distribution.

Try birds and corals

In [154]:

test_b = [bird[bird.SVP_ID.isin(np.random.choice(bird.SVP_ID, 6914, replace=True))][b_columns].apply(get_hlu).T.ix['FINAL_SCORE'].per_H\
          for i in range(1000)]

In [155]:

sns.distplot(test_b)

C:\Users\yichuans\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\nonparametric\kdetools.py:20: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  y = X[:m/2+1] + np.r_[0,X[m/2+1:],0]*1j

Out[155]:

<matplotlib.axes._subplots.AxesSubplot at 0x1deba358>

In [156]:

bird_total.ix['FINAL_SCORE'].per_H_wh

Out[156]:

0.19323112525310962

Significant result for birds too

In [157]:

test_c = [coral_unique[coral_unique.Fullname.isin(np.random.choice(coral_unique.Fullname, 724, replace=True))][c_columns].apply(get_hlu).T.ix['FINAL_SCORE'].per_H\
          for i in range(1000)]

In [158]:

sns.distplot(test_c)

C:\Users\yichuans\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\nonparametric\kdetools.py:20: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  y = X[:m/2+1] + np.r_[0,X[m/2+1:],0]*1j

Out[158]:

<matplotlib.axes._subplots.AxesSubplot at 0x20ed1908>

In [159]:

coral_total.ix['FINAL_SCORE'].per_H_wh

Out[159]:

0.15331491712707182

In [160]:

sns.distplot(test_c)

C:\Users\yichuans\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\nonparametric\kdetools.py:20: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  y = X[:m/2+1] + np.r_[0,X[m/2+1:],0]*1j

Out[160]:

<matplotlib.axes._subplots.AxesSubplot at 0x26efcb00>

WH vs all species¶

In [235]:

# a function to plot all columns using stacked horizontal barchart
def plot_comparison(wh_species, all_species, columns):
    my_colors = ['#b2182b', '#ffffff', '#e0e0e0']
    col = len(columns)
    fig, axes = plt.subplots(col, figsize=(15, 2*col)) # len and heihgt
    
    fig.subplots_adjust(hspace = 0.5)
    
    # get agg count of HLUso
    a = get_agg_result(wh_species, columns)
    b = get_agg_result(all_species, columns)
        
    for i, column in enumerate(columns):
        data = pd.concat([a[['H', 'L', 'U']].loc[column], 
                      b[['H', 'L', 'U']].loc[column]], 1).T
        data.index = ['Species in WH sites', 'All species']
        c = data.plot.barh(ax=axes[i], stacked=True, color=my_colors)
        axes[i].set_xlim([0, all_species.index.size*1.2])
        axes[i].set_title(column)
    
        # annotation
        for bar, y_axis in zip([a,b], range(2)):
            c.annotate('{0:.0f}'.format(bar.loc[column].H) + ', ' + '{0:.0f}'.format(bar.loc[column].H + bar.loc[column].L +bar.loc[column].U) + \
                           ', ' + '{0:.1%}'.format(bar.loc[column].per_H),
                          (bar.loc[column].sum() + 10, y_axis))
    return fig

# a function to plot all columns using stacked horizontal barchart
def plot_comparison_sle(wh_species, all_species, columns = ['FINAL_SCORE', 'SENSITIVITY', 'LOW_ADAPTABILITY', 'EXPOSURE']):
    my_colors = ['#b2182b', '#ffffff', '#e0e0e0']
    col = len(columns)
    fig, axes = plt.subplots(col, figsize=(15, 2*col)) # len and heihgt
    
    fig.subplots_adjust(hspace = 0.5)
    
    # get agg count of HLUso
    a = get_agg_result(wh_species, columns)
    b = get_agg_result(all_species, columns)
        
    for i, column in enumerate(columns):
        data = pd.concat([a[['H', 'L', 'U']].loc[column], 
                      b[['H', 'L', 'U']].loc[column]], 1).T
        data.index = ['Species in WH sites', 'All species']
        c = data.plot.barh(ax=axes[i], stacked=True, color=my_colors)
        axes[i].set_xlim([0, all_species.index.size*1.2])
        axes[i].set_title(column)
        
        # annotation
        for bar, y_axis in zip([a,b], range(2)):
            c.annotate('{0:.0f}'.format(bar.loc[column].H) + ', ' + '{0:.0f}'.format(bar.loc[column].H + bar.loc[column].L +bar.loc[column].U) + \
                           ', ' + '{0:.1%}'.format(bar.loc[column].per_H),
                          (bar.loc[column].sum() + 10, y_axis))

        
    return fig

Detailed comparisons across traits and exposures for amphibians

In [260]:

bird[['FINAL_SCORE', 'SENSITIVITY', 'LOW_ADAPTABILITY', 'EXPOSURE']]

Out[260]:

	FINAL_SCORE	SENSITIVITY	LOW_ADAPTABILITY	EXPOSURE
0	H	H	H	H
1	L	H	L	U
2	L	H	H	U
3	L	H	L	U
4	L	H	U	L
5	L	H	H	L
6	H	H	H	H
7	L	U	L	H
8	L	H	L	L
9	L	H	L	L
10	L	H	L	H
11	L	H	H	U
12	L	U	L	U
13	L	H	H	L
14	H	H	H	H
15	H	H	H	H
16	L	U	L	U
17	L	H	H	L
18	L	U	L	U
19	L	U	L	U
20	L	U	L	U
21	H	H	H	H
22	L	H	L	L
23	L	H	L	H
24	H	H	H	H
25	L	L	H	L
26	L	L	H	L
27	H	H	H	H
28	H	H	H	H
29	H	H	H	H
...	...	...	...	...
9826	L	H	U	H
9827	H	H	H	H
9828	H	H	H	H
9829	H	H	H	H
9830	L	H	H	U
9831	H	H	H	H
9832	L	H	U	U
9833	L	H	H	U
9834	L	H	H	U
9835	H	H	H	H
9836	L	H	L	H
9837	L	U	L	L
9838	L	U	L	L
9839	H	H	H	H
9840	H	H	H	H
9841	L	H	H	U
9842	H	H	H	H
9843	L	H	H	U
9844	L	H	U	H
9845	L	H	U	H
9846	L	U	L	L
9847	H	H	H	H
9848	L	H	H	U
9849	L	H	U	H
9850	L	H	H	U
9851	L	H	U	L
9852	H	H	H	H
9853	H	H	H	H
9854	L	H	U	H
9855	H	H	H	H

9856 rows × 4 columns

In [253]:

# amp
plot_comparison(amp_selected, amp, a_columns);

In [268]:

a = plot_comparison_sle(amp_selected, amp);
a.savefig('wh_network_amp.png', dpi=100, bbox_inches='tight')

Detailed comparisons across traits and exposures for birds

In [238]:

# bird
plot_comparison(bird_selected, bird, b_columns);

In [267]:

a = plot_comparison_sle(bird_selected, bird);
a.savefig('wh_network_bird.png', dpi=100, bbox_inches='tight')

[!Insignificant findings] Detailed comparisons across traits and exposures for amphibians

In [240]:

# coral
plot_comparison(coral_selected, coral_unique, c_columns);

WH vs bioWH¶

It is quite interesting to further examine subsets of WH sites, for example, by regions and by criteria, with a view to answering questions such as a) do biodiversity WH sites have more climate vulnerable species b) do WH sites in asian regions host more CV species?

In [91]:

type(amp_selected.crit)

Out[91]:

pandas.core.series.Series

In [168]:

# try testing regular expression and `pd.series.str.match`
pattern = r'.*x.*' # crit having character `x` indicates a biodiversity WH site
test_matching = pd.concat([amp_selected.crit, amp_selected.crit.str.match(pattern)], 1)

# get a filter for biodi
amp_wh_bio_filter = amp_selected.crit.str.match(pattern)
bird_wh_bio_filter = bird_selected.crit.str.match(pattern)
coral_wh_bio_filter = coral_selected.crit.str.match(pattern)

# reduction in the number of WH sites
print('all WH', 'only bWH', 'non bWH')
print(amp_selected.wdpaid.unique().size, amp_selected[amp_wh_bio_filter].wdpaid.unique().size, amp_selected[~amp_wh_bio_filter].wdpaid.unique().size,)
print(bird_selected.wdpaid.unique().size, bird_selected[bird_wh_bio_filter].wdpaid.unique().size, bird_selected[~bird_wh_bio_filter].wdpaid.unique().size)
print(coral_selected.wdpaid.unique().size, coral_selected[coral_wh_bio_filter].wdpaid.unique().size)

all WH only bWH non bWH
150 116 34
184 147 37
26 23

In [163]:

# do comparison between subset of 
amp_wh_bio = pd.merge(amp_selected[amp_wh_bio_filter][a_columns].apply(get_hlu).T, \
         amp_selected[~amp_wh_bio_filter][a_columns].apply(get_hlu).T, \
         left_index = True, right_index=True, suffixes=('_bio', '_wh'))

bird_wh_bio = pd.merge(bird_selected[bird_wh_bio_filter][b_columns].apply(get_hlu).T, \
         bird_selected[~bird_wh_bio_filter][b_columns].apply(get_hlu).T, \
         left_index = True, right_index=True, suffixes=('_bio', '_wh'))

coral_wh_bio = pd.merge(coral_selected[coral_wh_bio_filter][c_columns].apply(get_hlu).T, \
         coral_selected[c_columns].apply(get_hlu).T, \
         left_index = True, right_index=True, suffixes=('_bio', '_wh'))

In [241]:

def plot_comparison_sle_bio(wh_species, all_species, columns = ['FINAL_SCORE', 'SENSITIVITY', 'LOW_ADAPTABILITY', 'EXPOSURE']):
    """<non-bio>, <bio WH sites>"""
    my_colors = ['#b2182b', '#ffffff', '#e0e0e0']
    col = len(columns)
    fig, axes = plt.subplots(col, figsize=(15, 2*col)) # len and heihgt
    
    fig.subplots_adjust(hspace = 0.5)
    
    # get agg count of HLUso
    a = get_agg_result(wh_species, columns)
    b = get_agg_result(all_species, columns)
        
    for i, column in enumerate(columns):
        data = pd.concat([a[['H', 'L', 'U']].loc[column], 
                      b[['H', 'L', 'U']].loc[column]], 1).T
        data.index = ['Other natural WH sites', 'Biodiversity WH sites']
        c = data.plot.barh(ax=axes[i], stacked=True, color=my_colors)
        axes[i].set_xlim([0, all_species.index.size*1.2])
        axes[i].set_title(column)
        
        # annotation
        for bar, y_axis in zip([a,b], range(2)):
            c.annotate('{0:.0f}'.format(bar.loc[column].H) + ', ' + '{0:.0f}'.format(bar.loc[column].H + bar.loc[column].L +bar.loc[column].U) + \
                           ', ' + '{0:.1%}'.format(bar.loc[column].per_H),
                          (bar.loc[column].sum() + 10, y_axis))

        
    return fig

In [266]:

a = plot_comparison_sle_bio(amp_selected[~amp_wh_bio_filter], amp_selected[amp_wh_bio_filter]);
a.savefig('network_bio_amp.png', dpi=100, bbox_inches='tight')

In [243]:

amp_wh_bio

Out[243]:

	H_bio	L_bio	U_bio	per_H_bio	H_wh	L_wh	U_wh	per_H_wh
SUSC_A_Habitats	215.0	1664.0	9.0	0.113877	5.0	119.0	1.0	0.040
SUSC_A_aquatic larvae	293.0	1586.0	9.0	0.155191	37.0	87.0	1.0	0.296
SUSC_B_Temperature Range	385.0	1502.0	1.0	0.203919	1.0	124.0	0.0	0.008
SUSC_B_Precipitation Range	423.0	1464.0	1.0	0.224047	56.0	69.0	0.0	0.448
SUSC_C_explosive breeder	151.0	1526.0	211.0	0.079979	11.0	103.0	11.0	0.088
SUSC_D_disease	412.0	1476.0	0.0	0.218220	24.0	101.0	0.0	0.192
SENSITIVITY	1260.0	563.0	65.0	0.667373	80.0	43.0	2.0	0.640
ADAPT_A_barriers	174.0	1548.0	166.0	0.092161	9.0	113.0	3.0	0.072
ADAPT_A_dispersal_distance	484.0	1385.0	19.0	0.256356	17.0	108.0	0.0	0.136
ADAPT_C_Slow_Gen_Turnpover	584.0	532.0	772.0	0.309322	25.0	54.0	46.0	0.200
LOW_ADAPTABILITY	957.0	922.0	9.0	0.506886	40.0	85.0	0.0	0.320
EXP_Sea Level	0.0	1879.0	9.0	0.000000	0.0	124.0	1.0	0.000
EXP_MeanTemperature	273.0	1614.0	1.0	0.144597	42.0	83.0	0.0	0.336
EXP_MeanRainfall	500.0	1387.0	1.0	0.264831	3.0	122.0	0.0	0.024
EXP_AADTemperature	201.0	1686.0	1.0	0.106462	11.0	114.0	0.0	0.088
EXP_AADRainfall	364.0	1523.0	1.0	0.192797	7.0	118.0	0.0	0.056
EXPOSURE	814.0	1070.0	4.0	0.431144	50.0	75.0	0.0	0.400
FINAL_SCORE	291.0	1597.0	0.0	0.154131	10.0	115.0	0.0	0.080

In [264]:

a = plot_comparison_sle_bio(bird_selected[~bird_wh_bio_filter], bird_selected[bird_wh_bio_filter]);
a.savefig('network_bio_bird.png', dpi=100, bbox_inches='tight')

In [245]:

bird_wh_bio

Out[245]:

	H_bio	L_bio	U_bio	per_H_bio	H_wh	L_wh	U_wh	per_H_wh
FINAL_SCORE	1208.0	4834.0	0.0	0.199934	128.0	744.0	0.0	0.146789
__hab_specialisation	777.0	5259.0	6.0	0.128600	72.0	800.0	0.0	0.082569
__microhabitat	644.0	5398.0	0.0	0.106587	59.0	813.0	0.0	0.067661
__ForestDependence	1419.0	4622.0	1.0	0.234856	108.0	764.0	0.0	0.123853
__TemperatureRange	1062.0	3779.0	1201.0	0.175770	6.0	622.0	244.0	0.006881
__PrecipRange	1046.0	3795.0	1201.0	0.173121	368.0	260.0	244.0	0.422018
__Species dependence	66.0	5976.0	0.0	0.010924	0.0	872.0	0.0	0.000000
__PopnSize	348.0	1446.0	4248.0	0.057597	45.0	348.0	479.0	0.051606
__EffectivePopnSize	493.0	1301.0	4248.0	0.081595	65.0	328.0	479.0	0.074541
SENSITIVITY	3494.0	521.0	2027.0	0.578285	528.0	64.0	280.0	0.605505
__Dispersal distance limited	1084.0	4958.0	0.0	0.179411	96.0	776.0	0.0	0.110092
__Dispersal barriers	269.0	5773.0	0.0	0.044522	45.0	827.0	0.0	0.051606
__Low genetic diversity	31.0	6011.0	0.0	0.005131	10.0	862.0	0.0	0.011468
__clutch size	1657.0	2585.0	1800.0	0.274247	103.0	628.0	141.0	0.118119
__Gen_Length	1512.0	4530.0	0.0	0.250248	218.0	654.0	0.0	0.250000
LOW_ADAPTABILITY	3238.0	1669.0	1135.0	0.535915	338.0	423.0	111.0	0.387615
__EXP_Sea Inundation	91.0	5945.0	6.0	0.015061	9.0	863.0	0.0	0.010321
__EXP_MeanTemperature	967.0	3854.0	1221.0	0.160046	260.0	368.0	244.0	0.298165
__EXP_AADTemperature	892.0	3929.0	1221.0	0.147633	52.0	576.0	244.0	0.059633
__EXP_MeanPrecip	1092.0	3729.0	1221.0	0.180735	88.0	540.0	244.0	0.100917
__EXP_AADPrecip	986.0	3835.0	1221.0	0.163191	84.0	544.0	244.0	0.096330
EXPOSURE	2674.0	2156.0	1212.0	0.442569	379.0	250.0	243.0	0.434633

It appears that subdivision by separating biodiversity sites from all natural sites does not indicate a different picture. The percentage difference is minute.

WH by regions¶

Check differences according to regions

In [97]:

# load regions
region = pd.read_csv('region.csv')

In [98]:

region.unesco_reg.unique()

Out[98]:

array(['Africa', 'Arab States', 'Asia and the Pacific',
       'Europe and North America', 'Latin America and the Caribbean'], dtype=object)

In [99]:

# convenient variables
regions_list = ['_africa', '_asia', '_euna', '_lac', '_arab']
regions_name_list = ['Africa', 'Asia and the Pacific', 'Europe and North America', 'Latin America and the Caribbean', 'Arab States']

In [100]:

# africa_ids = region[region.unesco_reg == 'Africa'].wdpaid
# asia_ids = region[region.unesco_reg == 'Asia and the Pacific'].wdpaid
# euna_ids = region[region.unesco_reg == 'Europe and North America'].wdpaid
# lac_ids = region[region.unesco_reg == 'Latin America and the Caribbean'].wdpaid
# arab_ids = region[region.unesco_reg == 'Arab States'].wdpaid

# more concise version using list comprehension
africa_ids, asia_ids, euna_ids, lac_ids, arab_ids = [region[region.unesco_reg == region_name].wdpaid for region_name in \
                                                     regions_name_list]

In [101]:

print([len(x) for x in [africa_ids, asia_ids, euna_ids, lac_ids, arab_ids]])

[41, 70, 71, 41, 6]

In [102]:

# get a function that returns the result of aggregates
def get_agg_result_by_wdpaidlist(taxon_result, taxon_columns, filtered_wdpaids):
    # get a unique list of species for sites, based on the filtered wdpaids
    unique_taxon_result = taxon_result[taxon_result.wdpaid.isin(filtered_wdpaids)].groupby('id_no_int').first().reset_index()
    # apply the stangdard aggregation methods
    return unique_taxon_result[taxon_columns].apply(get_hlu).T

In [103]:

# amp in wh sites by region
amp_africa, amp_asia, amp_euna, amp_lac, amp_arab = \
[get_agg_result_by_wdpaidlist(result_amp_f, a_columns, a_region) \
 for a_region in [africa_ids, asia_ids, euna_ids, lac_ids, arab_ids]]
amp_regions = [amp_africa, amp_asia, amp_euna, amp_lac, amp_arab]

# bird in wh sites by region
bird_africa, bird_asia, bird_euna, bird_lac, bird_arab = \
[get_agg_result_by_wdpaidlist(result_bird_f, b_columns, a_region) \
 for a_region in [africa_ids, asia_ids, euna_ids, lac_ids, arab_ids]]
bird_regions = [bird_africa, bird_asia, bird_euna, bird_lac, bird_arab]

# coral in wh sites by region
coral_africa, coral_asia, coral_euna, coral_lac, coral_arab = \
[get_agg_result_by_wdpaidlist(result_coral_f, c_columns, a_region) \
 for a_region in [africa_ids, asia_ids, euna_ids, lac_ids, arab_ids]]
coral_regions = [coral_africa, coral_asia, coral_euna, coral_lac, coral_arab]

In [104]:

# join all the regional tables together
from functools import reduce

# use reduce function to merge any given number of dfs; here i used a tuple in order to get suffix information
# the first element is the df, the second a dummy value used in the reduce process
amp_region_final = reduce(lambda left, right: (pd.merge(left[0], right[0], left_index=True, right_index=True, \
                                              suffixes = ('', right[1])), 0), \
                  zip(amp_regions, regions_list),
                 (amp_selected[a_columns].apply(get_hlu).T, 0))[0]

bird_region_final = reduce(lambda left, right: (pd.merge(left[0], right[0], left_index=True, right_index=True, \
                                              suffixes = ('', right[1])), 0), \
                  zip(bird_regions, regions_list),
                 (bird_selected[b_columns].apply(get_hlu).T, 0))[0]

coral_region_final = reduce(lambda left, right: (pd.merge(left[0], right[0], left_index=True, right_index=True, \
                                              suffixes = ('', right[1])), 0), \
                  zip(coral_regions, regions_list),
                 (coral_selected[c_columns].apply(get_hlu).T, 0))[0]

In [105]:

amp_region_final[['per_H' + each for each in ['', '_africa', '_asia', '_euna', '_lac', '_arab']]]

Out[105]:

	per_H	per_H_africa	per_H_asia	per_H_euna	per_H_lac	per_H_arab
SUSC_A_Habitats	0.109290	0.089524	0.067434	0.068966	0.165975	0.000000
SUSC_A_aquatic larvae	0.163934	0.278095	0.139803	0.183908	0.095436	0.666667
SUSC_B_Temperature Range	0.191754	0.241905	0.069079	0.000000	0.300138	0.000000
SUSC_B_Precipitation Range	0.237953	0.184762	0.231908	0.839080	0.147994	1.000000
SUSC_C_explosive breeder	0.080477	0.160000	0.059211	0.045977	0.052559	0.333333
SUSC_D_disease	0.216592	0.019048	0.185855	0.126437	0.410788	0.000000
SENSITIVITY	0.665673	0.636190	0.483553	0.885057	0.791148	1.000000
ADAPT_A_barriers	0.090909	0.093333	0.144737	0.045977	0.053942	0.166667
ADAPT_A_dispersal_distance	0.248882	0.264762	0.199013	0.143678	0.302905	0.000000
ADAPT_C_Slow_Gen_Turnpover	0.302534	0.346667	0.250000	0.327586	0.302905	0.000000
LOW_ADAPTABILITY	0.495281	0.533333	0.447368	0.442529	0.515906	0.166667
EXP_Sea Level	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
EXP_MeanTemperature	0.156483	0.034286	0.197368	0.189655	0.219917	0.166667
EXP_MeanRainfall	0.249876	0.268571	0.100329	0.028736	0.408022	0.333333
EXP_AADTemperature	0.105315	0.045714	0.088816	0.017241	0.181189	0.000000
EXP_AADRainfall	0.184302	0.100952	0.208882	0.074713	0.246196	0.333333
EXPOSURE	0.429210	0.306667	0.417763	0.270115	0.576763	0.500000
FINAL_SCORE	0.149528	0.091429	0.101974	0.097701	0.246196	0.166667

In [106]:

amp_region_final.loc['FINAL_SCORE']

Out[106]:

H                301.000000
L               1712.000000
U                  0.000000
per_H              0.149528
H_africa          48.000000
L_africa         477.000000
U_africa           0.000000
per_H_africa       0.091429
H_asia            62.000000
L_asia           546.000000
U_asia             0.000000
per_H_asia         0.101974
H_euna            17.000000
L_euna           157.000000
U_euna             0.000000
per_H_euna         0.097701
H_lac            178.000000
L_lac            545.000000
U_lac              0.000000
per_H_lac          0.246196
H_arab             1.000000
L_arab             5.000000
U_arab             0.000000
per_H_arab         0.166667
Name: FINAL_SCORE, dtype: float64

In [107]:

bird_region_final.loc['FINAL_SCORE']

Out[107]:

H               1336.000000
L               5578.000000
U                  0.000000
per_H              0.193231
H_africa         209.000000
L_africa        1410.000000
U_africa           0.000000
per_H_africa       0.129092
H_asia           457.000000
L_asia          2117.000000
U_asia             0.000000
per_H_asia         0.177545
H_euna           212.000000
L_euna           868.000000
U_euna             0.000000
per_H_euna         0.196296
H_lac            703.000000
L_lac           2120.000000
U_lac              0.000000
per_H_lac          0.249026
H_arab            71.000000
L_arab           226.000000
U_arab             0.000000
per_H_arab         0.239057
Name: FINAL_SCORE, dtype: float64

In [108]:

coral_region_final.loc['FINAL_SCORE']

Out[108]:

H               111.000000
L               582.000000
U                31.000000
per_H             0.153315
H_africa         20.000000
L_africa        281.000000
U_africa          4.000000
per_H_africa      0.065574
H_asia           60.000000
L_asia          543.000000
U_asia            5.000000
per_H_asia        0.098684
H_euna           43.000000
L_euna          395.000000
U_euna            4.000000
per_H_euna        0.097285
H_lac            30.000000
L_lac            54.000000
U_lac             1.000000
per_H_lac         0.352941
H_arab           12.000000
L_arab          197.000000
U_arab           22.000000
per_H_arab        0.051948
Name: FINAL_SCORE, dtype: float64

It is very difficult to find patterns in a table full of numbers like the above. Graphs may present a much better view in terms of the underlying differences.

In [109]:

# amp
# a = pd.concat([each[['H', 'L', 'U']][-1:] for each in amp_regions])
## a better way to slice columns and rows, loc is a label based indxing
## .loc returns a series, concatenate on axis 1, transpose and then replace index, so that they'll show up as labels

# get df for all regions, with HLU
a = pd.concat([each[['H', 'L', 'U']].loc['FINAL_SCORE'] for each in amp_regions], 1).T

a.index = regions_name_list
b = a.plot.barh(stacked=True, figsize=(15,5))

# for each bar, annotate num of 'H' and percentage of 'H'
for region_name, region_data, y_axis in zip(regions_name_list, amp_regions, range(5)):
    b.annotate('{0:.0f}'.format(a.loc[region_name].H) + ', ' + '{0:.0f}'.format(a.loc[region_name].sum()) + ', ' + '{0:.2%}'.format(region_data.loc['FINAL_SCORE'].per_H),
              (a.loc[region_name].sum() + 10, y_axis)) # add a buf-space for x

In [110]:

## find amphibians species in arab states WH sites
pd.set_option('display.max_columns', 60)
example = result_amp_f[result_amp_f.wdpaid.isin(arab_ids)]
example[example.FINAL_SCORE=='H']

Out[110]:

	SVP_ID	SIS_GAA_ID	GAA Family	Genus	Species	Fullname	Threatened	SUSC_A_Habitats	SUSC_A_aquatic larvae	SUSC_B_Temperature Range	SUSC_B_Precipitation Range	SUSC_C_explosive breeder	SUSC_D_disease	SENSITIVITY	ADAPT_A_barriers	ADAPT_A_dispersal_distance	ADAPT_C_Slow_Gen_Turnpover	LOW_ADAPTABILITY	EXP_Sea Level	EXP_MeanTemperature	EXP_MeanRainfall	EXP_AADTemperature	EXP_AADRainfall	EXPOSURE	FINAL_SCORE	Final_Pessimistic	Score_HorS	SensXLAdaptability	SensXExposure	LAdaptXExposure	Threatened_and_Vulnerable	Unnamed: 0	wdpaid	id_no	areakm2_x	areakm2_y	per	en_name	fr_name	status_yr	rep_area	gis_area	country	crit	areakm2	binomial	kingdom_name	phylum_name	class_name	order_name	family_name	genus_name	species_name	category	biome_marine	biome_freshwater	biome_terrestrial	id_no_int
1132	3152	6187	ALYTIDAE	Discoglossus	pictus	Discoglossus pictus	NaN	L	L	L	H	L	L	H	H	L	L	H	L	L	H	L	H	H	H	H	H	H	NaN	NaN	NaN	126160	4322	55270.0	124.424076	124.424076	1.0	Ichkeul National Park	Parc national de l'Ichkeul	1980	126.0	124.424076	TUN	(x)	124.424076	Discoglossus pictus	ANIMALIA	CHORDATA	AMPHIBIA	ANURA	ALYTIDAE	Discoglossus	pictus	LC	f	t	t	55270.0

Latin America and the Caribbean region contains the highest number of amphibians that are climate vulnerable (178, out of a total of 723, 24.62%), compared to about 10% for WH sites in other regions. Notably on the other end of the spectrum, WH sites in the Arab States region host very few ampbibans and only one of them is climate vulnerable (Discoglossus pictus in Ichkeul National Park, based on the IUCN Red List data).

In [111]:

# bird
a = pd.concat([each[['H', 'L', 'U']].loc['FINAL_SCORE'] for each in bird_regions], 1).T
a.index = regions_name_list
b = a.plot.barh(stacked=True, figsize=(15,5))
# # testing annotation, note y axis is from bottom up. unit based on data

# for each bar, annotate num of 'H' and percentage of 'H'
for region_name, region_data, y_axis in zip(regions_name_list, bird_regions, range(5)):
    b.annotate('{0:.0f}'.format(a.loc[region_name].H) + ', ' + '{0:.2%}'.format(region_data.loc['FINAL_SCORE'].per_H),
              (a.loc[region_name].sum() + 10, y_axis)) # add a buf-space for x

Similarly from the point of view of climate vulnerable birds in global WH network, Latin America and the Caribbean region has the largest number of birds, and a quarter of them are climate vulnerable, compared to other regions.

In [112]:

# coral
a = pd.concat([each[['H', 'L', 'U']].loc['FINAL_SCORE'] for each in coral_regions], 1).T
a.index = regions_name_list
b = a.plot.barh(stacked=True, figsize=(15,5))

# for each bar, annotate num of 'H' and percentage of 'H'
for region_name, region_data, y_axis in zip(regions_name_list, coral_regions, range(5)):
    b.annotate('{0:.0f}'.format(a.loc[region_name].H) + ', ' + '{0:.2%}'.format(region_data.loc['FINAL_SCORE'].per_H),
              (a.loc[region_name].sum() + 10, y_axis)) # add a buf-space for x

The above regional comparisons across amphibians, birds and corals indicate a large variation in the number and proportion of cimate vulnerable species in the World Heritage network.

More visualisations on the proportions beyond just the final vulnerability score...¶

Need to apply the same methodology across all attributes, i.e. sensitivity, low-adaptability and each of the traits...

In [113]:

# enumerate to get index number, which is needed in the plot
for i, column in enumerate(a_columns):
    print(i, column)

0 SUSC_A_Habitats
1 SUSC_A_aquatic larvae
2 SUSC_B_Temperature Range
3 SUSC_B_Precipitation Range
4 SUSC_C_explosive breeder
5 SUSC_D_disease
6 SENSITIVITY
7 ADAPT_A_barriers
8 ADAPT_A_dispersal_distance
9 ADAPT_C_Slow_Gen_Turnpover
10 LOW_ADAPTABILITY
11 EXP_Sea Level
12 EXP_MeanTemperature
13 EXP_MeanRainfall
14 EXP_AADTemperature
15 EXP_AADRainfall
16 EXPOSURE
17 FINAL_SCORE

In [246]:

# a function to plot all columns using stacked horizontal barchart
def plot_all_attributes(taxon_regions, region_namelist, columns):
    my_colors = ['#b2182b', '#ffffff', '#e0e0e0']
    
    # create a figure with correct number of axes 
    col = len(columns)
    fig, axes = plt.subplots(nrows = col, figsize=(15, 5*col)) # width and heihgt
    fig.subplots_adjust(hspace = 0.3)
    
    # for each exes, plot a column result for all regions, for comparisons
    for i, column in enumerate(columns):
        # a df of all regions HLUs
        a = pd.concat([each[['H', 'L', 'U']].loc[column] for each in taxon_regions], 1).T
        a.index = region_namelist
        
        # so it also works for col=1. If col=1, axes is not an array and thus does not support indexing
        if col>1:
            b = a.plot.barh(ax=axes[i], stacked=True, color=my_colors)
            axes[i].set_title(column)
        else:
            b = a.plot.barh(ax=axes, stacked=True, color=my_colors)
            axes.set_title(column)            
    
        # add num of H, total num and percentage as annotation
        for region_name, region_data, y_axis in zip(region_namelist, taxon_regions, range(len(region_namelist))):
            b.annotate('{0:.0f}'.format(a.loc[region_name].H) + ', ' + '{0:.0f}'.format(a.loc[region_name].H + a.loc[region_name].L + a.loc[region_name].U) + \
                       ', ' + '{0:.2%}'.format(region_data.loc[column].per_H),
                      (a.loc[region_name].sum() + 10, y_axis))
    
    return fig

In [263]:

# need to add `;` to suppress the automatic behaviour that plots the same graph twice
a = plot_all_attributes(amp_regions, regions_name_list, ['FINAL_SCORE', 'SENSITIVITY', 'LOW_ADAPTABILITY', 'EXPOSURE']);
a.savefig('region_amp.png', dpi=100, bbox_inches='tight')

Detailed regional comparisons across traits and exposures for amphibians

In [248]:

# need to add `;` to suppress the automatic behaviour that plots the same graph twice
plot_all_attributes(amp_regions, regions_name_list, a_columns);

In [249]:

# test for single column and it also works now
plot_all_attributes(amp_regions, regions_name_list, ['FINAL_SCORE']);

In [269]:

a = plot_all_attributes(bird_regions, regions_name_list, ['FINAL_SCORE', 'SENSITIVITY', 'LOW_ADAPTABILITY', 'EXPOSURE']);
a.savefig('region_bird.png', dpi=100, bbox_inches='tight')

Detailed regional comparisons across traits and exposures for birds

In [251]:

plot_all_attributes(bird_regions, regions_name_list, b_columns);

In [252]:

plot_all_attributes(coral_regions, regions_name_list, c_columns);

Sensitivity seems rather weird - all species across all regions have sensitivity high?

In [119]:

(coral_unique.SENSITIVITY == 'H').sum(), coral_unique.index.size

Out[119]:

(796, 797)

ranking to determine whether a site should be High or Low in climate vulnerability¶

The top 25% of WH sites both in terms of percentage and total number of highly vulnerable species for a given taxon are labelled as 'highly vulnerable' for that taxon.

In [120]:

# theshold value according to the number of H species
thres_total = [np.percentile(each[each.level_2 == 'FINAL_SCORE'].H, 75) for each in [amp_vv, bird_vv, coral_vv]]

In [121]:

# treshold value according to the percentage of H in relation to total HLU
thres_per = [np.percentile(each[each.level_2 == 'FINAL_SCORE'].per_H, 75) for each in [amp_vv, bird_vv, coral_vv]]

In [122]:

thres_per, thres_total

Out[122]:

([0.14285714285714285, 0.28225806451612906, 0.14732142857142858],
 [1.0, 47.0, 27.0])

In [123]:

# the number of sites as 'highly vulnerable' for amp
((amp_vv[amp_vv.level_2=='FINAL_SCORE'].H > thres_total[0]) & (amp_vv[amp_vv.level_2=='FINAL_SCORE'].per_H > thres_per[0])).sum()

Out[123]:

In [124]:

(amp_vv[amp_vv.level_2=='FINAL_SCORE'].H > thres_total[0]).sum(), (amp_vv[amp_vv.level_2=='FINAL_SCORE'].per_H > thres_per[0]).sum()

Out[124]:

(42, 47)

In [125]:

# amp, bird, coral number of cv sites
[((each[each.level_2=='FINAL_SCORE'].H > thres_total[i]) & (each[each.level_2=='FINAL_SCORE'].per_H > thres_per[i])).sum() for i, each in enumerate([amp_vv, bird_vv, coral_vv])]

Out[125]:

[30, 28, 1]

In [ ]: