World Bank Indicators Data

This notebook provides a brief demonstration of how to access the World Bank Indicators data using pandas.

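A wrapper for the API was provided in the main pandas distribution as part of its Remote Data Access support. In later pandas releases this code moved to the separate pandas-datareader package, where the equivalent import is from pandas_datareader import wb.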

In [2]:
#First of all we need to load in the pandas library...
import pandas as pd

#...and the pandas remote data access support for calls to the World Bank Indicators API
from pandas.io import wb
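
Because the module's location depends on the installed versions, the import above may fail on some systems. The following is a minimal sketch, not part of the original notebook, that tries the pandas_datareader location first and falls back to pandas.io. The helper name load_wb is made up for illustration.

```python
import importlib


def load_wb():
    """Return the first importable World Bank wrapper module, or None."""
    # Candidate locations, newest first. Which one exists depends on the
    # installed pandas / pandas-datareader versions.
    for module_name in ("pandas_datareader.wb", "pandas.io.wb"):
        try:
            return importlib.import_module(module_name)
        except ImportError:
            continue
    return None


wb = load_wb()
if wb is None:
    print("No World Bank wrapper found; installing pandas-datareader may help.")
```

If wb is not None, the calls in the rest of this notebook (wb.search, wb.download and so on) should be usable unchanged.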

Searching for Indicators

The easiest way to identify an indicator is to search for it by name using a keyword or key phrase.

In [46]:
wb.search('fertility rate')
Out[46]:
id name source sourceNote sourceOrganization topics
6554 SP.ADO.TFRT Adolescent fertility rate (births per 1,000 wo... World Development Indicators Adolescent fertility rate is the number of bir... b'United Nations Population Division, World Po... Social Development ; Health ; Gender
6594 SP.DYN.TFRT.IN Fertility rate, total (births per woman) World Development Indicators Total fertility rate represents the number of ... b'(1) United Nations Population Division. Worl... Gender ; Health
6595 SP.DYN.TFRT.Q1 Total fertility rate (TFR) (births per woman)... Health Nutrition and Population Statistics by ... Total fertility rate (TFR): The number of chil... b'Household Surveys (DHS, MICS)' Health
6596 SP.DYN.TFRT.Q2 Total fertility rate (TFR) (births per woman)... Health Nutrition and Population Statistics by ... Total fertility rate (TFR): The number of chil... b'Household Surveys (DHS, MICS)' Health
6597 SP.DYN.TFRT.Q3 Total fertility rate (TFR) (births per woman)... Health Nutrition and Population Statistics by ... Total fertility rate (TFR): The number of chil... b'Household Surveys (DHS, MICS)' Health
6598 SP.DYN.TFRT.Q4 Total fertility rate (TFR) (births per woman)... Health Nutrition and Population Statistics by ... Total fertility rate (TFR): The number of chil... b'Household Surveys (DHS, MICS)' Health
6599 SP.DYN.TFRT.Q5 Total fertility rate (TFR) (births per woman)... Health Nutrition and Population Statistics by ... Total fertility rate (TFR): The number of chil... b'Household Surveys (DHS, MICS)' Health
6602 SP.DYN.WFRT Wanted fertility rate (births per woman) World Development Indicators Wanted fertility rate is an estimate of what t... b'Demographic and Health Surveys by ICF Intern... Health ; Gender
6603 SP.DYN.WFRT.Q1 Total wanted fertility rate (births per woman)... Health Nutrition and Population Statistics by ... Total wanted fertility rate: Total wanted fert... b'Household Surveys (DHS, MICS)' Health
6604 SP.DYN.WFRT.Q2 Total wanted fertility rate (births per woman)... Health Nutrition and Population Statistics by ... Total wanted fertility rate: Total wanted fert... b'Household Surveys (DHS, MICS)' Health
6605 SP.DYN.WFRT.Q3 Total wanted fertility rate (births per woman)... Health Nutrition and Population Statistics by ... Total wanted fertility rate: Total wanted fert... b'Household Surveys (DHS, MICS)' Health
6606 SP.DYN.WFRT.Q4 Total wanted fertility rate (births per woman)... Health Nutrition and Population Statistics by ... Total wanted fertility rate: Total wanted fert... b'Household Surveys (DHS, MICS)' Health
6607 SP.DYN.WFRT.Q5 Total wanted fertility rate (births per woman)... Health Nutrition and Population Statistics by ... Total wanted fertility rate: Total wanted fert... b'Household Surveys (DHS, MICS)' Health
In [63]:
#We can also get a full list of indicators
indicators=wb.get_indicators()

#Preview first few rows of indicators list
indicators[:5]
Out[63]:
id name source sourceNote sourceOrganization topics
0 1.0.HCount.1.25usd Poverty Headcount ($1.25 a day) LAC Equity Lab The poverty headcount index measures the propo... b'LAC Equity Lab tabulations of SEDLAC (CEDLAS... Poverty
1 1.0.HCount.10usd Under Middle Class ($10 a day) Headcount LAC Equity Lab The poverty headcount index measures the propo... b'LAC Equity Lab tabulations of SEDLAC (CEDLAS... Poverty
2 1.0.HCount.2.5usd Poverty Headcount ($2.50 a day) LAC Equity Lab The poverty headcount index measures the propo... b'LAC Equity Lab tabulations of SEDLAC (CEDLAS... Poverty
3 1.0.HCount.Mid10to50 Middle Class ($10-50 a day) Headcount LAC Equity Lab The poverty headcount index measures the propo... b'LAC Equity Lab tabulations of SEDLAC (CEDLAS... Poverty
4 1.0.HCount.Ofcl Official Moderate Poverty Rate-National LAC Equity Lab The poverty headcount index measures the propo... b'LAC Equity Lab tabulations of data from Nati... Poverty

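If you know only part of an indicator's name, you can narrow the search with a pattern. The search string is treated as a regular expression and matched, ignoring case, against the indicator name by default, so use .* (rather than a bare *) to match any run of characters.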

In [3]:
wb.search('gdp.*capita.*const')
Out[3]:
id name source sourceNote sourceOrganization topics
700 6.0.GDPpc_constant GDP per capita, PPP (constant 2011 internation... LAC Equity Lab GDP per capita based on purchasing power parit... b'NULWorld Development Indicators (World Bank)L' Economy & Growth
3496 GDPPCKD GDP per Capita, constant US$, millions GEP Economic Prospects GDP per capita is gross domestic product divid... b'World Bank staff calculations based on World... Economy & Growth
5530 NY.GDP.PCAP.KD GDP per capita (constant 2005 US$) World Development Indicators GDP per capita is gross domestic product divid... b'World Bank national accounts data, and OECD ... Economy & Growth
5532 NY.GDP.PCAP.KN GDP per capita (constant LCU) World Development Indicators GDP per capita is gross domestic product divid... b'World Bank national accounts data, and OECD ... Economy & Growth
5534 NY.GDP.PCAP.PP.KD GDP per capita, PPP (constant 2011 internation... World Development Indicators GDP per capita based on purchasing power parit... b'World Bank, International Comparison Program... Economy & Growth
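
The same kind of pattern matching can be tried offline with pandas string methods. The sketch below is an illustration rather than the wrapper's actual implementation: it builds a small hand-typed table from rows shown in the outputs above and filters it with a regular expression, first on the name column and then on the id column. The variable names sample, by_name and by_id are made up here.

```python
import pandas as pd

# A few (id, name) pairs copied from the search results shown above.
sample = pd.DataFrame({
    "id": ["GDPPCKD", "NY.GDP.PCAP.KD", "NY.GDP.PCAP.KN",
           "SP.DYN.TFRT.IN", "SP.DYN.WFRT"],
    "name": ["GDP per Capita, constant US$, millions",
             "GDP per capita (constant 2005 US$)",
             "GDP per capita (constant LCU)",
             "Fertility rate, total (births per woman)",
             "Wanted fertility rate (births per woman)"],
})

# Case-insensitive regular expression match on the indicator name.
by_name = sample[sample["name"].str.contains("gdp.*capita.*const",
                                             case=False, regex=True)]

# Match on the identifier instead; the dots are escaped so that they
# match literal dots rather than any character.
by_id = sample[sample["id"].str.contains(r"^NY\.GDP\.PCAP", regex=True)]

print(by_name["id"].tolist())
print(by_id["id"].tolist())
```

Filtering on id in this way is useful when the dotted prefix of an indicator family is known but the exact name is not.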

Identifying Countries

When retrieving a dataset, we can specify which country, countries or regions we want the data for. Locations are identified by their two-letter ISO 3166-1 alpha-2 code (the iso2c column below); aggregates such as Africa use World Bank-specific two-character codes such as A9. To look up a code, we can download the full country list.

In [65]:
#We can get a list of the countries and regions that indicator data may be available for
countries=wb.get_countries()

#Preview first few rows of countries list
countries[:5]
Out[65]:
adminregion capitalCity iso3c incomeLevel iso2c latitude lendingType longitude name region
0 Oranjestad ABW High income: nonOECD AW 12.5167 Not classified -70.0167 Aruba Latin America & Caribbean (all income levels)
1 South Asia Kabul AFG Low income AF 34.5228 IDA 69.1761 Afghanistan South Asia
2 AFR Aggregates A9 Aggregates Africa Aggregates
3 Sub-Saharan Africa (developing only) Luanda AGO Upper middle income AO -8.81155 IBRD 13.242 Angola Sub-Saharan Africa (all income levels)
4 Europe & Central Asia (developing only) Tirane ALB Upper middle income AL 41.3317 IBRD 19.8172 Albania Europe & Central Asia (all income levels)
In [69]:
#pandas dataframes allow us to search within the country list for a particular country
countries[ countries['name'] == 'Angola' ]
Out[69]:
adminregion capitalCity iso3c incomeLevel iso2c latitude lendingType longitude name region
3 Sub-Saharan Africa (developing only) Luanda AGO Upper middle income AO -8.81155 IBRD 13.242 Angola Sub-Saharan Africa (all income levels)
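
When several codes are needed, it can be convenient to map names to codes in one step. This is a minimal sketch, assuming a countries table with the name and iso2c columns shown above. The sample rows are hand-copied from that output and the helper name iso2_codes is hypothetical.

```python
import pandas as pd

# Rows copied from the country list preview above (name and iso2c only).
sample_countries = pd.DataFrame({
    "name": ["Aruba", "Afghanistan", "Angola", "Albania"],
    "iso2c": ["AW", "AF", "AO", "AL"],
})


def iso2_codes(countries, names):
    """Return the ISO-2 codes for the given country names, in the order given."""
    lookup = countries.set_index("name")["iso2c"]
    # In recent pandas versions .loc raises KeyError for a missing name,
    # which surfaces typos early.
    return lookup.loc[list(names)].tolist()


codes = iso2_codes(sample_countries, ["Angola", "Albania"])
print(codes)
```

A list built this way has the same form as the country lists passed to wb.download later in this notebook.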

Download Data for a Particular Indicator

Once you have identified one or more indicators for which you would like to download a dataset, you need to identify the year or range of years, and the country, countries or regions (identified via their ISO-2 code) for which you would like the data.

In [5]:
#Download data from the World Bank API into a dataframe

df = wb.download(
    #Use the indicator argument to identify which indicator or indicators to download
    indicator='NY.GDP.PCAP.KD',
    #Use the country argument to identify the countries you want data for
    country=['US', 'CA', 'MX'],
    #Identify the first year for which you want the data, as an integer or a string
    start='2008',
    #Identify the last year for which you want the data, as an integer or a string
    end=2010
)

#Show the dataframe 
df
Out[5]:
                    NY.GDP.PCAP.KD
country       year
Canada        2010    36466.815112
              2009    35671.659294
              2008    37088.020368
Mexico        2010     8084.629000
              2009     7788.271761
              2008     8275.809458
United States 2010    43952.436548
              2009    43234.451155
              2008    44872.653626
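
The result is indexed by both country and year, which is not always the most convenient shape for comparison or plotting. As a sketch, the values from the output above can be rebuilt by hand and pivoted with unstack so that each country becomes a column. The year labels are assumed here to be strings, and the variable names are illustrative.

```python
import pandas as pd

# Rebuild the (country, year) index in the same order as the output above.
index = pd.MultiIndex.from_product(
    [["Canada", "Mexico", "United States"], ["2010", "2009", "2008"]],
    names=["country", "year"],
)
gdp = pd.DataFrame(
    {"NY.GDP.PCAP.KD": [36466.815112, 35671.659294, 37088.020368,
                        8084.629000, 7788.271761, 8275.809458,
                        43952.436548, 43234.451155, 44872.653626]},
    index=index,
)

# Move the country level into the columns: one row per year,
# one column per country, with years in ascending order.
wide = gdp["NY.GDP.PCAP.KD"].unstack("country").sort_index()
print(wide)
```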
In [60]:
#To download data for multiple indicators, specify them as a list
wb.download( indicator=['SP.DYN.TFRT.IN','NY.GDP.PCAP.KD'], country=['US','GB'], start=2008, end=2010 )
Out[60]:
                     SP.DYN.TFRT.IN  NY.GDP.PCAP.KD
country        year
United Kingdom 2010           1.920    37600.293399
               2009           1.890    37277.481537
               2008           1.910    39608.431481
United States  2010           1.931    43952.436548
               2009           2.002    43234.451155
               2008           2.072    44872.653626
In [19]:
#We can download data for a single year by setting the start and end dates to the same year
#To download data for a single country, you do not need to specify it as a list
df = wb.download( indicator='NY.GDP.PCAP.KD', country='US', start=2008, end=2008 )

#Show the dataframe 
df
Out[19]:
                    NY.GDP.PCAP.KD
country       year
United States 2008    44872.653626
In [53]:
#To download the data for all countries, set the country argument to 'all'
df = wb.download( indicator='SP.DYN.TFRT.IN', country='all', start=2010, end=2010 )

#Show a preview of the first few rows of the dataframe
df[:10]
Out[53]:
                                                     SP.DYN.TFRT.IN
country                                        year
Andean Region                                  2010             NaN
Arab World                                     2010        3.297409
Caribbean small states                         2010        2.224484
Central Europe and the Baltics                 2010        1.445135
East Asia & Pacific (all income levels)        2010        1.817921
East Asia & Pacific (developing only)          2010        1.856321
East Asia and the Pacific (IFC classification) 2010             NaN
Euro area                                      2010        1.573820
Europe & Central Asia (all income levels)      2010        1.723861
Europe & Central Asia (developing only)        2010        1.979588

Notice that selecting all countries returns values for regional groupings as well as for individual countries, and that some of those groupings have no value (NaN) for this indicator.
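
If only individual countries are wanted, one option is to use the country list to drop the groupings. The sketch below is a hedged illustration built on hand-typed data: the fertility values are copied from outputs earlier in this notebook, but the region labels for these four rows are assumed for illustration, on the basis that the Africa row in the country list preview is labelled Aggregates. In practice the labels would come from wb.get_countries().

```python
import pandas as pd

# Assumed region labels for illustration; real ones come from wb.get_countries().
countries = pd.DataFrame({
    "name": ["Arab World", "Euro area", "United Kingdom", "United States"],
    "region": ["Aggregates", "Aggregates",
               "Europe & Central Asia (all income levels)", "North America"],
})

# Fertility values for 2010 copied from the outputs shown earlier.
index = pd.MultiIndex.from_tuples(
    [("Arab World", "2010"), ("Euro area", "2010"),
     ("United Kingdom", "2010"), ("United States", "2010")],
    names=["country", "year"],
)
df = pd.DataFrame({"SP.DYN.TFRT.IN": [3.297409, 1.573820, 1.920, 1.931]},
                  index=index)

# Names of all rows that the country list marks as aggregates.
aggregate_names = countries.loc[countries["region"] == "Aggregates", "name"]

# Keep only rows whose country label is not an aggregate.
is_aggregate = df.index.get_level_values("country").isin(aggregate_names)
countries_only = df[~is_aggregate]
print(countries_only)
```

Matching on names relies on the download result and the country list using identical labels, which is worth checking before depending on this filter.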

Summary

pandas support for remote data access makes it easy for us to get data from the World Bank Indicators API into a pandas dataframe, where we can start to work with it.